5... 4... 3... 2... Ignition... We Have: AWS EC2 Persistent Storage Amazon Elastic Block Store (EBS) Liftoff!

By M. David Peterson
August 20, 2008

wernermajorvogels: @jeffbarr is charge of the countdown :-)

jeffmajorbarr: Getting ready for liftoff - topping off the tanks.

jeffmajorbarr: "Hey dudes, get the heck off of my launch pad or you'll be roasted."

jeffmajorbarr: Making ritual stop near rear wheel of bus. Ahhh, much better.

jeffmajorbarr: Climbing into the capsule.

jeffmajorbarr: @mndoci - climb aboard...

jeffmajorbarr: Seat belts buckled. Capsule door sealed. All systems nominal.

jeffmajorbarr: Countdown is well underway...

jeffmajorbarr: Looking good, looking good, but can't tell you what I see yet.

jeffmajorbarr: Mere minutes to launch...

jeffmajorbarr: Amazon Elastic Block Store: Bring Us Your Bits

A few months ago I talked about our plans to offer a persistent storage feature for Amazon EC2. At that time I indicated that the service was in a limited alpha release with a small number of customers. Since then the alpha testers have been putting the service to good use and have provided us with a lot of very helpful feedback.

As of today, the Amazon Elastic Block Store (EBS) is now open and available to all EC2 users.

jeffmajorbarr: Whew!

I'd like to second Jeff's "Whew!" and add a "WOOOOHOOOOOO!!!!" of my own. :D That was fun! More info at Jeff's post quoted above, on the AWS landing page, on Werner Vogels' blog, and inline at the bottom of this post.

So obviously this is pretty big news. So big, in fact, that you might find yourself asking "Is it true that Amazon EBS cures cancer?", which is just one of several myths I attempted to debunk a few weeks back** in preparation for today's release of Amazon EBS. ;-)

NOTE: For those of you interested in extending EBS to include automatic failover to a new EC2 instance if and when an EC2 instance with a mounted EBS volume goes down, I'm working on updating my "Preparing for EC2 Persistent Storage" white paper to use EBS instead of DRBD-backed ephemeral devices. Once it's updated, I'll make a new post and update this post with a link.

In the meantime, go enjoy your bright and shiny new toy! I assure you... You're going to love it!

Oh, and forum-based support for EBS will take place in the pre-existing EC2 forum, which obviously makes sense. The updated EC2 command-line tools provide access to the new EC2+EBS API, and ElasticFox has been updated with all sorts of wonderful new features that bring the benefits of EBS to within a simple click of your mouse. Personally I prefer the command line, but to each his/her own ;-) and in this particular case, the updated ElasticFox is going to blow you away with how good it's become.

So y'all ready for all the Amazon EBS chewy gooey goodness you could possibly handle? FTW! :-)

Amazon Elastic Block Store (EBS)

Amazon Elastic Block Store (EBS) provides block level storage volumes for use with Amazon EC2 instances. Amazon EBS volumes are off-instance storage that persists independently from the life of an instance. Amazon Elastic Block Store provides highly available, highly reliable storage volumes that can be attached to a running Amazon EC2 instance and exposed as a device within the instance. Amazon EBS is particularly suited for applications that require a database, file system, or access to raw block level storage.


Features of Amazon EBS volumes:

• Amazon EBS allows you to create storage volumes from 1 GB to 1 TB that can be mounted as devices by Amazon EC2 instances. Multiple volumes can be mounted to the same instance.
• Storage volumes behave like raw, unformatted block devices, with user supplied device names and a block device interface. You can create a file system on top of Amazon EBS volumes, or use them in any other way you would use a block device (like a hard drive).
• Amazon EBS volumes are placed in a specific Availability Zone, and can then be attached to instances also in that same Availability Zone.
• Each storage volume is automatically replicated within the same Availability Zone. This prevents data loss due to failure of any single hardware component.
• Amazon EBS also provides the ability to create point-in-time snapshots of volumes, which are persisted to Amazon S3. These snapshots can be used as the starting point for new Amazon EBS volumes, and protect data for long-term durability. The same snapshot can be used to instantiate as many volumes as you wish.


Using Amazon EBS Volumes

Amazon EBS volumes are created in a particular Availability Zone and can be from 1 GB to 1 TB in size. Once a volume is created, it can be attached to any Amazon EC2 instance in the same Availability Zone. Once attached, it will appear as a mounted device similar to any hard drive or other block device. At that point, the instance can interact with the volume just as it would with a local drive, formatting it with a file system or installing applications on it directly.

A volume can only be attached to one instance at a time, but many volumes can be attached to a single instance. This means that you can attach multiple volumes and stripe your data across them for increased I/O and throughput performance. This is particularly helpful for database style applications that frequently encounter many random reads and writes across the dataset. If an instance fails or is detached from an Amazon EBS volume, the volume can be attached to any other instance in that Availability Zone.
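To make the striping idea above concrete, here is a minimal sketch of how a RAID-0 style layout spreads logical blocks across several attached volumes. It is an illustration only: the function name `locate_block` and its parameters are hypothetical, and in practice the striping would be done by a software RAID layer inside the instance rather than by code like this.

```python
def locate_block(logical_block, num_volumes, stripe_blocks=1):
    """Map a logical block number to (volume_index, block_on_volume).

    Hypothetical helper illustrating RAID-0 style striping: consecutive
    stripes of `stripe_blocks` blocks are dealt out to the volumes in turn.
    """
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    if num_volumes < 1 or stripe_blocks < 1:
        raise ValueError("num_volumes and stripe_blocks must be at least 1")

    # Which stripe the block falls in, and where it sits inside that stripe.
    stripe_index, offset_in_stripe = divmod(logical_block, stripe_blocks)
    # Stripes are assigned to volumes round-robin; `row` counts full rounds.
    row, volume_index = divmod(stripe_index, num_volumes)
    block_on_volume = row * stripe_blocks + offset_in_stripe
    return volume_index, block_on_volume


if __name__ == "__main__":
    # Show where the first eight logical blocks land on four volumes.
    for block in range(8):
        print(block, locate_block(block, num_volumes=4))
```

Because neighbouring blocks land on different volumes, a burst of random reads and writes is spread over all of them instead of queuing on one.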


Amazon EBS Snapshots

Amazon EBS provides the ability to back up point-in-time snapshots of your data to Amazon S3 for durable recovery. Amazon EBS snapshots are differential backups, meaning that only the blocks on the device that have changed since your last snapshot will be incrementally saved. This means that if you have a device with 100 GB of data, but only 5 GB of data has changed since your last snapshot, only the 5 additional GB of snapshot data will be stored back to Amazon S3.

Snapshots can also be used to instantiate multiple new volumes, expand the size of a volume or move volumes across Availability Zones. When a new volume is created, there is the option to create it based on an existing Amazon S3 snapshot. In that scenario, the new volume begins as an exact replica of the original volume. By optionally specifying a different volume size or a different Availability Zone, this functionality can be used as a way to increase the size of an existing volume or to create duplicate volumes in new Availability Zones. If you choose to use snapshots to resize your volume, you need to be sure your file system or application supports resizing a device.

New volumes created from existing Amazon S3 snapshots load lazily in the background. This means that once a volume is created from a snapshot, there is no need to wait for all of the data to transfer from Amazon S3 to your Amazon EBS volume before your attached instance can start accessing the volume and all of its data. If your instance accesses a piece of data which hasn't yet been loaded, the volume will immediately download the requested data from Amazon S3, and then will continue loading the rest of the volume's data in the background.
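The lazy-loading behavior described above can be pictured with a toy model. This is only a sketch of the general technique (fetch on first access, fill in the rest in the background), not a description of how Amazon EBS is actually implemented. The `LazyVolume` class and all of its members are hypothetical, and a plain dictionary stands in for the snapshot held in Amazon S3.

```python
class LazyVolume:
    """Toy model of a volume restored lazily from a snapshot.

    Hypothetical illustration only; it does not reflect EBS internals.
    """

    def __init__(self, snapshot_blocks):
        self._snapshot = snapshot_blocks          # stands in for data in S3
        self._local = {}                          # blocks already on the volume
        self._pending = sorted(snapshot_blocks)   # background load order

    def _fetch(self, block_id):
        # Copy one block from the snapshot to the volume.
        self._local[block_id] = self._snapshot[block_id]

    def read(self, block_id):
        """Return a block, fetching it on demand if it is not loaded yet."""
        if block_id not in self._local:
            self._fetch(block_id)
        return self._local[block_id]

    def background_step(self):
        """Load the next missing block; return False once everything is loaded."""
        while self._pending:
            block_id = self._pending.pop(0)
            if block_id not in self._local:
                self._fetch(block_id)
                return True
        return False

    @property
    def loaded(self):
        """Number of blocks currently present on the volume."""
        return len(self._local)


if __name__ == "__main__":
    volume = LazyVolume({0: b"boot", 1: b"data", 2: b"logs"})
    print(volume.read(2), volume.loaded)   # block 2 is fetched on demand
    while volume.background_step():        # remaining blocks load afterwards
        pass
    print(volume.loaded)
```

The point of the model is that `read` never has to wait for the whole snapshot: a missing block is pulled in immediately, and the background loop skips whatever has already been fetched.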


Amazon EBS Volume Performance

The latency and throughput of Amazon EBS volumes are designed to be significantly better than the Amazon EC2 instance stores in nearly all cases. You can also attach multiple volumes to an instance and stripe across the volumes. This is one way to improve I/O rates, especially if your application performs a lot of random access across your data set.

The exact performance will depend on the application (e.g. random vs. sequential I/O or large vs. small request sizes), so the best measure is to benchmark your real applications against the volume. Because Amazon EBS volumes require network access, you will see faster and more consistent throughput performance with larger instances.


Amazon EBS Volume Durability

Amazon EBS volumes are designed to be highly available and reliable. Amazon EBS volume data is replicated across multiple servers in an Availability Zone to prevent the loss of data from the failure of any single component. The durability of your volume depends both on the size of your volume and the percentage of the data that has changed since your last snapshot. As an example, volumes that operate with 20 GB or less of modified data since their most recent Amazon EBS snapshot can expect an annual failure rate (AFR) of between 0.1% and 0.5%, where failure refers to a complete loss of the volume. This compares with commodity hard disks that will typically fail with an AFR of around 4%, making EBS volumes roughly 10 times more reliable than typical commodity disk drives.

Because Amazon EBS servers are replicated within a single Availability Zone, mirroring data across multiple Amazon EBS volumes in the same Availability Zone will not significantly improve volume durability. However, for those interested in even more durability, Amazon EBS provides the ability to create point-in-time consistent snapshots of your volumes that are then stored in Amazon S3, and automatically replicated across multiple Availability Zones. So, taking frequent snapshots of your volume is a convenient and cost effective way to increase the long term durability of your data. In the unlikely event that your Amazon EBS volume does fail, all snapshots of that volume will remain intact, and will allow you to recreate your volume from the last snapshot point.


Projecting Costs

With Amazon Elastic Block Store, you only pay for what you use. Volume storage is charged by the amount you allocate until you release it, and is priced at a rate of $0.10 per allocated GB per month.

Amazon EBS also charges $0.10 per 1 million I/O requests you make to your volume. Programs like IOSTAT can be used to measure the exact I/O usage of your system at any time. However, applications and operating systems often do different levels of caching, so you will likely see a lower number of I/O requests on your bill than is seen by your application unless you sync all of your I/Os to disk.

As an example, a medium sized website database might be 100 GB in size and expect to average 100 I/Os per second over the course of a month. This would translate to $10 per month in storage costs (100 GB x $0.10/month), and approximately $26 per month in request costs (~2.6 million seconds/month x 100 I/O per second x $0.10 per million I/O).
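The arithmetic in that example can be reproduced with a few lines of Python. This is a rough sketch using only the launch prices quoted above; the function name `monthly_ebs_cost` is made up for illustration, and the 30-day month is an assumption chosen to match the "~2.6 million seconds" figure.

```python
# Prices quoted in the text above (launch pricing).
STORAGE_PRICE_PER_GB_MONTH = 0.10   # USD per allocated GB per month
PRICE_PER_MILLION_IO = 0.10         # USD per 1 million I/O requests

# Assumption: a 30-day month, i.e. 2,592,000 seconds (~2.6 million).
SECONDS_PER_MONTH = 30 * 24 * 60 * 60


def monthly_ebs_cost(allocated_gb, avg_io_per_second):
    """Return (storage_cost, request_cost) in USD for one month."""
    storage_cost = allocated_gb * STORAGE_PRICE_PER_GB_MONTH
    total_io = avg_io_per_second * SECONDS_PER_MONTH
    request_cost = total_io / 1_000_000 * PRICE_PER_MILLION_IO
    return storage_cost, request_cost


if __name__ == "__main__":
    storage_usd, request_usd = monthly_ebs_cost(allocated_gb=100,
                                                avg_io_per_second=100)
    print(f"storage: ${storage_usd:.2f}, requests: ${request_usd:.2f}")
```

For the 100 GB, 100 I/Os per second database this gives $10.00 for storage and $25.92 for requests, which is the "approximately $26" in the example. Snapshot storage and the associated S3 requests are not included.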

Snapshot storage is based on the amount of space your data consumes in Amazon S3. Because data is compressed before being saved to Amazon S3, and Amazon EBS does not save empty blocks, it is likely that the size of a snapshot will be considerably less than the size of your volume. For the first snapshot of a volume, Amazon EBS will save a full copy of your data to Amazon S3. However for each incremental snapshot, only the part of your Amazon EBS volume that has been changed will be saved to Amazon S3.

Volume data is broken up into chunks before being transferred to Amazon S3. While the size of the chunks could change through future optimizations, the number of PUTs required to save a particular snapshot to Amazon S3 can be estimated by dividing the size of the data that has changed since the last snapshot by 4 MB. Conversely, when loading a snapshot from Amazon S3 into an Amazon EBS volume, the number of GET requests needed to fully load the volume can be estimated by dividing the full size of the snapshot by 4 MB. You will also be charged for GETs and PUTs at normal Amazon S3 rates.
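As a quick sketch of that estimate, the snippet below divides a data size by the 4 MB chunk size and rounds up. The helper names are hypothetical, the chunk size is the one quoted above (which the text says may change), and treating 1 GB as 1024 MB is an assumption. S3 request prices are deliberately left out; multiply the counts by whatever the current S3 rates are.

```python
import math

CHUNK_MB = 4        # chunk size quoted in the text; may change over time
MB_PER_GB = 1024    # assumption: binary gigabytes


def estimate_snapshot_puts(changed_gb):
    """Estimate PUT requests needed to save the data changed since the last snapshot."""
    return math.ceil(changed_gb * MB_PER_GB / CHUNK_MB)


def estimate_restore_gets(snapshot_gb):
    """Estimate GET requests needed to fully load a volume from a snapshot."""
    return math.ceil(snapshot_gb * MB_PER_GB / CHUNK_MB)


if __name__ == "__main__":
    # 5 GB changed since the last snapshot; restoring a 100 GB snapshot.
    print(estimate_snapshot_puts(5))
    print(estimate_restore_gets(100))
```

With these assumptions, 5 GB of changed data comes to 1,280 PUTs, and fully loading a 100 GB snapshot comes to 25,600 GETs.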

--
** Updated with proper TLA and moved to Google Docs

