Public Cloud Test Drive: Amazon S3, Windows Azure & Rackspace Cloud Files

Your quick guide to using these three solutions to protect and recover your enterprise data in the public cloud.

Public cloud services are increasingly feasible targets for backup and recovery of enterprise data. The biggest reason to consider a cloud services provider is the distribution of resources.

Large cloud providers have geographically distributed and redundant datacenters spread throughout the world. The global availability and failover provided to customers by these providers, combined with the affordability of such services, are creating a shift in which moving backup to the cloud is becoming an economically feasible alternative to backing up to tape.

Cloud backup solves the off-site storage issue many times over. Not only does it move the data off-site, thereby providing disaster recovery, but in almost every case cloud backup and recovery services also fail over between datacenters around the world (depending upon the provider's geographic reach).

The most widely used public cloud services provider is Amazon Web Services Inc., while Microsoft and Rackspace US Inc. offer cloud compute and storage services as well. There are many other major providers, including AT&T, Hewlett-Packard Co., IBM Corp., Verizon and virtually thousands of others; some are regional, while others are global.

Given the wide usage by enterprises of cloud services from Amazon, Microsoft and Rackspace, in particular, I decided to test drive and evaluate how suitable they are for backup and recovery.

Microsoft Windows Azure
Many of the Microsoft offerings you now run in your on-premises datacenters are becoming available as part of the Windows Azure portfolio of services, which includes SQL Server, SharePoint and Active Directory (in the pipeline). In addition to these offerings, Microsoft also lets customers store data on the Windows Azure service. In many ways it's like SkyDrive for enterprises. While SharePoint 2013 and the business editions of Office 2013 offer SkyDrive Pro, it's different from the storage services offered via Windows Azure.

I reference SkyDrive to point out that Windows Azure is many things above and beyond storage, and these things are growing all the time.

To get started, Microsoft allows a 90-day free trial with a credit card. The sign-up process takes a couple of minutes, but it's quite simple. When you first set up the trial, your Microsoft account is checked to make sure it's eligible for the trial. Then you see the screen explaining what you get as part of the Windows Azure service (see Figure 1).


Figure 1. The Microsoft Windows Azure features that are included in the free 90-day trial.

The storage portion of the Windows Azure trial allows for 35GB of data storage during the 90-day period. Because most backup data is large even when compressed, I worked with a small subset of data for this evaluation and found the service fairly easy to work with. The provisioning of features within the portal is wizard-driven and will prompt you for needed information, such as the name or URL to use for your storage account. Once you provide a unique name for the base of the URL, the Windows Azure service gets to work and creates the space.

Though the data-entry process for me was fairly painless, I expected it to take a while to provision the storage. It works just like an on-premises storage configuration in that regard: start the provisioning process, get lunch, work on something else for a while, and come back to find the storage ready to go. In this case, the trial provisioning took about four minutes, but naturally the more you store, the longer it will take.

When using Windows Azure storage, access to the service is authenticated using a pair of 512-bit keys to ensure that each request reaches the proper storage account.
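
To illustrate how a shared key of this size can authenticate a request, here is a minimal Python sketch. It generates a random 512-bit key and uses it to compute an HMAC-SHA256 signature over a description of a request. The function names and the simplified string being signed are assumptions made for illustration; the real service defines its own canonical request format, so treat this as a sketch of the idea rather than the service's actual protocol.

```python
import base64
import hashlib
import hmac
import os


def generate_account_key() -> str:
    """Return a random 512-bit (64-byte) key, base64-encoded. Illustrative only."""
    return base64.b64encode(os.urandom(64)).decode("ascii")


def sign_request(account_key_b64: str, string_to_sign: str) -> str:
    """Sign a request description with HMAC-SHA256 using the shared key.

    string_to_sign is a simplified placeholder, not the service's real format.
    """
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical account and container names, used only for this example.
example_key = generate_account_key()
example_signature = sign_request(example_key, "GET\n/myaccount/backups")
```

Because only a holder of the key can produce a matching signature, the service can tie each request to one storage account. Having two keys makes it possible to rotate one while the other stays in use.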

Working with the Windows Azure trial, I quickly found that browser-based access is best for managing the service, though not for managing the data. To manage the data I used a third-party tool, CloudBerry Explorer for Windows Azure, which is a free download. Once it was installed, logging into my Windows Azure account with the account name and shared key got me into the service in no time at all.

Microsoft offers geo-replication of your information, which enables data replication between two sub-regions. In my case these were "uswest" and "useast," providing multiple locations for my information.

As part of the management portal provided with Windows Azure, you can also monitor the service. This will provide information about your storage accounts and containers, including information about geo-replication if enabled.

Correction

Editor’s note: An earlier version of this article erroneously referred to Rackspace Cloud Files as Cloud Block Storage. The author actually tested Cloud Files, an offering more analogous to Amazon S3, but inadvertently referred to it as Cloud Block Storage. This updated version includes pricing for Cloud Files and a corrected screen capture.

Storing Data
How to get files and information into and out of the Windows Azure service is not immediately clear in the browser. A client for accessing the service should be available from the portal, but I wasn't able to locate one. However, the free CloudBerry tool let me write data to the configured space immediately, so this was not a huge problem.

Outside of the need to find a client to access the storage, the service worked very well and was easy to move around in and use. Windows Azure also supports Windows PowerShell cmdlets for managing the service.

Backup Options
Microsoft now offers a significant amount of its product stack in the cloud. That includes its Database as a Service (SQL Database, formerly known as SQL Azure Database) and Windows Azure Active Directory (in beta at press time), as well as VM roles running OSes from Windows Server 2012 and Linux. These services can integrate with your on-premises apps as well.

One of the newest solutions Microsoft has made available for preview is an online backup service that works via an agent deployed in your environment. Using the agent, the backups are written back to the service as they're collected. Storing the snapshots directly in the cloud with Windows Azure removes the need for archiving backup data to the cloud.

I really like the portal that controls the Windows Azure service. Normally this isn't something I'd call out because the configuration screens are generally tolerable, but Microsoft kept things all in one contained space and made navigation and discovery of the tools and portal for all the Windows Azure services easy. Help was also easy to locate.

Once I began placing files into the service, it operated as expected and it was easy to move data into and out of the containers. The easy expansion into other areas of the Microsoft product stack is also worth considering. Maybe today your organization wants to place a few gigabytes of backup data or limited-use files in the cloud to conserve on-site disks and enable off-site storage, but in the future you might need to look at expanding Active Directory or SQL Server. Windows Azure supports both of these services, and both the quality and quantity of services being added to Windows Azure are growing all the time.

Pricing information beyond the 90-day trial is available on the Windows Azure Web site. Because the service is priced on demand and charges only for the space and resources used, the cost will vary depending on how your organization uses the service. Keep this in mind and consult the pricing calculator to understand the costs of the Windows Azure service up front.

Amazon S3
Amazon Simple Storage Service (S3) allows any type of storage at a per-gigabyte cost to its customers. Because the service allows you to store any kind of data using a Web browser client or a third-party application, it can be extremely useful for both general data storage and for creating backups. When paired with the Amazon pricing model, customer organizations are charged only for the specific amount of data stored and for the cost of data transfer into the service.
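
A pay-for-what-you-use model like this is easy to reason about with a little arithmetic. The Python sketch below estimates a monthly bill from the amount of data stored and transferred. The function name and the rates in the example are placeholders chosen for illustration, not actual Amazon prices, so substitute the provider's published figures.

```python
def estimate_monthly_cost(stored_gb: float, transferred_gb: float,
                          storage_rate: float, transfer_rate: float) -> float:
    """Estimate a monthly bill in dollars under per-gigabyte pricing.

    storage_rate and transfer_rate are dollars per GB and are supplied by the
    caller; the values used below are hypothetical, not published prices.
    """
    if min(stored_gb, transferred_gb, storage_rate, transfer_rate) < 0:
        raise ValueError("quantities and rates must be non-negative")
    return round(stored_gb * storage_rate + transferred_gb * transfer_rate, 2)


# Example with placeholder rates: 25GB stored and 5GB transferred.
example_cost = estimate_monthly_cost(25, 5, storage_rate=0.10, transfer_rate=0.05)
```

With these placeholder rates the example works out to $2.75 per month, which shows how small a limited backup set can be in cost terms and how directly the bill scales with the amount of data.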

I've used Amazon S3 on a few occasions in the past to speed up the download of larger files for coworkers in remote locations. That was more for the improved bandwidth than for the storage, but the experience has been a good one.

Amazon S3 stores your data in buckets, which behave like folders within your storage space. The buckets are similar to the containers used by Windows Azure, but Amazon explained them a bit better when I was getting started. The use of separate buckets also allows different security settings to be applied. Suppose you created a bucket to contain downloadable customer literature. You could configure the security on the bucket to make it visible to everyone. This way, linking to it from the customer area of a Web site or in an e-mail would be a simple way to offload the storage of downloadable content.

Creating a separate bucket for your backup files with visibility and access limited to your IT organization allows the files to be offloaded or even archived in the cloud while preventing others from accessing the content.
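
One way to express the public-literature case is an S3 bucket policy that grants anonymous read access to the objects in a single bucket. The Python sketch below builds such a policy document; the function name, statement ID and bucket name are hypothetical. The backup bucket simply receives no such policy, so it stays private to the account.

```python
import json


def public_read_policy(bucket_name: str) -> dict:
    """Build a bucket policy allowing anyone to download objects from one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadForDownloads",  # hypothetical label
                "Effect": "Allow",
                "Principal": "*",                 # everyone, including anonymous users
                "Action": "s3:GetObject",         # read objects only, no listing or writing
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# "customer-literature" is a made-up bucket name for this example.
policy_json = json.dumps(public_read_policy("customer-literature"), indent=2)
```

The resulting JSON can be attached to the literature bucket through the console or a client tool, keeping the two security postures separate at the bucket level.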

The S3 service includes a browser-based client for managing buckets as well as file transfer (see Figure 2). Be aware, however, that it's not as user-friendly as any of the third-party clients I've tried. Because the security- and bucket-management features are built into these clients, they streamline access considerably while using a key pair to ensure data is secured.


Figure 2. Upload files to Amazon S3 through a simple interface.

Also, like the Windows Azure service, S3 has other counterparts in Amazon cloud services offerings such as its new data archiving service called Glacier. These services allow many similar possibilities for additional compute and database needs.

Amazon S3 is also priced per used-storage quantity at the gigabyte level. More information on the cost of the service can be found at amzn.to/4Dgfzc.

Rackspace Cloud Files
The offering from Rackspace is based on OpenStack, the open source cloud infrastructure project it launched in collaboration with NASA (and which is now being spun off into an independent foundation). With about 200 supporters -- including Cisco Systems Inc., Dell Inc., HP and IBM -- OpenStack-based clouds are designed to ensure portability among public providers as well as private clouds that use the OpenStack software.

I received a 90-day trial of Rackspace Cloud Files -- which, like Windows Azure and Amazon S3, includes block storage and VM compute power. The Rackspace cloud service interface was very intuitive to get started with. It was mostly straightforward and available right after logging in.

Some of the configuration wizards and other in-browser windows never seemed to fit on the screen well enough to show all of the options. That's something to consider, because ease of configuration is part of why these cloud services have become so popular.

When creating a storage container, the available datacenters to hold the containers for an account are based on the region the account is created in. When I created storage containers, I was able to place them in Dallas or Chicago. Then, naming the container and clicking "Create Volume" was all that I needed to do to get started (see Figure 3). It's very simple to get off the ground, which is helpful if you need to get things moving quickly.


Figure 3. The Rackspace Cloud Files login interface provides basic configuration choices.

Once the containers were created, I was able to upload files directly using a browser-based client (there was also the option to download a third-party client for multithreaded upload).

In addition to storage containers, Rackspace provides other services including load balancing, DNS servers, VMs and a dedicated backup solution. The Rackspace backup solution uses an agent -- much like the Windows Online Backup solution -- to capture your information directly to cloud-based storage.

Pricing information for Rackspace Cloud Files is available on the Rackspace Web site.

Overall Impressions
Amazon S3 seemed easier to use than Windows Azure when working through a browser. Because of the nature of operations such as backup, it might be best to consider a third-party client to allow for scheduling, multithreaded file copy and overall ease of use.
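
To show what multithreaded file copy means in practice, the Python sketch below runs several uploads at once with a thread pool. It is not how any particular third-party client is implemented; the `upload_all` helper is hypothetical, and `fake_upload` is a stand-in that would be replaced by a real transfer call for the chosen provider.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def upload_all(paths: Iterable[str], upload_one: Callable[[str], int],
               workers: int = 4) -> Dict[str, int]:
    """Run upload_one over every path using a pool of worker threads."""
    path_list = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so results line up with path_list.
        results = list(pool.map(upload_one, path_list))
    return dict(zip(path_list, results))


def fake_upload(path: str) -> int:
    """Stand-in for a real transfer; returns a pretend byte count."""
    return len(path)


# Hypothetical backup file names.
example_result = upload_all(["a.bak", "bb.bak"], fake_upload, workers=2)
```

Threads suit this kind of work because each upload spends most of its time waiting on the network, so several transfers can overlap on one connection.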

With that one exception, the services fared about the same in my testing and usage. Because I used only the free trial services and a limited dataset made up of the same set of files (approximately 25GB of data), the operations were acceptable. If you're working with larger datasets your mileage will vary, and many of the results will depend on the speed of the Internet connection being used.

Windows Azure, Amazon S3 and Rackspace Cloud Files are all extremely capable services for parking files, whether primary or backup. The fact that Rackspace included I/O with the cost of the service was definitely a plus. However, the other services were not terribly expensive, especially if the files being placed in the cloud aren't intended to be primary storage. Because the cloud services are sitting on large Internet connections all over the globe, the bandwidth on the remote end should not be an issue.

Consider Data Types
Pay attention to the types of data your organization is backing up to the cloud. Because of network latency and varying speeds of Internet connection, there will be some delay when moving files into and out of the cloud. Because of these issues, and the possible need to recover certain information more regularly (misplaced files, for example), using cloud storage for an archival solution that may not be needed as readily might be the best use case.
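
A rough estimate makes the delay concrete. The Python sketch below converts a dataset size and a link speed into transfer hours. The function name, the 10Mbps link and the 80 percent default efficiency factor are assumptions for illustration, and the calculation uses decimal gigabytes and ignores latency and protocol details.

```python
def transfer_hours(size_gb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate hours needed to move size_gb over a link of link_mbps.

    efficiency is an assumed fraction of the nominal link speed actually
    achieved; 1GB is treated as 10**9 bytes.
    """
    if size_gb < 0 or link_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, speed > 0 and 0 < efficiency <= 1")
    bits = size_gb * 1e9 * 8
    seconds = bits / (link_mbps * 1e6 * efficiency)
    return seconds / 3600


# A 25GB dataset over a hypothetical 10Mbps link running at full speed.
ideal_hours = transfer_hours(25, 10, efficiency=1.0)
```

Even under ideal conditions, a 25GB set on that hypothetical 10Mbps link takes about 5.6 hours, which is why data that must be restored quickly is a poor fit for a slow connection.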

However, Internet connections are becoming more reliable and faster. Many organizations are considering redundancy between multiple connections, so using the cloud as a primary storage location might be a viable alternative for some organizations today. For those wanting added redundancy, you may want to use multiple cloud providers as well.

This evaluation covered only the core storage services from each provider, and did not take into account services such as Amazon Glacier.

In addition, all of the providers are working to provide multiple technologies from each cloud platform. Currently, I find the cloud most useful for storage and bandwidth, because it can improve the usability experience for users accessing your content from Internet connections that have different speeds than are available within your organization.
