Random Access

Evaluating Storage Data Protection Methods

With so many options for those looking to protect enterprise data, here's a look at some of the top methods and why they may or may not be right for you.

In any storage system, one task is paramount over all others. That task is data protection. There is no storage vendor on the planet that wants its customers to fall victim to what are commonly termed data loss events. In such events, as you might expect, a confluence of factors comes together, with the end result being the loss of critical company data.

Over the years, a number of different methods have been implemented in storage systems to help organizations protect against data loss, and these methods have had their ups and downs. In this article, you will learn about some of these data protection mechanisms. But, before you do, understand this one key fact: No data protection mechanism can guarantee 100 percent data protection. There is always a point at which data loss will occur. The goal is to choose a data protection method that can withstand reasonable disruption without loss of data. And, always make sure you have good backups!

RAID
RAID. It's been around for a really long time and, over the years, people have come to have a love/hate relationship with it. It's tough to categorize RAID as just a single data protection method, though. After all, there are a number of what have become "standard" RAID levels as well as a number of non-standard or obsolete RAID levels. Further, some vendors have implemented proprietary RAID schemes that don't fit any existing categories.

Different RAID levels work in different ways:

  • RAID 0. Although this level has "RAID" in its name, it actually offers no data protection. This method simply stripes data across all of the available disks in the disk set, improving overall performance. As such, RAID 0 is purely a performance boost.
  • RAID 1, RAID 10. RAID levels 1 and 10 implement a mirroring mechanism whereby data is written to multiple drives in the array. These RAID levels require 100 percent capacity overhead in order to operate.
  • RAID 5, RAID 6, RAID 50, RAID 60. These RAID levels stripe data across all available disks, but they also write parity information to disks in the array. This parity information is calculated by the RAID controller and is used to recover data in the event that one (RAID 5/50) or up to two (RAID 6/60) disks in the array are lost. (See the parity sketch after this list.)
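
To make the parity mechanism concrete, here is a minimal sketch in Python of how a RAID 5-style controller computes parity and rebuilds a lost disk. The three-disk stripe, the block contents and the function name are illustrative assumptions, not any vendor's actual implementation.

    # Minimal illustration of RAID 5-style parity: the parity block is
    # the XOR of the data blocks in a stripe, so any single lost block
    # can be rebuilt by XOR-ing the survivors with the parity block.

    def xor_blocks(blocks):
        """XOR equal-length byte strings together, byte by byte."""
        result = bytearray(len(blocks[0]))
        for block in blocks:
            for i, byte in enumerate(block):
                result[i] ^= byte
        return bytes(result)

    # One stripe across three hypothetical data disks, plus parity.
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    parity = xor_blocks([d0, d1, d2])

    # Disk 1 fails: rebuild its block from the survivors and parity.
    rebuilt = xor_blocks([d0, d2, parity])
    assert rebuilt == d1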

Regardless of the RAID level, RAID is implemented in one of two ways: in hardware or in software. Most commonly, organizations opt for hardware-based add-in RAID adapters to which the organization's storage is connected. Such cards can be installed in servers or can be used in an external storage array to provide data protection benefits wherever storage happens to reside. Alternatively, RAID can be implemented in software, typically by the operating system or by the storage platform itself, trading some host CPU cycles for the cost of a dedicated controller.

In recent years, common RAID levels have come under fire as disks get bigger. The larger the disk, the longer it takes to rebuild the data. With RAID 5, for example, it can take days to rebuild a lost disk. During this time, the entire array is at risk; the loss of an additional disk during rebuild means the loss of all data on the array. It is for this reason that many organizations are moving to RAID 6, which can withstand the loss of two disks, or even RAID 10 mirroring.
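
To put rough numbers on that risk window: assuming a best-case sequential rebuild rate of about 100 MB/s, rebuilding a single 6 TB disk works out to roughly 6,000,000 MB / 100 MB/s, or about 60,000 seconds -- close to 17 hours -- and rebuilds on an array that is still serving production I/O routinely take far longer than that.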

Erasure codes
You normally wouldn't associate data protection with a mechanism that has the word "erasure" in its title. However, erasure codes -- also known as forward error correction -- are a data protection mechanism. Erasure codes work by breaking data objects into small fragments, each of which is stored independently of the others.

An erasure code provides redundancy by storing those fragments in different places. The key is that you can recover the data from any combination of a sufficient number of those fragments -- it doesn't matter which specific fragments survive.

In short, here's a quick look at how erasure codes work their magic:

  • Data is broken down into m fragments.
  • The data is then recoded into n fragments. Here, n is always greater than m, so there are a greater number of recoded fragments than original ones. The greater the number of recoded fragments, the better the data protection. Bear in mind that the data can be reconstructed from any m of these n fragments.
  • The amount of storage required is 1/(m/n) -- that is, n/m -- times the original data. So suppose you have m = 10 fragments and choose to recode into n = 16 fragments. Doing the math, we get 1/(10/16) = 1.6, so the system stores 1.6x the amount of original data. (A toy sketch of the encode/recover cycle follows this list.)
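
To see those steps in action, here is a toy sketch in Python of an erasure code built on polynomial evaluation over a prime field -- the same mathematical idea behind Reed-Solomon codes. For simplicity it treats each data byte as one fragment; the field size, evaluation points and function names are illustrative assumptions, and production systems use heavily optimized codecs rather than anything like this.

    # Toy erasure code over the prime field GF(257). The m data bytes
    # are the coefficients of a degree-(m-1) polynomial; evaluating it
    # at n distinct points yields n fragments, any m of which suffice
    # to reconstruct the data via Lagrange interpolation.

    P = 257  # prime modulus; every byte value 0..255 fits in the field

    def encode(data, n):
        """Recode m data bytes into n > m fragments as (x, y) pairs."""
        assert n > len(data)
        fragments = []
        for x in range(1, n + 1):          # evaluation points 1..n
            y = 0
            for coeff in reversed(data):   # Horner's rule
                y = (y * x + coeff) % P
            fragments.append((x, y))
        return fragments

    def decode(fragments, m):
        """Recover the m data bytes from any m surviving fragments."""
        points = fragments[:m]
        coeffs = [0] * m
        for i, (xi, yi) in enumerate(points):
            basis = [1]   # build the i-th Lagrange basis polynomial
            denom = 1
            for j, (xj, _) in enumerate(points):
                if j == i:
                    continue
                new = [0] * (len(basis) + 1)  # multiply basis by (x - xj)
                for k, b in enumerate(basis):
                    new[k] = (new[k] - xj * b) % P
                    new[k + 1] = (new[k + 1] + b) % P
                basis = new
                denom = (denom * (xi - xj)) % P
            scale = (yi * pow(denom, P - 2, P)) % P  # modular inverse
            for k, b in enumerate(basis):
                coeffs[k] = (coeffs[k] + scale * b) % P
        return bytes(coeffs)

    data = b"RAID!"                        # m = 5 data bytes
    frags = encode(data, n=8)              # recode into n = 8 fragments
    survivors = [frags[0], frags[2], frags[4], frags[6], frags[7]]
    assert decode(survivors, m=len(data)) == data  # any 5 of 8 suffice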

I've not used erasure code-based systems before. However, my colleague, David Floyer at Wikibon, is an expert in such things and has written about erasure codes in depth on the Wikibon site.

Data Replication
RAID has a lot of overhead. Between needing to calculate parity -- a processor-intensive activity -- and store extra data, organizations give up both processing power and disk capacity in order to meet data protection needs. In order to eliminate RAID from the equation, as well as the associated parity calculation cost, many emerging systems are turning to replication-based systems for data protection. After all, hard disks have become really cheap, so why not leverage their cost effectiveness and simply store copies of data at various locations in the array?

Data replication is also increasingly popular in many of today's hottest scale-out, cluster-based storage products. It's easier to manage whole copies of data across a cluster than it is to manage RAID stripes and parity. For most systems, the replication factor -- the number of copies of the data that are stored -- is configurable. The more copies of the data, the better protected the data.
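
As a rough sketch of how a cluster might place those copies, here is an illustrative Python fragment. The node names, the hash-based placement scheme and the replication factor are assumptions for demonstration, not any particular vendor's placement algorithm.

    # Illustrative replica placement: hash the object key to pick a
    # starting node, then place copies on the following distinct nodes.
    import hashlib

    NODES = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    REPLICATION_FACTOR = 3   # configurable number of copies, as above

    def replica_nodes(key, rf=REPLICATION_FACTOR):
        """Return the rf distinct nodes that should hold copies of key."""
        start = int(hashlib.md5(key.encode()).hexdigest(), 16) % len(NODES)
        return [NODES[(start + i) % len(NODES)] for i in range(rf)]

    print(replica_nodes("volumes/db01/block-0042"))
    # e.g. ['node-c', 'node-d', 'node-e'] -- three independent copies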

If disk prices were to jump again, I would expect to see this method decrease in popularity. For now, though, there are systems on the market that use a replication factor of three by default. When you consider that we now have 6 TB drives on the market, even having only one-third of the raw capacity usable isn't really all that bad.
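
To put that in concrete terms: with a replication factor of three, usable capacity is simply raw capacity divided by three, so a hypothetical shelf of twenty-four 6 TB drives (144 TB raw) still yields roughly 48 TB of fully protected capacity.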

About the Author

Scott D. Lowe is the founder and managing consultant of The 1610 Group, a strategic and tactical IT consulting firm based in the Midwest. Scott has been in the IT field for close to 20 years and spent 10 of those years filling the CIO role for various organizations. He has also authored or co-authored four books and is the creator of 10 video training courses for TrainSignal.

