Evaluating Storage Data Protection Methods -- Redmondmag.com

With many options for those looking to protect enterprise data, here are some of the top methods and why they may or may not be right for you.

By Scott D. Lowe
02/12/2014

In any storage system, one task is paramount over all others: data protection. There is no storage vendor on the planet that wants its customers to fall victim to what are commonly termed data loss events. In such events, as you might expect, a confluence of factors comes together, with the end result being the loss of critical company data.

Over the years, a number of different methods have been implemented in storage systems to help organizations protect against data loss, and these methods have had their ups and downs. In this article, you will learn about some of these data protection mechanisms. But, before you do, understand this one key fact: No data protection mechanism can guarantee 100 percent data protection. There is always a point at which data loss will occur. The goal is to choose a data protection method that can withstand reasonable disruption without loss of data. And, always make sure you have good backups!

RAID
RAID. It's been around for a really long time and, over the years, people have come to have a love/hate relationship with it. It's tough to categorize RAID as just a single data protection method, though. After all, there are a number of what have become "standard" RAID levels as well as a number of non-standard or obsolete RAID levels. Further, some vendors have implemented proprietary RAID schemes that don't fit any existing categories.

Different RAID levels work in different ways:

RAID 0. Although this level has "RAID" in it, it actually offers no data protection. This method simply stripes data across all of the available disks in the disk set, improving overall performance. As such, RAID 0 is purely a performance boost.
RAID 1, RAID 10. RAID levels 1 and 10 implement a mirroring mechanism whereby the same data is written to multiple drives in the array; RAID 10 additionally stripes data across the mirrored sets. With two-way mirroring, these RAID levels require 100 percent capacity overhead in order to operate.
RAID 5, RAID 6, RAID 50, RAID 60. These RAID levels stripe data across all available disks, but then also write parity information to disks in the array. This parity information is calculated by the RAID controller and is used to recover data in the event that up to one (RAID 5/50) or two (RAID 6/60) disks are lost. For RAID 50 and RAID 60, that limit applies to each underlying RAID 5 or RAID 6 set.
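
To make the parity idea concrete, here is a minimal sketch of single-parity recovery in the style of RAID 5. It is an illustration only: the block contents and the xor_blocks helper are made up for this example, real controllers rotate parity across the disks, and RAID 6 adds a second, differently computed parity block that is not shown here.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data disk in a stripe.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is the XOR of all data blocks in the stripe.
parity = xor_blocks(data_blocks)

# Simulate losing the second disk: only the other data blocks and parity remain.
survivors = [data_blocks[0], data_blocks[2], parity]

# XOR-ing the survivors cancels out the known blocks and leaves the lost one.
rebuilt = xor_blocks(survivors)
print(rebuilt == data_blocks[1])
```

Because XOR is its own inverse, one parity block is enough to rebuild any one missing block in the stripe, but not two.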

Regardless of the RAID level, RAID is implemented in one of two ways: in hardware or in software. Most commonly, organizations opt for hardware-based add-in RAID adapters to which the organization's storage is connected. Such cards can be installed in servers or can be used in an external storage array to provide data protection benefits wherever storage happens to reside. The alternative is software RAID, in which the operating system or storage software does the same work using the host's own processor.

In recent years, common RAID levels have come under fire as disks get bigger. The larger the disk, the longer it takes to rebuild the data. With RAID 5, for example, it can take days to rebuild a lost disk. During this time, the entire array is at risk; the loss of an additional disk during rebuild means the loss of all data on the array. It is for this reason that many organizations are moving to RAID 6, which can withstand the loss of two disks, or even RAID 10 mirroring.
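
As a rough back-of-the-envelope sketch of why bigger disks mean longer rebuilds, the calculation below divides disk capacity by a sustained rebuild rate. The rebuild_hours function and the rates passed to it are assumptions chosen for illustration, not measurements; actual rebuild speed depends on the controller, the RAID level and how busy the array is while it rebuilds.

```python
def rebuild_hours(disk_tb, rebuild_mb_per_s):
    """Estimate hours to rewrite one full disk at a steady rebuild rate.

    disk_tb          -- disk capacity in decimal terabytes (assumed fully rebuilt)
    rebuild_mb_per_s -- assumed sustained rebuild rate in megabytes per second
    """
    disk_bytes = disk_tb * 1e12
    seconds = disk_bytes / (rebuild_mb_per_s * 1e6)
    return seconds / 3600

# Compare a few hypothetical disk sizes and rebuild rates.
for size_tb in (1, 4, 6):
    for rate in (20, 50, 100):
        hours = rebuild_hours(size_tb, rate)
        print(f"{size_tb} TB at {rate} MB/s: about {hours:.1f} hours")
```

At a throttled rate on a busy array, a multi-terabyte disk quickly moves from hours into days, which is the window during which a RAID 5 array has no remaining protection.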

Erasure codes
You normally wouldn't associate data protection with a mechanism that has the word "erasure" in its title. However, erasure codes -- also known as forward error correction -- are a data protection mechanism. Erasure codes work by breaking data objects into small fragments and adding redundant fragments computed from them. Each fragment is stored independently of the others, and the data can be recovered from any sufficiently large subset of these fragments.

Put another way, an erasure code provides redundancy by storing the fragments in different places. The key is that you don't need all of the fragments back: a smaller number of them, in any combination, is enough to recover the data.

In short, here's a quick look at how erasure codes work their magic:

Data is broken down into m fragments.
The data is then recoded into n fragments. Here, n is always greater than m, so there are more recoded fragments than original ones. The larger n is relative to m, the better the data protection, because the data can be reconstructed from any m of the n fragments. In other words, up to n - m fragments can be lost.
The amount of storage required, relative to the original data, is n/m, or (1/(m/n)). So suppose you break the data into ten fragments (m = 10) and choose to recode them into sixteen fragments (n = 16). Doing the math, we get (1/(10/16)), which means the encoded data takes up 1.6x the space of the original data.
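
The m-of-n idea above can be sketched in a few lines. This is a toy Reed-Solomon-style code built on polynomial interpolation over the integers modulo 257, not the scheme used by any particular product; the function names (encode, decode, storage_overhead) and the sample data are assumptions made up for this example. It needs Python 3.8 or later for the modular inverse via pow().

```python
P = 257  # smallest prime above 255, so every byte value fits in the field

def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points`, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total

def encode(data, n):
    """Recode m data symbols into n fragments (position, value); requires n < P.

    The first m fragments are the data itself; the rest are redundant
    values taken from the same polynomial at further positions.
    """
    m = len(data)
    points = [(pos + 1, value) for pos, value in enumerate(data)]
    extra = [(x, _interpolate(points, x)) for x in range(m + 1, n + 1)]
    return points + extra

def decode(fragments, m):
    """Rebuild the m original symbols from any m surviving fragments."""
    if len(fragments) < m:
        raise ValueError("too few fragments to recover the data")
    chosen = fragments[:m]
    return [_interpolate(chosen, x) for x in range(1, m + 1)]

def storage_overhead(m, n):
    """Storage needed relative to the original data: n/m."""
    return n / m

data = list(b"Hello")               # m = 5 symbols
fragments = encode(data, 8)         # n = 8 fragments
survivors = fragments[3:]           # lose the first three fragments
print(bytes(decode(survivors, 5)))  # the original five bytes
print(storage_overhead(10, 16))     # the 10-into-16 example from the text
```

Any five of the eight fragments would work equally well here, because five points are enough to pin down a polynomial of degree four. Production systems use the same principle with faster arithmetic over larger blocks.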

I've not used erasure code-based systems before. However, my colleague, David Floyer at Wikibon, is an expert in such things. He has written about erasure codes here.

Data Replication
RAID has a lot of overhead. Between needing to calculate parity -- a processor-intensive activity -- and store extra data, organizations give up both processing power and disk capacity in order to meet their data protection needs. In order to eliminate RAID from the equation, as well as the associated parity calculation cost, many emerging systems are turning to data replication for data protection. After all, hard disks have become really cheap, so why not leverage their cost effectiveness and simply store copies of data at various locations in the array?

Data replication is also increasingly popular in many of today's hottest scale-out, cluster-based storage products. It's easier to manage data copies in a cluster than it is to manage RAID across one. For most systems, the replication factor -- the number of copies of the data that are stored -- is configurable. The more copies of the data, the better protected the data.

If disk prices were to jump again, I would expect to see this method decrease in popularity. For now, though, there are systems on the market that by default use a replication factor of three. When you consider that we now have 6 TB drives on the market, even having only 1/3 of the raw capacity available isn't really all that bad.
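
To put the capacity trade-off in numbers, the sketch below compares how much usable space a hypothetical shelf of twelve 6 TB drives would yield under replication, parity RAID and the 10-into-16 erasure code discussed earlier. The drive count and function names are assumptions for illustration, and the figures ignore spares, metadata and file system overhead.

```python
def replication_usable(raw_tb, copies):
    """Usable capacity when every piece of data is stored `copies` times."""
    return raw_tb / copies

def parity_usable(raw_tb, disks, parity_disks):
    """Usable capacity for one parity set (1 parity disk ~ RAID 5, 2 ~ RAID 6)."""
    return raw_tb * (disks - parity_disks) / disks

def erasure_usable(raw_tb, m, n):
    """Usable capacity when m data fragments are recoded into n fragments."""
    return raw_tb * m / n

disks, disk_tb = 12, 6          # hypothetical shelf: twelve 6 TB drives
raw = disks * disk_tb

layouts = {
    "Replication factor 3": replication_usable(raw, 3),
    "Mirroring (2 copies)": replication_usable(raw, 2),
    "Single parity, 12 disks": parity_usable(raw, disks, 1),
    "Double parity, 12 disks": parity_usable(raw, disks, 2),
    "Erasure code, m=10, n=16": erasure_usable(raw, 10, 16),
}

for name, usable in layouts.items():
    print(f"{name}: {usable:.0f} TB usable of {raw} TB raw")
```

Replication is the most expensive option in raw capacity, which is why its appeal depends so heavily on disk prices staying low.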

About the Author

Scott D. Lowe is the founder and managing consultant of The 1610 Group, a strategic and tactical IT consulting firm based in the Midwest. Scott has been in the IT field for close to 20 years and spent 10 of those years filling the CIO role for various organizations. He has also authored or co-authored four books and is the creator of 10 video training courses for TrainSignal.
