Backing up data has been a mantra of network administration and security best practices for as long as there have been networks, administrators or best practices. Events such as the terrorist attacks on Sept. 11, 2001, or the devastation of Hurricane Katrina on Aug. 29, 2005, painfully illustrate the need for having a current backup of data available in case of an emergency.
If the need to prepare for catastrophic events weren’t enough to compel organizations to develop a backup and recovery plan, legislative mandates such as the Sarbanes-Oxley financial and accounting disclosure requirements, and industry initiatives such as the Payment Card Industry Data Security Standard, require organizations to implement some level of data backup in order to be compliant.
Many organizations have purchased and deployed various data backup technologies and implemented some form of periodic backup schedule only to find that the backup data is somehow corrupt, or that a significant amount of data is not recoverable because of the time between backups. Ideally, you would be able to restore the system to the way it was at the time it crashed rather than having to restore it to a point in time last week or last month, for instance. If a system crashes now, you want to restore it to a state as close to now as possible.
Continuous data protection captures and saves data on a continuous basis. Rather than only performing backups at 2 a.m. on Fridays, for example, CDP solutions automatically save every change made to any data on target systems.
When it is initially deployed, a CDP solution typically will run a complete backup of all data on the target systems and then record any new files or changes to existing files as the data is modified or added to the target system. Rather than recording backup data to slow and cumbersome tape drives, CDP backs up to disk drives.
Restoring data is also much simpler and more efficient with a disk-based CDP solution. Instead of having to sort through backup tapes to find the tape containing the data in question, and then going through the process of mounting the tape and hunting for the data you want restored, a disk-based solution is usually available online in real time, and restoring data is as simple as moving a file from one drive to the next. Some CDP solutions even provide a means for users to access and restore their data via a Web-based interface.
Traditional data backup solutions only perform backups on a scheduled or manual basis. Depending on the software used for backup, it may be possible to schedule automated backups hourly, daily, weekly or on some other customized schedule. Performing the actual backup often devours significant amounts of system resources and network bandwidth, and even an hourly backup has the potential to lose up to an hour of data, which may be unacceptable for some businesses that might process thousands of transactions an hour.
Traditional data backups also consume a lot of space. Most solutions provide the ability to perform incremental backups that only capture new or changed information rather than backing up all of the data each time. Even so, traditional incremental backups generally copy the entire file whenever any part of it changes. Many CDP solutions record changes at the byte or block level, capturing and saving only the actual bytes that are altered rather than capturing the entire file again each time it changes.
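To make the block-level idea concrete, here is a minimal Python sketch. It is not how any particular CDP product works; the function names changed_blocks and apply_delta and the tiny 4-byte block size are assumptions chosen purely for illustration. It compares two versions of a file block by block and keeps only the blocks that differ.

```python
BLOCK_SIZE = 4  # illustrative only; real products typically use much larger blocks


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Return {block_index: new_block_bytes} for every block that differs."""
    delta = {}
    n_blocks = (max(len(old), len(new)) + block_size - 1) // block_size
    for i in range(n_blocks):
        start = i * block_size
        old_block = old[start:start + block_size]
        new_block = new[start:start + block_size]
        if old_block != new_block:
            delta[i] = new_block  # store only the altered block
    return delta


def apply_delta(old: bytes, delta: dict, new_length: int,
                block_size: int = BLOCK_SIZE) -> bytes:
    """Rebuild the new version from the old version plus the changed blocks."""
    n_blocks = (new_length + block_size - 1) // block_size
    out = bytearray()
    for i in range(n_blocks):
        start = i * block_size
        # Use the saved block if this one changed, otherwise reuse the old data.
        out += delta.get(i, old[start:start + block_size])
    return bytes(out[:new_length])


old_version = b"AAAABBBBCCCC"
new_version = b"AAAAXXBBCCCC"
delta = changed_blocks(old_version, new_version)
restored = apply_delta(old_version, delta, len(new_version))
```

In this example only one of the three blocks is saved, yet the new version can still be rebuilt in full from the earlier copy plus that one block.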
In many ways, CDP sounds similar to a redundant array of independent disks configuration. With RAID 1, a pair of disk drives will act as mirrors of each other. Every bit of data that is written to one drive is simultaneously written to the second drive, as well. If either of the drives in the RAID 1 configuration crashes or dies, the data is not lost because it all exists on the other drive.
There are two key differences between CDP and RAID. First is distance. With RAID, the multiple drives are usually housed together, often inside the server for which they are holding data. A catastrophe that knocks out the server will affect both drives and all of the data could potentially be lost. With CDP, the data is written to a remote drive that provides an extra layer of protection from events that affect the drives in the server itself.
The second difference is the preservation of previous versions. RAID drives only contain the current version of the data. They provide a level of redundancy to safeguard against data loss if one drive crashes, but they do not provide the ability to recover a version of a file that was backed up yesterday, or even an hour ago. CDP solutions are configured with a retention period and provide the ability to restore a file to the state it was in the previous hour, yesterday or last week, depending on the setting of the retention period.
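The interplay between version history and the retention period can be sketched in a few lines of Python. The VersionJournal class below is hypothetical and keeps everything in memory; it simply assumes each change is recorded with a timestamp, that a restore asks for the state "as of" a given moment, and that versions older than the retention period are discarded.

```python
from bisect import bisect_right


class VersionJournal:
    """Hypothetical in-memory journal of timestamped versions of one file."""

    def __init__(self, retention_seconds):
        self.retention_seconds = retention_seconds
        self._times = []     # write timestamps, ascending
        self._contents = []  # file content recorded at each timestamp

    def record(self, timestamp, content):
        """Record a new version; timestamps are assumed to arrive in order."""
        self._times.append(timestamp)
        self._contents.append(content)

    def restore(self, as_of):
        """Return the content as it stood at time as_of, or None if unknown."""
        i = bisect_right(self._times, as_of)
        return self._contents[i - 1] if i else None

    def prune(self, now):
        """Drop versions that fall outside the retention period.

        The newest version at or before the cutoff is kept so that the
        state at the start of the retention window stays restorable.
        """
        cutoff = now - self.retention_seconds
        keep_from = max(bisect_right(self._times, cutoff) - 1, 0)
        del self._times[:keep_from]
        del self._contents[:keep_from]


journal = VersionJournal(retention_seconds=3600)  # one-hour retention
journal.record(0, b"draft 1")
journal.record(1800, b"draft 2")
journal.record(5000, b"draft 3")
before_prune = journal.restore(100)   # state shortly after the first write
journal.prune(now=6000)               # versions older than t=2400 expire
after_prune = journal.restore(100)    # that older state is no longer available
```

A RAID mirror, by contrast, would hold only the last of these three versions. The longer the retention period, the further back a restore can reach, at the cost of more disk space.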
The case for a continuous backup solution over a traditional backup solution seems to be pretty solid. Disk drive capacity is relatively cheap right now, and disk-based backup seems to be faster, cheaper and more efficient than the traditional tape backups. Writing changes to disk and recording only altered bytes or blocks rather than entire files also make CDP solutions much less demanding on system resources and network bandwidth.
There are a number of backup solutions available that offer continuous backup functionality, including IBM Tivoli Continuous Data Protection for Files – license (CDW price $41.99) at http://www.cdw.com/shop/products/default.aspx?EDC=857720; CA XOsoft WANSync Standard Server – (v.4) – license (call CDW for price and availability) at http://www.cdw.com/shop/products/default.aspx?EDC=1062598; and Symantec Backup Exec 10d for Windows Servers – (v.10.1) – complete package (CDW price $509.99) at http://www.cdw.com/shop/products/default.aspx?EDC=862200
When shopping for a CDP solution, make sure you read the fine print. There is still some debate in the industry about the definition of “continuous.” Some products that are marketed as continuous backup are actually solutions that take snapshots of a given moment in time. They may capture the state of the system every few seconds or every few minutes rather than capturing every single instance of data being written. These solutions, often known as near-CDP, may be better than traditional backup but are not truly continuous, so make sure you know what you are purchasing.
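The difference between the two approaches is easy to see in a small Python sketch. The functions and the sample write times below are made up for illustration and do not model any vendor's product: one function lists the states a periodic snapshot would capture, the other lists the states a product recording every write would capture.

```python
def snapshot_states(writes, interval, end_time):
    """States a periodic-snapshot (near-CDP) approach would capture.

    writes is a list of (timestamp, value) pairs in ascending time order.
    A snapshot taken at time t sees only the newest write at or before t.
    """
    captured = []
    last_index = -1
    t = interval
    while t <= end_time:
        # Index of the newest write at or before this snapshot time.
        index = max((i for i, (ts, _) in enumerate(writes) if ts <= t),
                    default=-1)
        if index > last_index:
            captured.append(writes[index][1])
            last_index = index
        t += interval
    return captured


def continuous_states(writes):
    """States a true CDP approach would capture: every single write."""
    return [value for _, value in writes]


# Three writes at 1, 2 and 7 seconds; snapshots every 5 seconds.
writes = [(1, "a"), (2, "b"), (7, "c")]
near_cdp = snapshot_states(writes, interval=5, end_time=10)
true_cdp = continuous_states(writes)
```

Here the snapshot approach never sees state "a", because it was overwritten before the first snapshot was taken, while the continuous approach retains all three states as restore points.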