
Yosemite Technologies Vtl Whitepaper


  • 5/22/2018 Yosemite Technologies Vtl Whitepaper


    Virtual Tape Libraries: The Best of Tape and Disk Backup

    By Jim Lee, Director of Marketing, Yosemite Technologies
    www.yosemitetech.com

    Tape backup has traditionally been the mainstay of enterprise data protection when long-term data protection is required. As disk technologies have improved and economies of scale have driven down prices, disk adoption rates have surged, and disk now appears poised to eclipse tape as the dominant backup platform and relegate tape to a minor archival and disaster recovery role. Numerous surveys of the enterprise have indicated that adoption of disk-based backup reached the ninetieth percentile by the end of 2005 and that disk will soon be the dominant backup medium for the enterprise. But the flexibility, plunging costs, and commoditisation of disk have quickly led to a proliferation of ad hoc and confusing disk backup implementations which fail to provide many of the core data protection benefits of tape and even introduce new problems into an organisation's backup regime. This article briefly reviews the disk-versus-tape debate, examines the pros and cons of simple disk backup, and proposes the virtual tape library (VTL), which uses disk as the backup medium, as the best replacement for traditional tape backup. Finally, it discusses why not all VTLs are created equal and what to look for in a VTL when considering one as an enterprise tape backup replacement.

    Tape is Problematic as a Backup Medium

    Tape-based backup has been the dominant medium and method for data protection since the mainframe computing era and faced little challenge from competing technologies until recently. The problems with tape as a backup medium are well known. The most critical flaws exposed when relying on tape as a backup medium are:

    - Medium or cartridge failure during backup or restore

    - Unreported failures or incomplete backups

    - Misplaced or mislabelled media

    - Long backup windows and verify times

    - Slow and serial access times

    The rapid adoption of disk backup as a complement to traditional tape has been driven not only by the need for a better backup medium but also by the expansion of data protection and service level requirements in the modern enterprise. First, tape is simply too slow and unreliable to serve as the primary backup medium for the always-on, 24x7 operational mode of most enterprises. Second, disk-based backup can take many forms thanks to its superior flexibility, performance, and manageability, which substantially enlarges the solution space for protecting more complex IT infrastructures. Lastly, the rapid evolution of disk technologies and economies of scale have allowed disk backup to drop sufficiently in cost to be regarded as superior to tape for most equivalent backup tasks. The relative strengths and weaknesses of disk- and tape-based backup schemes are shown in Table 1 below.

    Table 1 - Comparison of Disk versus Tape for Primary Data Protection

    Feature Disk Tape

    Reliability x

    Performance x

    Efficiency x

    Flexibility x

    Expense x

    Portability x

    Multi-level x

    Scalability x

    Durability x

    Redundancy x

    Inexpensive expansion x

    Simple Disk Backup Falls Short on Promise

    The first application of disk for backup was the simplest possible: replace tape as a medium and bring the strengths of disk to bear on the problems that weaken tape's effectiveness in a data protection regime. The implementation is straightforward: instead of writing data serially through a tape driver and out to a tape device, write the data in tape format to disk folders and later write it to tape for longer-term storage. As a straight tape replacement, backup-to-disk (B2D) at first appears to offer several immediate advantages over traditional tape:

    - increased write and read performance, which dramatically shortens backup windows and verify times, so that backups can be done more often and verified within the backup window; moreover, restores can be accomplished much more rapidly

    - RAID levels offering superior data redundancy and fault tolerance with a much higher mean time between failures (MTBF), which reduces the risk of data loss

    - simplified configuration and management utilising familiar disk administrative tools

    However, since most B2D solutions simply write the backup instances and metadata to disk-based folders, disk-based backups are subject to file system limitations. The reliance on the file system adds another layer to backup media management (even though in this case it is the more familiar disk management layer), which makes disk backups more complex than they need to be. Capacity can rarely be added as inexpensively and on the fly as it can with tape, which simply requires more cartridges. With IT storage budgets under continual pressure, the flexibility, plunging cost, and relative commoditisation of disk have led to a proliferation of ad hoc and confusing disk backup configurations, as well as a rush to replace tape with disk without regard to the impact on data protection policies, backup integrity, or efficiency. In simple backup-to-disk configurations, traditional backup techniques such as generational backup sets and tape rotations have proved difficult to implement, and the desire for continued access to their benefits has forced a difficult choice: either continue to use tape or lose its benefits in the data protection scheme. Thus, in many cases, disk as a straight tape replacement has proved challenging, if not problematic, because integration, file size limits, format issues, and the distribution of data across media increase the complexity and limitations of reliable and efficient B2D solutions. As a result, reliance on tape has been prolonged.

    As the table above notes, simple B2D does not provide some key attributes of tape backups and, as a consequence, despite the many advantages of B2D, tape has remained an important component of the enterprise data protection regime. A summary of the pros and cons is given in Table 2 below. While it is evident that B2D delivers much of the promised increase in reliability, performance, and simplicity over tape, it does not sufficiently replace core tape features, and there is ample room for improvement on this simple implementation of disk-based data protection.

    Table 2 - Pros and Cons of Disk-based (B2D) Backup

    Pros                          Cons
    Shorter backup window         Separate backup catalogue
    Faster restore                Expensive media increments
    Increased media reliability   No generational media sets
    Data redundancy (RAID)        File system limitations on size
    Flexible targets              Limited portability of backup media
    Random access to data         Entire volumes containing multiple backups are vulnerable to virus infection
    Simple data copying

    In sum, while disk as a medium has great promise, effective disk backup solutions must leverage disk's advantages through new backup strategies and techniques, to ensure that new problems and weaknesses are not introduced and that most of the benefits of the technology disk is replacing are not lost.

    Virtualising Tape

    As we have noted, simply replacing tape as the backup medium is far from optimal for three main reasons: it does not leverage all of disk's qualities, it does not minimise the risks from disk's weaknesses, and it does not replace several core benefits of tape as a medium, such as generational backup sets and media rotations. As a result, although disk is rapidly eclipsing tape in the majority of data protection applications, more advanced uses of disk hold much more promise than simple B2D. Beyond merely replacing tape as a medium, the increased flexibility, simpler management, superior reliability, lower latency, and higher performance of disk allow new data protection techniques such as near-line, on-line, snapshot, and continuous data protection, as well as emulation of other media, to be integrated into the enterprise backup regime without compromising the level of data protection. These techniques take much greater advantage of the relative strengths of disk over tape and combine more easily with tape for those few characteristics where tape is still superior.

    One such promising technology which is quickly gaining acceptance is the virtual tape library (VTL). A VTL is essentially a disk-based file storage system with an interface that emulates a tape drive, so that it appears to a backup application as a physical tape library. A VTL is typically composed of a platform (essentially an operating system and server) and an application which presents a tape device interface to the operating system's tape drivers, so that backup applications can write to virtual tapes just as they do to physical tapes. VTLs combine many of the characteristics and advantages of both tape and disk listed in Table 1 above. Among them are speed, manageability, reliability, scalability, quick backup, verification, and restore, and support for generational or save-set backups and media rotation. The typical VTL offers significantly reduced media management, easy setup, data/image mobility, and substantial performance gains.
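
    To make the idea of tape emulation concrete, the following is a minimal sketch of a disk-backed "cartridge" that exposes tape-like sequential semantics. It is an illustration only, not how any particular VTL product is built: the class name VirtualTape, its methods, and the length-prefixed record layout are all assumptions made for this example.

```python
import io
import struct


class VirtualTape:
    """Hypothetical file-backed cartridge with tape-like sequential access."""

    def __init__(self, backing):
        # Any seekable binary file object can stand in for the disk storage.
        self._f = backing

    def rewind(self):
        # Position the "head" at the beginning of the tape.
        self._f.seek(0)

    def write_record(self, payload: bytes):
        # Tape semantics: writing at a position discards everything after it.
        self._f.truncate(self._f.tell())
        # Assumed layout: 4-byte big-endian length, then the payload.
        self._f.write(struct.pack(">I", len(payload)))
        self._f.write(payload)

    def read_record(self):
        # Read the next record, or return None at the end of recorded data.
        header = self._f.read(4)
        if len(header) < 4:
            return None
        (length,) = struct.unpack(">I", header)
        return self._f.read(length)


# Usage: a backup application writes two backup sets, rewinds, and reads back.
tape = VirtualTape(io.BytesIO())
tape.write_record(b"backup-set-1")
tape.write_record(b"backup-set-2")
tape.rewind()
records = []
while (rec := tape.read_record()) is not None:
    records.append(rec)
```

    The caller sees only rewind, write, and read operations, as it would with a physical drive, while the data actually lives in an ordinary disk-backed file object.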

    However, VTLs, like any complex technology, are vulnerable to trade-offs made in their design. Most VTLs are designed by vendors other than backup application vendors and are loosely coupled with backup applications. It is this lack of integration that becomes problematic.

    The VTL: Great Innovation, Poor Execution

    Conventional VTLs are third-party appliances which are loosely coupled to a backup application, since they are accessed through tape drivers in the same fashion as physical tape devices. This loose coupling has a number of significant disadvantages, key among them separate catalogues, high cost of ownership, and often a serious I/O bottleneck.

    The first critical drawback of an external VTL is the lack of integration with the backup application: VTLs are often restricted to operating as a particular device type such as LTO, and they must emulate physical tape operations in order to maintain their masquerade. In order to use disk, which is written and read differently from the tape they are emulating, VTLs create an additional, transparent data path in the disk subsystem that is not visible from the backup application. Because of this additional data path, most external VTLs must maintain a separate catalogue, invisible to the backup application, recording the location of the data. This separate catalogue must be searched during restore and verify operations before data can be located, staged, and found in the correct place by the backup catalogue. In a typical VTL this compound catalogue is not integrated or cross-indexed with the catalogue of the backup application and can therefore slow down the restore process significantly.
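
    The two-catalogue restore path described above can be sketched as a pair of lookups. This is a deliberately simplified illustration: the dictionaries, the file name, the tape label, and the disk path are all hypothetical, and real catalogues are databases rather than in-memory maps.

```python
# Hypothetical catalogues; every name and path here is illustrative only.

# Backup application catalogue: file -> (virtual tape label, record number).
backup_catalogue = {"payroll.db": ("VTAPE01", 3)}

# The VTL's private catalogue: (tape label, record number) -> disk location.
vtl_catalogue = {("VTAPE01", 3): "/vtl/pool2/blk_0917"}


def restore_external(filename):
    """Two-step restore through an external VTL."""
    tape_ref = backup_catalogue[filename]  # step 1: backup application lookup
    return vtl_catalogue[tape_ref]         # step 2: VTL-internal lookup


# A single integrated catalogue maps the file straight to its disk location.
integrated_catalogue = {"payroll.db": "/vtl/pool2/blk_0917"}


def restore_integrated(filename):
    """One-step restore when the catalogue is shared."""
    return integrated_catalogue[filename]


external_location = restore_external("payroll.db")
integrated_location = restore_integrated("payroll.db")
```

    Both paths reach the same data, but the external design needs a second search in a catalogue the backup application cannot see or index.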

    A second disadvantage of VTL appliances is that they are built on separate platforms and restrict choices in hardware and software options, thus increasing the cost of expansion and preventing the use of existing capacity. Table 3 below shows the substantial increase in cost of ownership of an external VTL.


    The third significant disadvantage is a negative impact on backup and restore performance. Modern backup architectures for large data sets are typically three-tier, utilising multiple output devices and parallel data streams. Typical VTL appliances are built on small server platforms and isolate the VTL function and I/O on one device, even though the backup application and source data might be distributed over many servers. This reliance on a single device creates a sizeable bottleneck for these data streams during backup and restore operations. Figure 1 shows both the I/O bottleneck created by isolating the entire VTL on one device and the reliance on proprietary disk, which increases costs. For small and medium-sized enterprises with substantial data sets, this can significantly throttle I/O traffic.

    Figure 1 - External VTL Appliance has IO bottlenecks and expensive proprietary disk
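
    The bottleneck argument can be illustrated with simple arithmetic. The sketch below assumes a crude model in which each device can absorb at most a fixed rate and parallel streams simply add up; the stream rates and the 120 MB/s device limit are invented figures for illustration, not measurements from any product.

```python
def aggregate_throughput(assignments, device_limit_mb_s):
    """Total throughput when each device caps the streams assigned to it.

    assignments: one list of stream rates (MB/s) per target device.
    """
    return sum(min(sum(streams), device_limit_mb_s) for streams in assignments)


# Four parallel backup streams of 60 MB/s each (illustrative figures).
streams = [60, 60, 60, 60]

# External appliance: every stream funnels through one device.
single_device = aggregate_throughput([streams], device_limit_mb_s=120)

# Distributed targets: the same streams spread over two devices.
two_devices = aggregate_throughput([[60, 60], [60, 60]], device_limit_mb_s=120)
```

    Under these assumptions the single device delivers 120 MB/s while two devices deliver 240 MB/s, which is the effect the figure describes: adding source streams does not help once the one target device is saturated.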

    In summary, external VTLs have key drawbacks that cause them to fall short of delivering on the full promise of backup to disk, as summarised below. The lack of backup application integration reduces performance on restores, hardware restrictions increase the total cost of ownership (TCO), and concentrating the implementation in one device on the edge of the backup architecture creates an I/O bottleneck. These drawbacks materially decrease the potential advantage that this combination of tape policies with a disk-based implementation could provide. But there is an answer, and it lies in eliminating the loose coupling of the VTL with the backup application. It is important to note that the limitations above, even the ones based on disk characteristics, can be designed out of a VTL implementation, but this requires more significant integration with the backup application and architecture than most third-party VTLs provide.

    Summary of Drawbacks of External VTLs

    - Separate catalogues lengthen restore times since two catalogues must be managed

    - File system limitations limit backup choices and flexibility

    - Restrictions on storage increase TCO and narrow choices

    - Inability to use cost-effective JBOD instead of expensive bundled storage

    - Existing storage capacity cannot be utilised

    - All I/O must go through the VTL hardware, creating a significant potential bottleneck and reducing the effectiveness of three-tier backup architectures

    VTLs Done Right

    A VTL serves only one application: the backup application. So it is counter-productive to encapsulate the function in one device and place it at one point in the network, ignoring the principles of modern three-tier backup architecture. A more elegant design embeds the VTL in the backup application. The advantages of such an architecture are numerous, and they resolve the critical issues with external VTL implementations discussed previously.

    First, an embedded VTL shares the backup catalogue with the backup application, eliminating clumsy and time-consuming two-step restore operations. Second, by embedding the VTL in the backup application, the VTL has access to all resources and devices of the backup application, and this allows the VTL to utilise multiple data streams and multiple devices. This provides the scalability promised by a large-scale three-tier backup architecture and does not force the backup data flow into a bottleneck. Third, embedding the VTL in the backup application means that heterogeneous disk from multiple platforms can be used to build the VTL, allowing its composition from existing unused capacity as well as cost-competitive disk. Figure 2 provides an overall picture of an embedded VTL's advantages.

    Figure 2 - Embedded VTL with balanced IO utilising heterogeneous disk and existing capacity

    The cost savings with an embedded VTL are substantial, as evidenced by the TCO comparison in Table 3 below.


    Table 3 - High-cost Disk Backup with VTL Appliances

    VTL Appliances are expensive to purchase and expand

                                                       VTL Appliance    Embedded VTL
    Initial Purchase (5TB VTL + Backup Application)    $53,000          $17,273
    Relative cost comparison                           3x               -
    Expansion Cost per TB                              $4,800-13,000    $400-600
    Relative cost comparison                           15-20x           -
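
    As a worked example of how the figures in Table 3 combine, the sketch below estimates the cost of growing each option from the initial 5 TB to 10 TB. The prices are taken from the table; the 10 TB target and the assumption that expansion cost scales linearly per terabyte are illustrative choices made here, not claims from the table.

```python
def total_cost(initial, per_tb_low, per_tb_high, target_tb, included_tb=5):
    """Return the (low, high) total cost of reaching target_tb of capacity.

    Assumes linear per-TB expansion pricing beyond the capacity included
    in the initial purchase (a simplification made for this example).
    """
    extra_tb = max(target_tb - included_tb, 0)
    return initial + extra_tb * per_tb_low, initial + extra_tb * per_tb_high


# Prices from Table 3; the 10 TB target is a hypothetical scenario.
appliance_cost = total_cost(53_000, 4_800, 13_000, target_tb=10)
embedded_cost = total_cost(17_273, 400, 600, target_tb=10)

# Ratio of the initial purchase prices, rounded to one decimal place.
initial_ratio = round(53_000 / 17_273, 1)
```

    Under these assumptions the appliance costs between $77,000 and $118,000 at 10 TB, against $19,273 to $20,273 for the embedded VTL, and the initial purchase ratio of about 3.1 matches the 3x shown in the table.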

    What to look for in a VTL

    - single catalogue integrated with backup application for one-step restores

    - hardware and platform independence allowing economical expansion

    - heterogeneous storage support to allow integration of existing capacity

    - flexible targets allowing other media to be easily integrated

    - policy-based copy to other media or VTL to automate data protection options

    - high-level integration at the backup application level for three-tier architecture support and performance

    Summary

    In summary, disk-based backup has tremendous potential that goes far beyond simple replacement of tape as a medium. Virtual tape libraries (VTLs) are potentially one of the most comprehensive and flexible applications of disk-based backup available because they can combine the advantages of disk technology with traditional tape policies, the flexibility of multiple media targets, and integration with and utilisation of existing backup infrastructures. However, many third-party implementations are built on proprietary server platforms which isolate the VTL from the backup application, effectively limiting the potential benefits and gains from VTL technology. The ultimate solution is a VTL embedded in the backup application, which brings the following benefits: 1) the multi-processing and multi-streaming of a modern three-tier backup infrastructure, 2) an integrated and uniform backup catalogue, 3) utilisation of existing disk capacity across heterogeneous systems, and 4) the ability to use general-purpose disk instead of proprietary VTL disk. The embedded VTL provides the best of the benefits of both disk and tape media, ensuring scalable, comprehensive, and reliable data protection.
