Backup Book

  • 8/11/2019 Backup Book

    1/30

    HOW TO CREATE A NIMBLE BACKUP ARCHITECTURE FOR YOUR DATA CENTER

    OR, TEACHING YOUR TAPE SYSTEMS TO DANCE WITHOUT STOPPING OFF FOR A SHOE SHINE

    The Backup Book: Disaster Recovery from Desktop to Data Center

    COPYRIGHT INFORMATION

    Copyright 2003 by Network Frontiers, LLC

    All rights reserved.

    No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the author.

    All brand names and product names mentioned in this book are trademarks or registered trademarks of their respective companies.

    Schaser-Vartan Books, US Offices: Feedback: [email protected] West Dayflower Path, Lecanto, FL 34461 http://www.backupbook.com

    SAN # 255-2582

    ISBN 0-9729039-0-9

    Library of Congress Catalog Card Number: 2001274299

    For more information on Quantum's tape libraries or DX30 Enhanced Backup Solutions, visit them at:

    http://www.quantum.com/storagesolutions

    or call 1 (866) 827-1500 or 1 (410) 421-8999, ext. 14 to speak to a Quantum Govt. specialist.

    Courtesy of

    CHAPTER ADDENDUM: HOW TO CREATE A NIMBLE BACKUP ARCHITECTURE FOR YOUR DATA CENTER

    (OR, TEACHING YOUR TAPE SYSTEMS TO DANCE WITHOUT STOPPING OFF FOR A SHOE SHINE)

    Can you create a nimble backup architecture for your data center? Sure, you can. If you design your architecture right, not only will it be able to dance, you'll also overcome those data under-run problems that necessitate shoe shining!

    During my tenure as the CIO of True North Communications (one of the largest ad agencies in the world, and now Interpublic Group), I was called into the office of Bruce Mason, our CEO at the time. "I've got a present for you; open it up," said Bruce, pointing to a small box on his coffee table. It was a pen. A nice pen, but a pen? "It'll make you look more professional in the MEC (the company's internal board) meetings in front of the other CEOs," he smiled. For a second I was dumbfounded, and then I started to laugh. "You mean because I've been bringing in pads of paper and a bunch of good ol' number-two pencils?" I asked. "Yep, you're all grown up now, so you can put them away." I thanked him, took the pen, and laughed all the way to my office.

    I had been going into our CEO management meetings laden with pencils because I was drafting an outline for our new data center's backup and disaster recovery plan. Each time the CEOs in the meetings decided that the company should go in a different direction (about two or three times per meeting, with many meetings over the course of a couple of months), I had to erase part of my plan and overwrite it. I wasn't going to put anything in ink until I knew what the final plan was. As it turned out, even when everything was decided, and our data center was being rebuilt and finalized, the company's data needs changed in rapid order and we had to shift our priorities again (and again and again).

    In the past, not a lot of us have considered data centers and backup architectures to be particularly nimble and easily changed. But with the storage networking technology that you'll read about here and some smart choices on your part, you can create a hardware plan that can be modified and reconfigured as quickly as your board's most capricious whims.

    The first step toward finalizing your data center's backup architecture is making a wise media choice. The amount of data you need to back up divided by your backup window length (usually eight hours) will clue you in to the throughput performance you'll need in your tape format. The amount of data will also determine the number of tape cartridges necessary for each backup. Throughput and tape count are two key design items in your plan.
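The division described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the conversion of 1 TB to 1,024 GB is an assumption chosen because it matches the tables later in this chapter.

```python
def required_throughput_gb_per_hr(data_tb, window_hours=8.0):
    """Aggregate native throughput (GB/hr) needed to finish inside the window.

    Assumes 1 TB = 1,024 GB; the eight-hour default is the usual backup
    window mentioned in the text.
    """
    if window_hours <= 0:
        raise ValueError("backup window must be positive")
    return data_tb * 1024 / window_hours


# Example: 10 TB that must be on tape within an eight-hour window.
needed = required_throughput_gb_per_hr(10)
print(f"Required aggregate throughput: {needed:,.0f} GB/hr")
```

The result is an aggregate figure; dividing it by the per-drive throughput of a given tape format gives a first estimate of how many drives the library needs.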

    Once you know your throughput and cartridge count, you can make decisions about the numbers of drives and the library cartridge capacity you'll need to employ. You'll need to decide whether you want to build a modular, stackable system that can be expanded and upgraded occasionally, or whether you should consolidate your backup efforts and jump straight to a cabinet library that you can partition as necessary.

    Your final step is to round out your design by choosing a few backup options. Will you choose to back up your NAS filers directly to your tape library over the GbE network using NDMP? What about leveraging an application's XCOPY (SCSI eXtended Copy) over Fibre Channel features? How will you overcome the problems of shoe shining and interleaving? Each of these questions is asked and answered once you've made your first two choices, because those choices will set the boundaries of what you can and can't do within your backup plan and backup architecture.

    Creating a nimble backup architecture for your data center

    FIRST STEP: CHOOSE YOUR SHOES CAREFULLY

    If you're going to standardize on anything in your data center's backup plan, it should be the tape format for your backup media. Choosing a tape format is like choosing a pair of shoes for your dance: Birkenstocks are great for swaying to the Grateful Dead, but wouldn't do well in a tango. Do some research before making your decision. A thorough understanding of the market and the technologies is vital for mapping a strategy to handle your company's tape storage. Your choice of one technology over another should be based on up-front price, ongoing costs (TCO stuff), speed (both backup and restore), reliability, and capacity.

    There are two basic types of tape recording formats: linear and helical scan. Let's face it: in the market, linear tape formats have won (we'll show you some charts in a page or two to prove our point), so let's just stick with linear tapes for this discussion. Linear tape technology uses a recording method in which data tracks are written in a linear pattern on the tape. The first set of tracks is recorded in parallel over the entire length of the tape. The recording heads are then repositioned and the tape direction is reversed to record the next set of tracks, again across the entire length of the tape, repeating this process in a serpentine fashion (Remember The In-Laws? The original one? Peter Falk and Alan Arkin are beyond compare!) until all data has been recorded. The major linear formats for data recording are DLTtape (normal, super, and DLT1), LTO (Ultrium), and Travan (a great entry-level tape system).

    With that said, here's the scoop: This is a data center we're talking about, not your home-brew backup plan. You should either be looking at the DLTtape family or the LTO family of tape systems. If you want to read about the others, pick up a copy of our most recent book, The Backup Book [1], to get the skinny on all of them, because the purview of the book is much broader than what you're reading here. Here, we're talking about high-speed data centers, and therefore, half-inch tape formats and tape automation libraries.

    1. The Backup Book: Disaster Recovery from Desktop to Data Center, by Dorian J. Cougias, E. L. Heiberger, and Karsten Koop. S-V Books, 2003.

    Why LTO or DLTtape families?

    First, when planning for your data center, always try to plan for the winner in the market share race. The last thing you want is to invest in a dead-end technology. While the helical scan markets comprise a very solid 36 percent of the tape market today, the linear DLTtape and LTO formats command a whopping 63 percent of the market.

    Figure Addendum-1. Worldwide market share in 2002

    And in a market the size of $2.1 billion in factory revenue, 63 percent represents a huge number. With market share clearly pointing to the DLTtape and LTO families, how do you choose between them? Let's take a look at a couple of info-tables to bring to light some of the choices you can make. Below are the basic differences between the leapfrogging LTO and SDLT tape formats. DLT was the de facto standard for high-speed backups. LTO emerged and then SDLT leapfrogged over it. Then SDLT was followed by LTO-2, and now SDLT 320 and 600.

    Here's the 411 on the basics of both DLTtape and LTO formats.

                                 LTO-1          SDLT 320       LTO-2          SDLT 600
    Media Type                   Ultrium 1      SDLT-1         Ultrium 2      SDLT-2
    Servo Method                 Pre-Format*    Backside       Pre-Format*    Backside
                                 Magnetic       Optical        Magnetic       Optical
    Cartridge Capacity (GB)      100            160            200            300
    Transfer Rate (MB/s)         15-16          16             30-35          34
    Max. GB/hr. (Native)         52-56          56             105-123        119

    Table Addendum-1. Basic differences between LTO and SDLT

    [Figure Addendum-1 data: LTO/DLT/SDLT 63%; 4mm Helical Scan 24%; 8mm Helical Scan 12%; 3480/90/3590 and 9840/9940 1%. Source: Gartner DataQuest 2003]

    What you should notice in Table Addendum-1 is that the number of recording tracks continues to increase with each iteration of the format's progression. This is one of the main reasons behind the capacity increases while the tape cartridge's physical size remains the same. While the media costs rise, the cost per gigabyte actually decreases, because the density ratios allow you to put more gigabytes in the same form factor. And as the density ratios increase, so do the throughput speeds, which brings us to our second table, wherein you'll totally ignore everything we showed you above because you're only wondering one thing: How fast do they go?

                                             LTO-1           SDLT 320        LTO-2           SDLT 600
    Back-Write Compatibility                 N/A             SDLT220         LTO-1           N/A
    Back-Read Compatibility                  N/A             DLT4k,7k,8k,    LTO-1           SDLT220, 320,
                                                             DLT-1 / VS80                    VS160
    Recording Tracks                         384             448             512             720
    Recording Channels                       8               8               8               16
    Power Consumption (write / streaming)    28 watts        27 watts        28 watts        27 watts
    MTBF (@100% Tape Motion)                 250,000 hrs     250,000 hrs     250,000 hrs     250,000 hrs
    Head Life (Tape Motion Hrs)              30,000 hrs      30,000 hrs      60,000 hrs      30,000 hrs
    Media Durability (end-to-end passes)     20,000          20,000          20,000          20,000
    Archival Storage Life                    30 years        30 years        30 years        30 years
    Media Cost (Street @ volume)             $69             $99             $115            $149
    Media Cost / GB                          $0.69           $0.62           $0.58           $0.50

    * You can't bulk erase this format

    Table Addendum-1 (continued). Basic differences between LTO and SDLT
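The falling cost per gigabyte can be checked directly from the media prices and native capacities in Table Addendum-1. The sketch below is illustrative only: the dictionary and function names are hypothetical, and half-up rounding to whole cents is an assumption that happens to reproduce the table's Media Cost / GB row.

```python
from decimal import Decimal, ROUND_HALF_UP

# (street media cost in USD, native capacity in GB), copied from Table Addendum-1
MEDIA = {
    "LTO-1": (69, 100),
    "SDLT 320": (99, 160),
    "LTO-2": (115, 200),
    "SDLT 600": (149, 300),
}


def cost_per_gb(price_usd, capacity_gb):
    """Media cost per native gigabyte, rounded half-up to whole cents."""
    raw = Decimal(price_usd) / Decimal(capacity_gb)
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for name, (price, capacity) in MEDIA.items():
    print(f"{name:<9} ${price:>3} per cartridge -> ${cost_per_gb(price, capacity)}/GB")
```

Decimal arithmetic is used here rather than floats so that a value such as 115 / 200 rounds to $0.58 instead of drifting to $0.57 through binary floating-point representation.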

                                           LTO-1      SDLT 320    LTO-2*     SDLT 600
    Native Throughput (MB/sec.)            15         16          30         33
    Hourly Native Throughput (GB/hr.)      53         56          105        116
    4 Drive Native Throughput (GB/hr.)     211        225         422        464
    8 Drive Native Throughput (GB/hr.)     422        450         844        928
    12 Drive Native Throughput (GB/hr.)    633        675         1,266      1,392
    16 Drive Native Throughput (GB/hr.)    844        900         1,688      1,856
    20 Drive Native Throughput (GB/hr.)    1,055      1,125       2,109      2,320

    * Assumes HP LTO-2 (30 MB/s). IBM rated slightly higher (35 MB/s).

    Table Addendum-2. Basic throughput ratios

    The first important thing about tape speeds is the native throughput potential. In other words, how many gigabytes per hour of data can you push onto your tape drives? The answer depends upon your format, the number of drive mechanisms in your autoloader or library, and the compressibility of your data (SDLT and LTO both have built-in compression). Because there's no way in the universe we could figure out a constant for compression ratios, we'll show the raw numbers in native, non-compressed formats in Table Addendum-2, breaking them down into common drive numbers found in most autoloaders and tape libraries.
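The figures in Table Addendum-2 follow from the native MB/s rating of each format. A minimal sketch, assuming 1 GB = 1,024 MB (an inference from the table values, not something the text states) and using hypothetical names:

```python
# Native transfer rates in MB/s, copied from Table Addendum-2
# (LTO-2 uses the 30 MB/s HP rating noted under the table).
RATES_MB_S = {"LTO-1": 15, "SDLT 320": 16, "LTO-2": 30, "SDLT 600": 33}


def gb_per_hour(mb_per_s, drives=1):
    """Native GB/hr for a number of drives streaming in parallel.

    Assumes 1 GB = 1,024 MB and that every drive sustains its rated speed.
    """
    return mb_per_s * 3600 / 1024 * drives


print("drives  " + "  ".join(f"{name:>8}" for name in RATES_MB_S))
for drives in (1, 4, 8, 12, 16, 20):
    cells = "  ".join(f"{gb_per_hour(rate, drives):8,.0f}" for rate in RATES_MB_S.values())
    print(f"{drives:>6}  {cells}")
```

These are best-case streaming numbers. Compression, slow clients, and network bottlenecks will move real results in either direction.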

    Once you are done examining Table Addendum-2, the table you'll care about for planning purposes is Table Addendum-3, because it shows how much data you can move through in a night's normal eight-hour backup window. Most organizations are forced to run backups when the servers being backed up aren't in use, or at least are being used less than normal. This usually means that the data center's backup routines begin when normal folks go to bed. And since most data centers today measure the amount of data being backed up in terabytes instead of gigabytes, we show the number of terabytes that can be backed up during that eight-hour window.

    Last but not least, let's go over the number of cartridges that you'll need to hold your data (see Table Addendum-4). The amount of data that you have and the number of cartridges it takes to store that data comprise one of the most important factors to consider when you make your autoloader or library purchase decision. If you've passed the 100-cartridge mark, after factoring in future growth, you're a candidate for multiple stacked libraries or a cabinet library system.

                                                LTO-1     SDLT 320    LTO-2*    SDLT 600
    4 Drive / 8 Hour Native Throughput (TB)     1.65      1.76        3.30      3.63
    8 Drive / 8 Hour Native Throughput (TB)     3.30      3.52        6.59      7.25
    12 Drive / 8 Hour Native Throughput (TB)    4.94      5.27        9.89      10.88
    16 Drive / 8 Hour Native Throughput (TB)    6.59      7.03        13.18     14.50
    20 Drive / 8 Hour Native Throughput (TB)    8.24      8.79        16.48     18.13

    * Assumes HP LTO-2 (30 MB/s). IBM rated slightly higher (35 MB/s).

    Table Addendum-3. Throughput in TB during eight-hour backup window

    Let's wrap this up with this thought. For simplicity's sake, let's say that you have 10 TB of data that you have to move through your system and onto backup tapes during your nightly eight-hour backup window. This means that if you choose:

    LTO-1, you'll need 25 drives and about 103 cartridges.

    SDLT 320, you'll need 23 drives and 64 cartridges.

    LTO-2, you'll need 13 drives and 53 cartridges.

    SDLT 600, you'll need 12 drives and 35 cartridges.

    As you can see, the faster and denser the tapes become, the fewer drives and fewer cartridges you'll need for your backup operations.
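The drive and cartridge counts above can be approximated by rounding up two simple divisions. The sketch below is an illustration under stated assumptions: the names are hypothetical, 1 TB = 1,024 GB and 1 GB = 1,024 MB are assumed, and no headroom is added. With those assumptions it reproduces every drive count in the list; for LTO-2 cartridges the plain ceiling gives 52 where the text lists 53, so treat the output as an estimate rather than a replacement for the published figures.

```python
import math

# name: (native MB/s, native cartridge capacity in GB), from the tables above
FORMATS = {
    "LTO-1": (15, 100),
    "SDLT 320": (16, 160),
    "LTO-2": (30, 200),
    "SDLT 600": (33, 300),
}


def drives_needed(data_tb, mb_per_s, window_hours=8):
    """Smallest number of drives that can stream data_tb within the window."""
    gb_per_drive = mb_per_s * 3600 / 1024 * window_hours
    return math.ceil(data_tb * 1024 / gb_per_drive)


def cartridges_needed(data_tb, capacity_gb):
    """Smallest number of cartridges that hold data_tb at native capacity."""
    return math.ceil(data_tb * 1024 / capacity_gb)


for name, (rate, capacity) in FORMATS.items():
    print(f"{name:<9} {drives_needed(10, rate):>3} drives, "
          f"{cartridges_needed(10, capacity):>4} cartridges")
```

Changing the first argument (for example to 30 or 40) gives a quick feel for how the same formats behave at enterprise data-center volumes.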

    Which dance steps are for you?

    Once you know the basics about your media needs, it's time to take your first few dance steps to see how nimble your planning can become. If data center backup operations were simply a matter of moving all of your data onto backup tapes in one fell swoop, you'd have this licked and probably wouldn't be reading this paper right now. Lucky for me as a writer, the world isn't that simple (it gives me a raison d'être, Lebenszweck, scopo da vita: a life purpose). Not only must you design your backup architecture to move from high to massive quantities of data, you must design it to be versatile, reliable, and scalable, and those qualities are rooted in the hardware choices you make at this stage of the game.

    Two types of tape systems are employed in today's data centers:

    Modular libraries that hold 20-100 tapes and have one or more drives. These modules can be stacked and arranged like building blocks; and

    Cabinet libraries that hold multiple drives (5-20) and as many tapes as you might expect in something named "library," and are extensible in ways that modular libraries are not.

                            LTO-1    SDLT 320    LTO-2    SDLT 600
    10 Terabyte (Native)    103      64          53       35
    20 Terabyte (Native)    205      128         103      69
    30 Terabyte (Native)    308      192         154      103
    40 Terabyte (Native)    410      256         205      137
    50 Terabyte (Native)    512      320         256      171

    Table Addendum-4. The cartridge count per Terabyte

    Modular libraries

    Below is Quantum's M2500 modular library (the largest in their series of modular libraries). Each modular library has capacity for multiple tapes as well as for multiple drive mechanisms (some lower-end autoloaders have only a single drive mechanism). The series begins with the M1500 (two drives and up to 25 tapes), progresses to the M1800 (four drives and 50 tapes) and ends with this one.

    Figure Addendum-2. Quantum M2500 tape library

    The M2500 modular library shown above can hold up to six drives and up to 84 DLTtapes or 100 LTO tapes, giving it a native capacity of up to 20 TB. And of course, since it's modular, you can mix and match any of these systems to form a much larger one. We'll cover that mix-and-match concept in a few pages.

    Cabinet libraries

    Cabinet libraries, as their name implies, offer massive capacity solutions for tape backups. These are the enterprise systems for storage, and are quite powerful tools. Quantum's PX720, shown below, is one such tape library.

    Figure Addendum-3. Quantum PX720 tape library

    Cabinet libraries are employed when you need the greatest density of storage per cubic foot of storage space. These libraries belong in the data centers of the world, and are thus optimized for data center usage. The library shown above can hold up to 20 SDLT or LTO tape drive mechanisms and 732 tapes, effectively giving it 200 TB of (native) storage capacity. And, with the ability to connect at least five of these together, that's one heck of a lot of data.

    Building scalability, flexibility, and reliability into your backup architecture

    When picking out a tape system for your organization, keep in mind three factors:

    1. Flexibility

    2. Modularity and scalability

    3. Density

    There's no such thing as an organization that creates less and less data each month. And you know that as soon as you make up your mind about what you want to put into the system, the organization will find new ways to make more data; therefore, your storage needs will grow. The only thing that's not going to grow is the square footage in your data center. So let's start with density. Look for the highest volume-per-rack-unit you can find. In other words, based on the number of rack units your tape system uses, how many GB or TB of data can you store? If you're looking into a cabinet system, does the cabinet use only the back wall, or does it also use the left and right door space (who needs to look into a window in a tape cabinet, anyway)?

    Second, flexibility must involve both types of scalability: internal growth and link-ability (which also gives you additional reliability). Internal growth scalability means that the unit is large enough that you can add either drives or additional tapes to the system without having to purchase an entirely different system. The M2500, for example, allows the user to configure the system with one through six internal drive mechanisms, and can hold from one through 84 SDLT tapes and up to 100 LTO tapes. That's what we mean by internal growth.

    Third, by adding the stack-ability of a tape library system like the M2500, you can be assured that whatever money you spend on a library is a protected investment, not a wasted expense. Adding more modules to the first library (thus making it appear as a single unit to the backup software) means that the tape system can continue growing within the same rack while additional reliability is added to the system overall.

    In terms of reliability, with more than a single drive mechanism in any of the tape libraries, the backup planner is pretty much guaranteed that at least one of them will be working at all times. If any of the drive mechanisms fails, the other mechanisms will take over for it until it can be repaired. However, if the entire unit fails, or the robot in the unit fails, that's another story, and the reason you want to look at link-ability for scalability and reliability plans. Figure Addendum-4 shows the M2500 on the left, and two M1800s on the right. Each of the M1800s holds 42 SDLT tapes (half that of the M2500) and up to four tape drive mechanisms.

    By linking two M1800s together, the backup planner now has a total of eight drive mechanisms and the same number of tape slots (84) as the M2500.

    Since Quantum's M-Series allows additional units to be linked and therefore act as a single system, this enables expansion beyond the unit's physical limitation. It also offers greater reliability. Because all of the M-Series can be linked together, the backup planner can start with a single M1800 and then add more M1800s or additional M2500s until all 41 units in the rack are full. That could be three M2500s, one M2500 and three M1800s, or five M1800s; you get the picture.

    Flexibility sometimes means redesigning on-the-fly

    One of the things to ask about your modular tape library system is how flexible "flexible" is. When stacking systems, some libraries make you take the system down completely and then hard-wire everything together, essentially rebuilding and reconfiguring the system completely. Make sure that you employ a modular system that allows you to re-stack and rebuild while the unaffected units are online and running. This avoids downtime and gives your design true flexibility.

    Figure Addendum-4. One M2500 or two stacked and linked M1800s

    Flexibility should extend to cartridge loading

    Quantum uses the magazine load approach with their M-Series tape libraries. This greatly enhances speed of movement between systems if one unit fails. Within a minute or two, tapes can be moved from the failed module to the good module and given read and write access.

    If hot dogs come in packages of 10, why do hot dog buns come in packages of eight? Where do you put the extra two? Think about where you're putting another displaced entity: your cleaning cartridge. It really bothers the heck out of me when a library forces me to put the cleaning cartridge in one of the magazine slots. Someone always forgets to take it out of the slot before the magazine goes offsite. Therefore, look at how your modular library handles cleaning cartridges as one of the keys to flexibility.

    The other key to magazine flexibility is the ability to rotate it and use it on both sides of the library. I really hate libraries that have distinct left and right magazines. Who came up with that brilliant idea?

    And finally, flexibility in the cartridges and modularity also means this: If all the drives in one of the M-Series modules are busy, within seven seconds or so (give or take a few nanoseconds), the cartridge can be passed up or down the stack to a drive that's waiting for something to do.

    And flexibility should extend to your backup network's architecture

    Within today's data center network architectures, there are three traffic routes for moving data from the source to the backup server. The first method is to move the data over the same production network that the normal traffic flows through. The second method is to move the data through a specially designed GbE sub-network for backup purposes only. And the third method is to move the data over a Fibre Channel Storage Area Network (SAN).

    Of course, when adding backup servers over the normal network, you'll need to make sure that the backup servers are attached at the highest level of the network's backbone. You wouldn't want the servers off on a spur, forcing traffic to go through several hubs, switches, or even routers. Each time your backups begin going through routers and multiple switches, you begin adding time to the backup process because you're adding latency. Figure Addendum-5 shows one of our networks that has a phalanx of servers and four different backup servers with their direct attached DLTtape libraries.

    Figure Addendum-5. Backup servers on work LAN

    If you add four backup servers running simultaneously each night, make sure when you add them directly to the network that you add them on a switched backbone, or at least have a switched network segment that routes the traffic among the four individual backup servers and the computers they're simultaneously backing up. Think of it: If the backup servers were all on the same hub, they'd have to divide the network's total bandwidth by 4. That would defeat the purpose.
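To put a number on the hub penalty, here is a deliberately simplified sketch. The function name and the 100 Mbit/s link speed are hypothetical, and the model assumes a hub splits its bandwidth evenly while a switch gives every port full link speed, which ignores collisions, protocol overhead, and switch backplane limits.

```python
def per_server_share(link_mbit_s, servers, shared=True):
    """Rough bandwidth (Mbit/s) each backup server can count on.

    shared=True models a hub, where all servers divide one collision domain.
    shared=False models a switched segment with a dedicated path per port.
    """
    if servers < 1:
        raise ValueError("need at least one server")
    return link_mbit_s / servers if shared else link_mbit_s


# Hypothetical example: four backup servers on 100 Mbit/s Ethernet.
on_hub = per_server_share(100, 4, shared=True)
on_switch = per_server_share(100, 4, shared=False)
print(f"hub: {on_hub:.0f} Mbit/s each, switch: {on_switch:.0f} Mbit/s each")
```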

    Creating a GbE sub-network for backups

    The second method for running backups is to create a dedicated sub-network just for backups and management traffic. While this might be pretty hard to do for backing up servers strewn throughout corporate facilities, it's relatively easy in a data-center environment because of proximity and the open wiring architecture of data centers. In Figure Addendum-6 we segregated the backup servers onto their own switch and ran a second set of cables from each of the servers being backed up over to the backup sub-network (the dashed lines).

    [Figure Addendum-5 labels: PBX/Fax; GL/AR/AP Apps & DBs; HR/Scheduling Payroll Apps & DBs; EIS/Reporting Apps & DBs; DNS/Web; AD/DNS/Security Proxy; Print Server/RIP; LDAP/eMail; IM/SCM/POS Apps & DBs; NAS Boxes; Backup Servers 1-4; DLT Library A; DLT Library B]

    Figure Addendum-6. Backup servers on sub network for backup only

    [Figure Addendum-6 labels: PBX/Fax; GL/AR/AP Apps & DBs; HR/Scheduling Payroll Apps & DBs; EIS/Reporting Apps & DBs; DNS/Web; AD/DNS/Security Proxy; Print Server/RIP; LDAP/eMail; IM/SCM/POS Apps & DBs; NAS Boxes; Normal Network; Backup Network; Backup Servers 1-4; DLT Library A; DLT Library B]

    The additional costs are the gigabit Ethernet switch and each of the gigabit Ethernet cards that you must install into the servers being backed up.

    The benefits are many. The backups won't have to run over the standard production network, and therefore you won't have to worry about normal traffic interfering with the backup traffic. You can install the fastest network cards in each of the servers being backed up, and therefore improve the speed over the normal network speeds.

    Creating a SAN for your backups

    Want a backup system superior to a sub-network? If you use a Storage Area Network (SAN) in your data center's server farm, you could run the backups over the Fibre Channel SAN instead of a gigabit Ethernet. This provides the maximum amount of throughput, but is also the most complex method. As discussed previously, when multiple computers back up across the network to a single tape library, they transfer the data across that network to a single backup server that has connectivity to the tape device. On a SAN, each server can have equal access to the same tape device. Because each machine on the SAN directly communicates with the tape device, transfer speeds are equivalent to having a device locally attached to the server's PCI bus.

    In the diagram below (Figure Addendum-7), we show a small SAN with redundant fibre switches connecting each of the servers on the left with the storage farms on the right. Because the Quantum tape libraries and others like them can be connected to a SAN, and because most enterprise-level backup software can run backups over a SAN, this becomes a real option. We've kept the four backup servers in place because we need four restore servers running simultaneously in case of building loss, when we're faced with restoring everything in the data center as quickly as possible. Remember, each backup server can restore only one device at a time from tape.

    Figure Addendum-7. Storage Area Network for backups

    [Figure Addendum-7 labels: GL/AR/AP; HR/Scheduling; NAS; IM/SCM/POS; EIS/Reporting; DLT A; DLT B]

    Now, before you get all excited about the types of designs you've seen and what modular libraries can do for you, let me cool your heels for a second. These designs and modular libraries are great for very specific backup architectures. If you're protecting your department's server farm, pick and choose your tape format, library style, and backup architecture infrastructure, and you'll be well on your way to bliss. But if you're designing a backup plan for an organization-wide data center, a backup plan that has several distinct goals, you might want to hold off on your decision to build modular and instead consider cabinet libraries that are both large and agile.

    Being big doesn't mean you can't be agile

    Modular tape libraries like Quantum's M-Series are fantastic, packing a rack with tape density while ensuring flexibility in design. However, there comes a point when you need more tape drives in the library than a modular system can handle. And with most organizations today following the beck and call to consolidate anywhere possible, organizational data centers are being inundated with more and more equipment that had previously been strewn hither and yon. Add to that the regulatory requirements for maintaining X data sets over Y years (such as critical information for litigation support), and what you have is a data center with needs for short-term archival solutions, long-term backup solutions, direct attached storage, Network Attached Storage (NAS), and Storage Area Networks all rolled into one consolidated data center.

    In short, this means that you'll need vast quantities of backup storage. The folks at Gartner's Dataquest (July 2002) agree with me, noting the growth in sales for tape systems with fewer than 21 cartridges to be a measly 6 percent. Growth of sales for tape systems with 21-100 cartridges is expected to be 20 percent. And the ginormous size range of 101+ cartridge systems has a growth rate expected to be around 10 percent.

    In an enterprise data center, it's not that hard to have 30-40 TB of data that needs protecting. If you remember your math of a few pages ago, that means that you're working with a minimum of 200 tapes and, with a backup window of eight hours, about 50 drives. The PX720 that we talked about earlier has several drive configurations, with the ability to hold up to 20 drives and 732 tapes.

    If you're moving 70 to 80 TB of data through your consolidated enterprise data center, that means that you're dealing with roughly three cabinets full of drives and tape cartridges. One of the benefits of Quantum's PX-Series products is that you can scale up to five of the library chassis together, giving you up to 3,600 total cartridges and 100 total drives.

    Figure Addendum-8. Three PX720s that you might find in your data center

    While these systems may look daunting, the deftness and agility of their internal architecture is quite amazing.

    The Quantum PX-Series storage architecture

    The PX-Series offers three things you need when building a nimble data-center backup architecture: flexibility, reliability, and storage density.

    Their flexibility is delivered in the form of SCSI, 1 Gb and 2 Gb Fibre Channel, and IP storage interface connectivity and the range of drive options you can load into them.

    Reliability is achieved through their hot-swap drives and redundant power supplies and fans.

    And they offer the highest TB per square footage density in the market today.

    The PX-Series GbE library leverages GbE for high-speed data movement and the Network Data Management Protocol (NDMP) for control. Multiple Network Attached Storage (NAS) filers can simultaneously share an NDMP-aware tape library, significantly decreasing the backup window and consolidating backup storage to the same backup device.

    And speaking of canisters, each PX-Series library includes cableless, user-replaceable, hot-swap drives that don't interrupt any library operations, including data throughput on a single common SCSI bus. Additional available hot-swap components include fans and power supplies. Both power supplies and fans are standard with N, N+1, or 2N redundancy (see note 2). Should a power supply or fan fail, the extra component shares the load, eliminating the downtime that the failure would have otherwise caused. The downed component can then be replaced at a convenient time without powering down the library, even while it's performing backups and restores. Separate power inputs allow the library to be configured on separate power circuits so if power goes out on one line (circuit-breaker trip or failure, power cord accidentally disconnected, etc.), the library is unaffected.
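
    The three redundancy schemes named above can be compared with a small sketch. The function name and the framing as "units installed" versus "failures tolerated" are illustrative only, and the sketch assumes that any N working units are enough to carry the full load; Quantum's actual power and cooling design may differ.

```python
def redundancy(n_required, scheme):
    """Return (units_installed, failures_tolerated) for a redundancy scheme.

    Assumes any n_required healthy units can carry the whole load.
    """
    installed = {
        "N": n_required,        # no spare at all
        "N+1": n_required + 1,  # one spare unit
        "2N": 2 * n_required,   # every unit doubled
    }[scheme]
    return installed, installed - n_required

for scheme in ("N", "N+1", "2N"):
    print(scheme, redundancy(2, scheme))
```

    With two units required, N installs two and tolerates no failures, N+1 installs three and tolerates one, and 2N installs four and tolerates two.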

    Figure Addendum-9. One of the canisters in a PX720 Series

    The Prism Fibre Channel router is an industry first: an integrated Fibre Channel card with two 2 Gb Fibre Channel and four Ultra-2 SCSI ports for high performance. One of the benefits of moving to a Fibre Channel backup network is that the PX-Series systems support XCOPY for serverless backup operations through the use of a Fibre Channel router. The router is actually embedded within the library per se (see note 3). For those applications that are XCOPY aware, serverless backup will move the data directly from the disks used by the software to the tape system over the FC pipe, thus bypassing the backup server. By eliminating the backup server from the data path, XCOPY increases the availability of the servers' resources and reduces the backup window immensely.

    2. 2N is the doubling of a component, in case you couldn't figure that out.

    3. For more information about XCOPY, or the PX-Series in general, please reference http://www.theanswerisx.com.

    For VERITAS NetBackup environments, Port Failover offers another layer of redundancy for a PX-Series library running within a 2 Gb Fibre Channel Storage Area Network (SAN). Port Failover enables the third Fibre Channel port to be reserved as a special fail-over port. Therefore, if one of the other two Fibre Channel ports fails, its traffic will automatically fail over to the reserved port, preventing a disruption of the backup operation. In this scenario, the backup will continue unabated and an ALERT e-mail will be sent via the Prism Management Card to notify the administrator of the event.

    Which brings me to the last stop on the PX-Series architecture tour: Quantum's embedded Prism Management, which runs off the cabinet controller. This controller drives the Web-based management interface for the library system as well as the ALERT notification engine and the SNMP integration for the library. One of the Prism system's key features is its ability to remotely manage the partitioning of the servers' drive libraries, a must feature for consolidating heterogeneous environments like drive types, operating systems, and/or ISV applications. Because Quantum's Web-based library partitioning is integrated within the library's internal architecture (and therefore doesn't require an external server for setup), it's both cross-platform and cross-application compatible. As shown in the diagram below, each library can be partitioned into three segments, alleviating your need for new libraries as your needs change. In the last part of this paper, we'll show you why partitioning (and re-partitioning when necessary) is an essential element of a data center's backup operations.

    Web-based partitioning, individual canisters for ease of management, Fibre Channel and SCSI connectivity within the same system, or a system with direct GbE connections to each drive: that's what I call nimble. There is no more dexterous a Tape Library system than the PX-Series from Quantum. It seems that the only problems they can't solve are those related to the throughput of some of your slower legacy systems. But while Quantum can't speed up those old systems, they do have a unique way of working around the problems these slower units cause.

    Partitioning the library's drives into three segments or virtual libraries

    The last hurdle you'll need to overcome: tape shoe-shining

    Let's get back to where we started this discussion: your tapes. Today's tape drives' read/write heads are positioned precisely against the tape, which then streams past the heads. As the device writes to the tape, the read head is positioned so it verifies each frame of data after writing it. Because the tape continually streams past the tape drive heads, a constant source of information must flow to the tape drive to keep the tape streaming. When the drive runs out of information to write to the tape (1), it must stop the tape (2), which takes a foot or two of actual tape; rewind the tape to a point behind where it left off (3); stop again (4); and then start the tape stream again (5) so that it's up to speed by the time it reaches the last point on which it wrote data. The process looks like the back-and-forth motion of a shoe shine.

    Figure Addendum-10. Tape shoe-shine process
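
    To get a feel for how often the stop, rewind, and restart cycle described above can happen, here is a toy model in Python. Everything about it is an assumption made for illustration: the drive has a single fixed streaming rate, a buffer of buffer_mb fills while the drive repositions and drains while it writes, and repositioning takes reposition_s seconds. Real drives vary in buffering, firmware, and speed matching, so treat the numbers as illustrative only.

```python
def shoe_shine_cycles(feed_mbps, drive_mbps, buffer_mb, reposition_s, seconds):
    """Estimate how many stop/rewind/restart cycles occur in `seconds`.

    Toy model: the drive waits until the buffer is full (or until it has
    finished repositioning, whichever takes longer), then streams until
    the buffer runs dry, then has to reposition again.
    """
    if feed_mbps >= drive_mbps:
        return 0  # the source keeps up, so the tape never stops streaming
    fill_s = buffer_mb / feed_mbps                  # time to refill the buffer
    drain_s = buffer_mb / (drive_mbps - feed_mbps)  # time until it runs dry
    cycle_s = max(fill_s, reposition_s) + drain_s
    return int(seconds // cycle_s)

# A 10 MBps source feeding a 30 MBps drive through a 60 MB buffer for one hour
print(shoe_shine_cycles(10, 30, 60, 3, 3600))  # 400
# A source that matches the drive's streaming rate never under-runs
print(shoe_shine_cycles(30, 30, 60, 3, 3600))  # 0
```

    Under these assumed figures the drive repositions 400 times in an hour, which is the wear and lost time that interleaving and staging to disk are meant to avoid.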

    To overcome this problem of data under-runs, the backup software vendors invented a process called interleaving, also known as multiplexing, but we'll call it interleaving here. There are two basic methods: file interleaving and block interleaving. File interleaving writes file 1 from source 1 to the tape drive, buffering incoming files from other sources while this occurs. Once the file has been written to the tape, it then writes the next sequential incoming file. The diagram below shows three backup sources interleaving their files onto a single tape. The only difference between file interleaving and block interleaving? Block interleaving writes data to the tape in 32 K chunks (or in another fixed block size). Block interleaving is better than file interleaving in that if the tape encounters a very large file, it doesn't have to handle it all at once before accepting data from the next source. Instead, it simply splits the large file into multiple blocks, therefore accepting data more evenly from each of the incoming sources.
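
    The difference between the two methods is easier to see in code. The sketch below is a simplified illustration, not how any particular backup product lays out its media: it takes turns among the sources in simple round-robin order, and the function names, source names, and file sizes are made up. The 32 K block size comes from the text above.

```python
from collections import deque

BLOCK = 32 * 1024  # the 32 K block size mentioned in the text

def _round_robin(queues):
    """Take one item from each non-empty source queue in turn."""
    tape = []
    while any(queues.values()):
        for source, queue in queues.items():
            if queue:
                name, size = queue.popleft()
                tape.append((source, name, size))
    return tape

def file_interleave(sources):
    """Write one whole file per source per turn."""
    return _round_robin({s: deque(files) for s, files in sources.items()})

def block_interleave(sources, block=BLOCK):
    """Split every file into fixed-size blocks, then write one block per turn."""
    queues = {}
    for source, files in sources.items():
        queue = deque()
        for name, size in files:
            while size > 0:
                chunk = min(block, size)
                queue.append((name, chunk))
                size -= chunk
        queues[source] = queue
    return _round_robin(queues)

sources = {
    "NAS": [("big.db", 100 * 1024)],  # one large file
    "POS": [("a.txt", 10 * 1024)],    # one small file
}
print(file_interleave(sources))
print(block_interleave(sources))
```

    With file interleaving, the small POS file has to wait until all 100 K of the NAS file has been written. With block interleaving, it goes to tape right after the first 32 K block of the large file.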

    [Diagram: backup sources (NAS Boxes, IM/SCM/POS) interleaving their data (NAS 1, NAS 2, IM 1, IM 2, POS 1, POS 2) onto a single tape]

    Interleaving is great for speeding up backups because it ensures that the tape drive is always running at its maximum speed threshold. However, it's not so great for restorations because it spreads the same amount of data from a single source over a greater length of tape. During restoration, the tape must find the first chunk, read it, jump to the next chunk, read it, and so on. It's a nice way to optimize the backups (and cartridge capacity utilization), but it's a complete disaster for restores because it injects a vast amount of latency that must be taken into consideration.

    The biggest mistake in sizing a backup system is to look at backup requirements alone. The sole reason for doing backups is the ability to restore; therefore, the restore requirements are extremely important. You must ask yourself how much time is allowed to bring back the data during a restore. In a worst-case scenario test we ran, we used a high-performance HP/IBM LTO-2 tape drive (30 MBps native transfer speed) with 16 multiplexed clients, which means that each client stream has to supply only about 2 MBps. Provided the network infrastructure and backup server can handle the load, the LTO-2 will zoom along, streaming, and all clients will happily supply this data rate. Say it takes eight hours to complete all jobs.

    None of the backup packages on the market today can run multiplexed restores. They all schedule restores in a queue. Each job is matched against a tape, which is loaded into the next available tape drive, read, and demultiplexed. Only the stream that is relevant to the job is used; all other streams are discarded.

    When restoring all the data to all systems, each client gets its data back at a measly 1/16th of the backup rate, because only one in 16 blocks of data is part of its stream. Effective restore rate: about 2 MBps. Each tape must be read 16 times (once for each stream), so now it takes 16 × 8 = 128 hours (over five days!) to fully restore all systems. That's very bad.
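
    The arithmetic behind that worst case is short enough to write down. The sketch below simply restates the example's numbers; the function name is made up, and it assumes a single drive, restores queued one after another, and each restore pass reading the whole multiplexed tape set at full drive speed.

```python
def restore_estimate(drive_mbps, streams, backup_hours):
    """Return (per-client restore rate in MBps, hours to restore every client).

    Assumes one drive, restores run one at a time, and every pass reads the
    full multiplexed tape set while keeping only one stream.
    """
    per_client_mbps = drive_mbps / streams
    total_hours = streams * backup_hours
    return per_client_mbps, total_hours

rate, hours = restore_estimate(drive_mbps=30, streams=16, backup_hours=8)
print(rate, hours, round(hours / 24, 1))  # 1.875 128 5.3
```

    That is roughly 2 MBps per client and 128 hours, a little over five days, for a backup that took eight hours to write.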

    How do you avoid this problem? Your choices are to find a way to turn off interleaving (some applications won't let you do that, though), which leaves you stuck with shoe shining; to go back and redesign your entire data flow path (we cover that in The Backup Book), which can be daunting; or to solve the problem with a staged backup process.

    In a staged backup process, you simply run the first set of backups to disk. Disks don't shoe shine. Disks don't save data in interleaved formats even though they can accept multiplexed backups. Disks are a great intermediary for data-center backups because they're a very reliable, simple technology that's been around for quite some time. However, there's one small problem with backing up to disk: Most backup software of the data center caliber is set up for backing up to tape. Rearranging the software and changing the media-handling metaphors can be daunting at best. Leave it to the Quantum engineers to come up with an answer both technologically sound and brilliantly simple: virtual tapes. Quantum has created a system that's actually made up of a series of RAID-protected hard drives but to backup software looks and acts like a tape library with multiple tape drives.

    Virtual Tape Libraries

    The Quantum DX-Series of backup devices are Virtual Tape Library (VTL) products that emulate a tape library, but are in fact a RAID system of very fast ATA hard drives. The DX family provides high-speed disk-to-disk backups, alleviating the problem of tape shoe-shining and therefore allowing the administrator to configure up to six simultaneous multiplexed streams. And since the multiplexed streams are to disk, even though the DX systems read the backup file linearly, the multiplexed restores are 10 times faster than tape restores. In other words, it's as fast as it could be, and you never have to worry about restoration slowdowns caused by multiplexing (while gaining the shorter backup window advantages of multiplexing).

    Figure Addendum-11. DX30 Virtual Tape Library

    What the backup software sees is what it would expect to see from a six-drive tape library. In Figure Addendum-12 on page 25 we show a screen from VERITAS NetBackup running a backup to a 30-tape robot. That 30-tape robot is actually the DX30. To VERITAS NetBackup, it's a tape library with 30 tapes in it. (The VERITAS products, Legato's NetWorker, Computer Associates, and others currently support this drive set.)

    Figure Addendum-12. VERITAS NetBackup running a backup to a DX30

    Think of these DX systems as a two-part component set: a controller and the disk array(s). Utilizing software that resides on the controller, the DX-Series configures the disk arrays to look like a set of tapes and tape drives in a tape library. This allows zero configuration on the part of the backup software, as it now thinks that it's writing data to the DX30 in tape format and utilizing the standard media management schema (which is sized to fit the storage capacity of the specific DX system in use). The DX30 controller (1U in size) and disk array (3U in size, with 24 ATA drives in the array) are a combined unit and have all the features you'll find in a mid-tier or enterprise system: Fibre Channel interface, SNMP, SES, RAID 5 or 10 arrays for the drives, dual cooling and power, etc.

    The usable capacity of the first drive array in the DX30 is 3.2 TB (roughly equal to 16 LTO-1 or 10 SDLT 320 tapes, given compression ratios). Two additional arrays (of 4.3 TB each, because they can hold more data) can be added to the system to bring the total expanded storage capacity up to 12.4 TB.

    That's about where the comparison of a RAID 5 drive capacity of a disk array and the DX system virtual tape capacity stops, but remember, kids, this is a printed update to a book, and printed materials are frozen in time the moment the press slaps ink onto paper. By the time you read this, the capacity will probably have grown faster than the weeds in my back yard.

    For one thing, using ATA-based drives usually isn't a good thing for fast backups. When reading and writing the small chunks of data that incremental backups create, ATA drives are much slower than the speedy SCSI-based drives in high-end RAID arrays. However, ATA drives are lightning fast when they're used for sequential I/O large-block format writes. Because the DX family emulates a tape write rather than a drive write, the write's personality is changed to the large-block format that works great for the ATA drive base. A regular RAID array just can't do that, which is precisely why the Quantum controller technology in the DX-Series family is so important to the process.

    When a backup software application (or an operating system, for that matter) writes to a RAID stripe, it has no idea of the optimal method to write the data, so the stripe size can be mismatched with the size of the data block being written. This mismatching causes more CPU overhead and uses more space than is necessary. In the DX-Series, the controller running the tape emulation software is also tuned to match the disk array, so it can more precisely match the stripe sizes of the array and use fewer CPU processes during the operation. That's one smart controller!
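
    One way to picture the mismatch is to count how many writes cover a whole stripe and how many only touch part of one. The sketch below is a generic illustration, not the DX controller's algorithm: the 256 KiB stripe size, the write sizes, and the function name are all assumptions. On a parity RAID level such as RAID 5, a partial-stripe write generally forces the array to read existing data or parity before it can update parity, which is where extra overhead comes from.

```python
KIB = 1024

def stripe_write_profile(total_bytes, write_size, stripe_size):
    """Count (full_stripe_writes, partial_stripe_writes) for back-to-back writes."""
    full = partial = 0
    offset = 0
    while offset < total_bytes:
        end = min(offset + write_size, total_bytes)
        pos = offset
        # Walk through every stripe that this single write touches
        while pos < end:
            stripe_end = (pos // stripe_size + 1) * stripe_size
            seg_end = min(end, stripe_end)
            if pos % stripe_size == 0 and seg_end == stripe_end:
                full += 1     # the segment covers the stripe exactly
            else:
                partial += 1  # the segment covers only part of the stripe
            pos = seg_end
        offset = end
    return full, partial

# 1 MiB written in stripe-sized chunks versus in chunks a quarter of that size
print(stripe_write_profile(1024 * KIB, 256 * KIB, 256 * KIB))  # (4, 0)
print(stripe_write_profile(1024 * KIB, 64 * KIB, 256 * KIB))   # (0, 16)
```

    When the write size matches the stripe, every write is a full-stripe write. With the smaller write size, the same megabyte turns into 16 partial-stripe writes.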

    Where does the DX30 virtual tape library belong in backup network architecture?

    Because the DX family virtual tape systems write their data to hard drives, you still need to pair them with actual tape drives so that you can rotate your tapes offsite. Therefore, within your architecture, they belong as a staging point for data. Their high throughput ratios make them excellent primary targets for windowed nightly incremental backups as well as weekly backups of huge quantities of data because they can move that data from the source to the virtual tape system much faster than backing up to real tapes. Whether you're dealing with a backup window you have to live within, or moving massive amounts of weekly backups, speed is your best friend, and the DX-Series provides plenty of that. Once the backup window is closed, you can move your data off to real tape.

    In an enterprise data-center environment, pairing a DX30 virtual tape system with a Fibre Channel/SCSI-based PX720 tape library makes perfect sense. Let's say that you're running a data center that has evolved over the last couple of years. Some of the equipment in there is legacy equipment, running at 500 to 700 MHz speeds with drives that are a generation or so old. Along with that equipment, you have newer, faster servers and faster hard drives. You've also added a GbE sub-network so that you don't have to run your backups over the production LAN or SAN. And to top it off, you also have the ERP and Supply Chain database systems running on a Fibre Channel SAN. All of this is in the same data center, under one roof, one budget, and feeding into one PX720 library system. How do you take advantage of the library while avoiding the inherent bottlenecks and problems that shoe shining or interleaving will cause? Simple: teach the system to dance the foxtrot, tango, and waltz at the same time through partitioning the drives and adding a virtual tape library where necessary.

    Figure Addendum-13 on page 28 shows an oversimplified diagram of just such a data center, utilizing the PX720's multiple data paths and virtual library capabilities. In the diagram, we have three different data movers that are all sending their payloads directly to the PX720.

    1. Our first data mover is the XCOPY (SCSI Extended Copy) process going directly into the PX720. To the left of the diagram is a SAN with clusters of servers accessing and sharing data on a series of RAID and JBOD storage arrays. The first partition within the PX-Series virtual library is set for the direct XCOPY operation that the SAN and the on-board FC Router can process without the need for a backup server.

    2. Our second data mover is a backup server that is directly attached to the PX720 through a SCSI connection (to the right of the SAN is Backup Server 1). In our scenario, this backup server will handle most of the load for the backups and would probably have most of the drives within the PX720 assigned to its partition.

    3. Our third data mover is at the top right of the diagram. The PX720, Backup Server 2, and a DX30 virtual tape system are all interconnected on their own Fibre Channel SAN. In reality, you wouldn't need a dedicated Fibre Channel fabric just for connecting a DX30; we separated it in this diagram merely to show that there can be three distinct data movement paths to any of Quantum's libraries. For our scenario's purposes, this backup server (Backup Server 2) and the DX30 will own the final partition within the PX-Series library system. It will be used to back up the slower devices, eliminating both the shoe-shining (data under-run) and interleaving problems that slower legacy systems create. In this type of scenario, after the backup window has closed, data will be moved to the tape library over the FC connection from the backup server using the backup software's inherent tape copy or net-vaulting operations.
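
    The three-way split described above can be sketched as a simple routing rule. This is only an illustration: the partition labels, the BackupSource type, the example sources and their rates, and the idea of using the 30 MBps LTO-2 streaming rate as the cut-off for "slow" clients are assumptions, not settings from the PX720 or from any backup product.

```python
from dataclasses import dataclass

DRIVE_STREAM_MBPS = 30  # LTO-2 native rate quoted earlier, used as the cut-off

@dataclass
class BackupSource:
    name: str
    sustained_mbps: float      # what the source can reliably deliver
    on_fc_san: bool = False    # lives on the Fibre Channel SAN
    xcopy_aware: bool = False  # its application supports XCOPY

def choose_path(src: BackupSource) -> str:
    """Pick one of the three data paths for a backup source."""
    if src.on_fc_san and src.xcopy_aware:
        return "partition 1: XCOPY over FC (serverless)"
    if src.sustained_mbps < DRIVE_STREAM_MBPS:
        # Too slow to keep a tape drive streaming, so stage to disk first
        return "partition 3: DX30 virtual tape, copied to tape later"
    return "partition 2: Backup Server 1 direct to tape"

sources = [
    BackupSource("ERP database", 80, on_fc_san=True, xcopy_aware=True),
    BackupSource("NAS filer", 45),
    BackupSource("legacy print server", 4),
]
for s in sources:
    print(f"{s.name}: {choose_path(s)}")
```

    The point of the rule is the one the scenario makes: fast sources go straight to tape, while sources that cannot keep a drive streaming are staged to the virtual tape library and copied to real tape after the backup window closes.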

    Figure Addendum-13. Integrating the DX30 into the enterprise data-center backup environment

    [Diagram labels: Normal Production Network (PBX/Fax, GL/AR/AP, HR/Scheduling, EIS/Reporting, DNS/Web, AD/DNS/Security, Print Server/LDAP/eMail, IM/SCM/POS, NAS Boxes); GbE backup sub-network; Backup Server 1; Backup Server 2; DX30; Quantum PX720; Fibre Channel Storage Area Network with RAID and JBOD arrays; data paths 1 (XCOPY over FC), 2, and 3]

    In an architecture like this with a virtual tape library in the mix, the library can also hold a duplicate of a tape backup so that the duplicate is poised and ready for a very speedy restore in the event of an emergency.

    The Main Thing

    The Main Thing to understand here, the point that I hope you get after reading this addendum, is that you have a lot of choices. Those choices begin with a tape format selection, but don't end there. You can build a backup plan that isn't limited to a single set of dance steps. You can use the same equipment for different functions and different purposes running at different speeds, as long as your system, like those shown here from Quantum, offers you the flexibility, scalability, and manageability you need to create a consolidated data-center backup plan that covers most of the problems you'll face. You really can teach these Quantum systems to dance like Fred Astaire. And with technologies like the virtual tape system, you won't even need to stop for a shoe shine.

    For more information

    Hopefully, throughout this addendum you've been spurred on to think about your own data-center backup architecture: what you're doing versus what you could be doing better, what you could be tweaking or adding. This isn't all there is on the subject, especially when it comes to great products like the DX30 and tape libraries. I suggest that you use what's written here as a starting point and then go directly to the source: Quantum. We've worked closely with them on this addendum, and they're more than happy to talk to you and provide what you need.

    For more information on Quantum's tape libraries, visit them at:

    http://www.quantum.com/storagesolutions

    or call 1 (866) 827-1500 or 1 (410) 421-8999, ext 14 to speak to a Quantum Govt. specialist.

    INDEX

    ATA, 24-26

    The Backup Book, 5, 24

    Computer Associates, 25

    DLT, 6, 10

    DLTtape, 5-6, 15

    DX-Series, 24-26

    ERP, 27

    Fibre Channel

        SAN, 16, 27

        Storage Area Network, 14, 21

    Gartner's Dataquest, 18

    Gigabytes, 7-8

    IP, 19

    Legato's NetWorker, 25

    LTO, 5-7, 10-12

    Modular Libraries, 9-10, 17

    Modularity, 11, 14

    NAS, 4, 18-19

    NDMP, 4, 19

    Network

        Attached Storage, 18-19

        Data Management Protocol, 19

    Port Failover, 21

    Prism Management Card, 21

    Quantum's

        Libraries, 27

        M-Series, 13-14, 18

        PX-Series, 11, 18-21, 26-28

    SCSI, 4, 19-21, 27

    SDLT, 6-7, 9, 11-12, 25

    SES, 25

    SNMP, 21, 25

    Storage Area Network, 16-17

    Terabytes, 8

    VERITAS NetBackup, 21, 25

    Virtual Tape Library, 24, 26-28

    XCOPY, 4, 20-21, 27