HOW TO CREATE A NIMBLE BACKUP ARCHITECTURE FOR YOUR DATA CENTER
OR, TEACHING YOUR TAPE SYSTEMS TO DANCE WITHOUT STOPPING OFF FOR A SHOE SHINE
The Backup Book: Disaster Recovery from Desktop to Data Center
COPYRIGHT INFORMATION
Copyright 2003 by Network Frontiers, LLC
All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the author.
All brand names and product names mentioned in this book are trademarks or registered trademarks of their respective companies.
Schaser-Vartan Books
US Offices: West Dayflower Path, Lecanto, FL 34461
Feedback: [email protected]
http://www.backupbook.com
SAN # 255-2582
ISBN 0-9729039-0-9
Library of Congress Catalog Card Number: 2001274299
For more information on Quantum's tape libraries or DX30 Enhanced Backup Solutions, visit them at:
http://www.quantum.com/storagesolutions
or call 1 (866) 827-1500 or 1 (410) 421-8999, ext. 14 to speak to a Quantum Govt. specialist.
Courtesy of Quantum
CHAPTER ADDENDUM: HOW TO CREATE A NIMBLE BACKUP ARCHITECTURE FOR YOUR DATA CENTER
(OR, TEACHING YOUR TAPE SYSTEMS TO DANCE WITHOUT STOPPING OFF FOR A SHOE SHINE)
Can you create a nimble backup architecture for your data center? Sure, you can. If you design your architecture right, not only will it be able to dance, but you'll also overcome those data under-run problems that necessitate shoe shining!
During my tenure as the CIO of True North Communications (one of the largest ad agencies in the world, and now part of Interpublic Group), I was called into the office of Bruce Mason, our CEO at the time. "I've got a present for you; open it up," said Bruce, pointing to a small box on his coffee table. It was a pen. A nice pen, but a pen? "It'll make you look more professional in the MEC (the company's internal board) meetings in front of the other CEOs," he smiled. For a second I was dumbfounded, and then I started to laugh. "You mean because I've been bringing in pads of paper and a bunch of good ol' number-two pencils?" I asked. "Yep, you're all grown up now, so you can put them away." I thanked him, took the pen, and laughed all the way to my office.

I had been going into our CEO management meetings laden with pencils because I was drafting an outline for our new data center's backup and disaster recovery plan. Each time the CEOs in the meetings had decided that the company should
go in a different direction (about two or three times per meeting, with many meetings over the course of a couple of months), I had to erase part of my plan and overwrite it. I wasn't going to put anything in ink until I knew what the final plan was. As it turned out, even when everything was decided and our data center was being rebuilt and finalized, the company's data needs changed in rapid order, and we had to shift our priorities again (and again and again).
In the past, not a lot of us have considered data centers and backup architectures to be particularly nimble or easily changed. But with the storage networking technology that you'll read about here and some smart choices on your part, you can create a hardware plan that can be modified and reconfigured as quickly as your board's most capricious whims.
The first step toward finalizing your data center's backup architecture is making a wise media choice. The amount of data you need to back up, divided by your backup window length (usually eight hours), will clue you in to the throughput performance you'll need in your tape format. It will also determine the number of tape cartridges necessary for each backup. Throughput and tape count are two key design items in your plan.
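That first-step arithmetic is simple enough to sketch in a few lines of Python. This is our own illustration, not anything from a backup product; the function name, the 4 TB data set, and the 200 GB (native LTO-2) cartridge size are assumptions for the example:

```python
import math

def backup_requirements(data_gb, window_hours, cartridge_gb):
    """Back-of-the-envelope sizing: the native throughput (GB/hr) your
    tape format must sustain, and the cartridges needed per full backup.
    The window length and cartridge capacity are choices you supply."""
    required_gb_per_hr = data_gb / window_hours
    cartridges = math.ceil(data_gb / cartridge_gb)
    return required_gb_per_hr, cartridges

# Hypothetical example: 4 TB (4,096 GB) to move in an eight-hour
# window onto 200 GB native cartridges.
rate, carts = backup_requirements(4096, 8, 200)
print(rate, carts)  # 512.0 GB/hr needed, 21 cartridges
```

Those two outputs are exactly the "throughput and tape count" design items: the first tells you what drive speed (and how many drives) to shop for, the second how big a library you need.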
Once you know your throughput and cartridge count, you can make decisions about the number of drives and the library cartridge capacity you'll need to employ. You'll need to decide whether you want to build a modular, stackable system that can be expanded and upgraded occasionally, or whether you should consolidate your backup efforts and jump straight to a cabinet library that you can partition as necessary.

Your final step is to round out your design by choosing a few backup options. Will you choose to back up your NAS filers directly to your tape library over the GbE network using NDMP? What about leveraging an application's XCOPY (SCSI eXtended Copy) over Fibre Channel features? How will you overcome the problems of shoe shining and interleaving? Each of these questions is asked and answered once you've made your first two choices, because those choices will set the boundaries of what you can and can't do within your backup plan and backup architecture.
Creating a nimble backup architecture for your data center
FIRST STEP: CHOOSE YOUR SHOES CAREFULLY
If you're going to standardize on anything in your data center's backup plan, it should be the tape format for your backup media. Choosing a tape format is like choosing a pair of shoes for your dance: Birkenstocks are great for swaying to the Grateful Dead, but wouldn't do well in a tango. Do some research before making your decision. A thorough understanding of the market and the technologies is vital for mapping a strategy to handle your company's tape storage. Your choice of one technology over another should be based on up-front price, ongoing costs (TCO stuff), speed (both backup and restore), reliability, and capacity.
There are two basic types of tape recording formats: linear and helical scan. Let's face it: in the market, linear tape formats have won (we'll show you some charts in a page or two to prove our point), so let's just stick with linear tapes for this discussion. Linear tape technology uses a recording method in which data tracks are written in a linear pattern on the tape. The first set of tracks is recorded in parallel over the entire length of the tape. The recording heads are then repositioned and the tape direction is reversed to record the next set of tracks, again across the entire length of the tape, repeating this process in a serpentine fashion (remember The In-Laws? The original one? Peter Falk and Alan Arkin are beyond compare!) until all data has been recorded. The major linear formats for data recording are DLTtape (normal, super, and DLT1), LTO (Ultrium), and Travan (a great entry-level tape system).

With that said, here's the scoop: This is a data center we're talking about, not your home-brew backup plan. You should be looking at either the DLTtape family or the LTO family of tape systems. If you want to read about the others, pick up a copy of our most recent book, The Backup Book,¹ to get the skinny on all of them, because the purview of the book is much broader than what you're reading here. Here, we're talking about high-speed data centers, and therefore half-inch tape formats and tape automation libraries.
1. The Backup Book: Disaster Recovery from Desktop to Data Center, by Dorian J. Cougias, E. L. Heiberger, and Karsten Koop. S-V Books, 2003.
Why LTO or DLTtape families?
First, when planning for your data center, always try to plan for the winner in the market share race. The last thing you want is to invest in a dead-end technology. While the helical scan markets comprise a very solid 36 percent of the tape market today, the linear DLTtape and LTO formats command a whopping 63 percent of the market.
Figure Addendum-1. Worldwide market share in 2002
And in a market the size of $2.1 billion in factory revenue, 63 percent represents a huge number. With market share clearly pointing to the DLTtape and LTO families, how do you choose between them? Let's take a look at a couple of info-tables to bring to light some of the choices you can make. Below are the basic differences between the leapfrogging LTO and SDLT tape formats. DLT was the de facto standard for high-speed backups. LTO emerged, and then SDLT leapfrogged over it. Then SDLT was followed by LTO-2, and now SDLT 320 and 600.

Here's the 411 on the basics of both DLTtape and LTO formats.
                          LTO-1        SDLT 320   LTO-2        SDLT 600
Media Type                Ultrium 1    SDLT-1     Ultrium 2    SDLT-2
Servo Method              Pre-Format*  Backside   Pre-Format*  Backside
                          Magnetic     Optical    Magnetic     Optical
Cartridge Capacity (GB)   100          160        200          300
Transfer Rate (MB/s)      15-16        16         30-35        34
Max. GB/hr. (Native)      52-56        56         105-123      119

Table Addendum-1. Basic differences between LTO and SDLT
[Figure Addendum-1 data: LTO/DLT/SDLT 63%; 4mm helical scan 24%; 8mm helical scan 12%; 3480/90/3590 and 9840/9940 1%. Source: Gartner Dataquest, 2003]
What you should notice in Table Addendum-1 is that the number of recording tracks continues to increase with each iteration of the format's progression. This is one of the main reasons behind the capacity increases while the tape cartridge's physical size remains the same. While media costs rise, the cost per gigabyte actually decreases, because the density ratios allow you to put more gigabytes in the same form factor. And as the density ratios increase, so do the throughput speeds, which brings us to our second table, wherein you'll totally ignore everything we showed you above because you're only wondering one thing: How fast do they go?
                                       LTO-1        SDLT 320        LTO-2        SDLT 600
Back-Write Compatibility               N/A          SDLT 220        LTO-1        N/A
Back-Read Compatibility                N/A          DLT 4k/7k/8k,   LTO-1        SDLT 220, 320,
                                                    DLT-1 / VS80                 VS160
Recording Tracks                       384          448             512          720
Recording Channels                     8            8               8            16
Power Consumption (write/streaming)    28 watts     27 watts        28 watts     27 watts
MTBF (@ 100% Tape Motion)              250,000 hrs  250,000 hrs     250,000 hrs  250,000 hrs
Head Life (Tape Motion Hrs)            30,000 hrs   30,000 hrs      60,000 hrs   30,000 hrs
Media Durability (end-to-end passes)   20,000       20,000          20,000       20,000
Archival Storage Life                  30 years     30 years        30 years     30 years
Media Cost (Street @ volume)           $69          $99             $115         $149
Media Cost / GB                        $0.69        $0.62           $0.58        $0.50

* You can't bulk erase this format.

Table Addendum-1. Basic differences between LTO and SDLT (continued)

                                       LTO-1   SDLT 320   LTO-2*   SDLT 600
Native Throughput (MB/sec.)            15      16         30       33
Hourly Native Throughput (GB/hr.)      53      56         105      116
4 Drive Native Throughput (GB/hr.)     211     225        422      464
8 Drive Native Throughput (GB/hr.)     422     450        844      928
12 Drive Native Throughput (GB/hr.)    633     675        1,266    1,392
16 Drive Native Throughput (GB/hr.)    844     900        1,688    1,856
20 Drive Native Throughput (GB/hr.)    1,055   1,125      2,109    2,320

* Assumes HP LTO-2 (30 MB/s). IBM is rated slightly higher (35 MB/s).

Table Addendum-2. Basic throughput ratios
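If you want to check the throughput table yourself, every figure falls out of one conversion: multiply the native MB/s rate by 3,600 seconds and divide by 1,024 MB/GB (the table works in binary gigabytes), then scale by drive count. A quick Python sketch; the helper function is our own, not anything from a vendor:

```python
def native_gb_per_hour(mb_per_sec, drives=1):
    """Convert a drive's native transfer rate (MB/s) to GB/hr using
    binary gigabytes (1 GB = 1,024 MB), scaled by the drive count."""
    return round(drives * mb_per_sec * 3600 / 1024)

# Reproducing a few cells of Table Addendum-2:
print(native_gb_per_hour(15))      # LTO-1: 53 GB/hr
print(native_gb_per_hour(33))      # SDLT 600: 116 GB/hr
print(native_gb_per_hour(15, 20))  # 20 LTO-1 drives: 1055 GB/hr
print(native_gb_per_hour(30, 12))  # 12 LTO-2 drives: 1266 GB/hr
```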
The first important thing about tape speeds is the native throughput potential. In other words, how many gigabytes of data per hour can you push onto your tape drives? The answer depends upon your format, the number of drive mechanisms in your autoloader or library, and the compressibility of your data (SDLT and LTO both have built-in compression). Because there's no way in the universe we could figure out a constant for compression ratios, we'll show the raw numbers in native, non-compressed formats in Table Addendum-2. on page 7, breaking them down into common drive counts found in most autoloaders and tape libraries.
Once you're done examining Table Addendum-2. on page 7, the table you'll care about for planning purposes is Table Addendum-3., because it shows how much data you can move through in a night's normal eight-hour backup window. Most organizations are forced to run backups when the servers being backed up aren't in use, or at least are being used less than normal. This usually means that the data center's backup routines begin when normal folks go to bed. And since most data centers today measure the amount of data being backed up in terabytes instead of gigabytes, we show the number of terabytes that can be backed up during that eight-hour window.
Last but not least, let's go over the number of cartridges that you'll need to hold your data (see Table Addendum-4. on page 9). The amount of data that you have and the number of cartridges it takes to store that data comprise one of the most important factors to consider when you make your autoloader or library purchase decision. If you've passed the 100-cartridge mark, factoring in future growth, you're a candidate for multiple stacked libraries or a cabinet library system.
Let's wrap this up with this thought. For simplicity's sake, let's say that you have 10 TB of data that you have to move through your system and onto backup tapes during your nightly eight-hour backup window. This means that if you choose

LTO-1, you'll need 25 drives and about 103 cartridges.
                                          LTO-1   SDLT 320   LTO-2*   SDLT 600
4 Drive / 8 Hour Native Throughput (TB)   1.65    1.76       3.30     3.63
8 Drive / 8 Hour Native Throughput (TB)   3.30    3.52       6.59     7.25
12 Drive / 8 Hour Native Throughput (TB)  4.94    5.27       9.89     10.88
16 Drive / 8 Hour Native Throughput (TB)  6.59    7.03       13.18    14.50
20 Drive / 8 Hour Native Throughput (TB)  8.24    8.79       16.48    18.13

* Assumes HP LTO-2 (30 MB/s). IBM is rated slightly higher (35 MB/s).

Table Addendum-3. Throughput in TB during an eight-hour backup window
SDLT 320, you'll need 23 drives and 64 cartridges.

LTO-2, you'll need 13 drives and 53 cartridges.

SDLT 600, you'll need 12 drives and 35 cartridges.

As you can see, the faster and denser the tapes become, the fewer drives and fewer cartridges you'll need for your backup operations.
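The 10 TB example above can be reproduced with the same arithmetic. The sketch below (our own, using the per-drive GB/hr and cartridge capacities from the tables earlier) lands on or within one cartridge of the figures above; Table Addendum-4 rounds LTO-2 up to 53 cartridges where this raw arithmetic gives 52:

```python
import math

# Per-drive native GB/hr and native cartridge capacity (GB),
# taken from Tables Addendum-1 and Addendum-2.
formats = {
    "LTO-1":    (53, 100),
    "SDLT 320": (56, 160),
    "LTO-2":    (105, 200),
    "SDLT 600": (116, 300),
}

data_gb = 10 * 1024   # 10 TB, binary terabytes
window_hr = 8         # the nightly backup window

for name, (gb_per_hr, cart_gb) in formats.items():
    drives = math.ceil(data_gb / window_hr / gb_per_hr)  # drives to finish in the window
    carts = math.ceil(data_gb / cart_gb)                 # cartridges to hold the data
    print(f"{name}: {drives} drives, {carts} cartridges")
```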
Which dance steps are for you?
Once you know the basics about your media needs, it's time to take your first few dance steps to see how nimble your planning can become. If data center backup operations were simply a matter of moving all of your data onto backup tapes in one fell swoop, you'd have this licked and probably wouldn't be reading this paper right now. Lucky for me as a writer, the world isn't that simple (it gives me a raison d'être, Lebenszweck, scopo da vita: a life purpose). Not only must you design your backup architecture to move from high to massive quantities of data, you must design it to be versatile, reliable, and scalable, and those qualities are rooted in the hardware choices you make at this stage of the game.
Two types of tape systems are employed in today's data centers:

Modular libraries that hold 20-100 tapes and have one or more drives. These modules can be stacked and arranged like building blocks; and

Cabinet libraries that hold multiple drives (5-20) and as many tapes as you might expect in something named "library," and are extensible in ways that modular libraries are not.
                         LTO-1   SDLT 320   LTO-2   SDLT 600
10 Terabytes (Native)    103     64         53      35
20 Terabytes (Native)    205     128        103     69
30 Terabytes (Native)    308     192        154     103
40 Terabytes (Native)    410     256        205     137
50 Terabytes (Native)    512     320        256     171

Table Addendum-4. The cartridge count per terabyte
Modular libraries
Below is Quantum's M2500 modular library (the largest in their series of modular libraries). Each modular library has capacity for multiple tapes as well as for multiple drive mechanisms (some lower-end autoloaders have only a single drive mechanism). The series begins with the M1500 (two drives and up to 25 tapes), progresses to the M1800 (four drives and 50 tapes), and ends with this one.
Figure Addendum-2. Quantum M2500 tape library
The M2500 modular library shown above can hold up to six drives and up to 84 DLTtapes or 100 LTO tapes, giving it a native capacity of up to 20 TB. And of course, since it's modular, you can mix and match any of these systems to form a much larger one. We'll cover that mix-and-match concept in a few pages.
Cabinet libraries
Cabinet libraries, as their name implies, offer massive-capacity solutions for tape backups. These are the enterprise systems for storage, and are quite powerful tools. Quantum's PX720, shown below, is one such tape library.
Figure Addendum-3. Quantum PX720 tape library
Cabinet libraries are employed when you need the greatest density of storage per cubic foot of storage space. These libraries belong in the data centers of the world, and are thus optimized for data center usage. The library shown above can hold up to 20 SDLT or LTO tape drive mechanisms and 732 tapes, effectively giving it 200 TB of (native) storage capacity. And, with the ability to connect at least five of these together, that's one heck of a lot of data.
Building scalability, flexibility, and reliability into your backup architecture
When picking out a tape system for your organization, keep in mind three factors:
1. Flexibility
2. Modularity and scalability
3. Density
There's no such thing as an organization that creates less and less data each month. And you know that as soon as you make up your mind about what you want to put into the system, the organization will find new ways to make more data; therefore, your storage needs will grow. The only thing that's not going to grow is the square footage in your data center. So let's start with density first. Look for the highest volume-per-rack-unit you can find. In other words, based on the number of rack units your tape system uses, how many GB or TB of data can you store? If you're looking into a cabinet system, does the cabinet use only the back wall, or does it also use the left and right door space (who needs to look into a window in a tape cabinet, anyway)?
Second, flexibility must involve both types of scalability: internal growth and link-ability (which also gives you additional reliability). Internal-growth scalability means that the unit is large enough that you can add either drives or additional tapes to the system without having to purchase an entirely different system. The M2500, for example, allows the user to configure the system with one through six internal drive mechanisms, and can hold from one through 84 SDLT tapes or up to 100 LTO tapes. That's what we mean by internal growth.
Third, the stack-ability of a tape library system like the M2500 assures you that whatever money you spend on a library is a protected investment, not a wasted expense. Adding more modules to the first library (thus making it appear as a single unit to the backup software) means that the tape system can continue growing within the same rack while additional reliability is added to the system overall.
In terms of reliability, with more than a single drive mechanism in any of the tape libraries, the backup planner is pretty much guaranteed that at least one of them will be working at all times. If any of the drive mechanisms fail, the other mechanisms will take over until it can be repaired. However, if the entire unit fails, or the robot in the unit fails, that's another story, and the reason you want to look at link-ability for your scalability and reliability plans. On the following page, we show the M2500 on the left, and two M1800s on the right. Each of the M1800s holds 42 SDLT tapes (half that of the M2500) and up to four tape drive mechanisms.
By linking two M1800s together, the backup planner now has a total of eight drive mechanisms and the same number of tape slots (84) as the M2500.

Since Quantum's M-Series allows additional units to be linked and therefore act as a single system, it enables expansion beyond a single unit's physical limitations. It also offers greater reliability. Because all of the M-Series can be linked together, the backup planner can start with a single M1800 and then add more M1800s or additional M2500s until all 41 units in the rack are full. That could be three M2500s, one M2500 and three M1800s, or five M1800s: you get the picture.
Flexibility sometimes means redesigning on-the-fly
One of the things to ask about your modular tape library system is how flexible is flexible. When stacking systems, some libraries make you take the system down completely and then hard-wire everything together, essentially rebuilding and reconfiguring the system completely. Make sure that you employ a modular system that allows you to re-stack and rebuild while the unaffected units are online and running. This avoids downtime and gives your design true flexibility.
Flexibility should extend to cartridge loading
Figure Addendum-4. One M2500 or two stacked and linked M1800s

Quantum uses the magazine-load approach with their M-Series tape libraries. This greatly enhances speed of movement between systems if one unit fails. Within a minute or two, tapes can be moved from the failed module to the good module and given read and write access.
If hot dogs come in packages of 10, why do hot dog buns come in packages of eight? Where do you put the extra two? Think about where you're putting another displaced entity: your cleaning cartridge. It really bothers the heck out of me when a library forces me to put the cleaning cartridge in one of the magazine slots. Someone always forgets to take it out of the slot before the magazine goes offsite. Therefore, look at how your modular library handles cleaning cassettes as one of the keys to flexibility.
The other key to magazine flexibility is the ability to rotate it and use it on both sides of the library. I really hate libraries that have distinct left and right magazines: who came up with that brilliant idea?
And finally, flexibility in the cartridges and modularity also means this: If all the drives in one of the M-Series modules are busy, within seven seconds or so (give or take a few nanoseconds), the cartridge can be passed up or down the stack to a drive that's waiting for something to do.
And flexibility should extend to your backup network's architecture
Within today's data center network architectures, there are three traffic routes for moving data from the source to the backup server. The first method is to move the data over the same production network that the normal traffic flows through. The second method is to move the data through a specially designed GbE sub-network for backup purposes only. And the third method is to move the data over a Fibre Channel Storage Area Network (SAN).
Of course, when adding backup servers over the normal network, you'll need to make sure that the backup servers are attached at the highest level of the network's backbone. You wouldn't want the servers off on a spur, forcing traffic to go through several hubs, switches, or even routers. Each time your backups begin going through routers and multiple switches, you add time to the backup process because you're adding latency. On the next page (see Figure Addendum-5. on page 15), we show one of our networks that has a phalanx of servers and four different backup servers with their direct-attached DLTtape libraries.
Figure Addendum-5. Backup servers on work LAN
If you add four backup servers running simultaneously each night, make sure when you add them directly to the network that you add them on a switched backbone, or at least have a switched network segment that routes the traffic among the four individual backup servers and the computers they're simultaneously backing up. Think of it: If the backup servers were all on the same hub, they'd have to divide the network's total bandwidth by four. That would defeat the purpose.
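To put rough numbers on why the hub defeats the purpose, the sketch below compares a hub's divided bandwidth against a switched segment. The figures are illustrative assumptions, not measurements: a nominal 1,000 Mb/s GbE link with protocol overhead ignored (real TCP/IP throughput will be lower):

```python
# Four backup servers sharing one hub vs. each getting a switched port.
link_mbps = 1000   # nominal GbE; assumed, overhead ignored
servers = 4

shared_per_server = link_mbps / servers   # hub: bandwidth divided four ways
switched_per_server = link_mbps           # switch: full link per server

# Convert Mb/s to rough GB/hr (8 bits per byte, 1,024 MB per GB):
to_gb_per_hr = lambda mbps: mbps / 8 * 3600 / 1024

print(round(to_gb_per_hr(shared_per_server)))    # ~110 GB/hr each on a hub
print(round(to_gb_per_hr(switched_per_server)))  # ~439 GB/hr each, switched
```

On a hub, each backup server moves less data per hour than a single LTO-2 drive can write, which is exactly the shoe-shining scenario you're trying to avoid.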
Creating a GbE sub-network for backups
The second method for running backups is to create a dedicated sub-network just for backups and management traffic. While this might be pretty hard to do for backing up servers strewn throughout corporate facilities, it's relatively easy in a data-center environment because of proximity and the open wiring architecture of data centers. In Figure Addendum-6. on page 16, we segregated the backup servers onto their own switch and ran a second set of cables from each of the servers being backed up over to the backup sub-network (the dashed lines).
Figure Addendum-6. Backup servers on a sub-network for backup only
The additional costs are the gigabit Ethernet switch and each of the gigabit Ethernet cards that you must install into the servers being backed up.

The benefits are many. The backups won't have to run over the standard production network, so you won't have to worry about normal traffic interfering with the backup traffic. And you can install the fastest network cards in each of the servers being backed up, improving on normal network speeds.
Creating a SAN for your backups
Want a backup system superior to a sub-network? If you use a Storage Area Network (SAN) in your data center's server farm, you can run the backups over the Fibre Channel SAN instead of gigabit Ethernet. This provides the maximum amount of throughput, but is also the most complex method. As discussed previously, when multiple computers back up across the network to a single tape library, they transfer the data across that network to a single backup server that has connectivity to the tape device. On a SAN, each server can have equal access to the same tape device. Because each machine on the SAN directly communicates
with the tape device, transfer speeds are equivalent to having a device locally attached to the server's PCI bus.
In the diagram below, we show a small SAN with redundant fibre switches connecting each of the servers on the left with the storage farms on the right. Because the Quantum tape libraries (and others like them) can be connected to a SAN, and because most enterprise-level backup software can run backups over a SAN, this becomes a real option. We've kept the four backup servers in place because we need four restore servers running simultaneously in case of building loss, when we're faced with restoring everything in the data center as quickly as possible. Remember, each backup server can restore only one device at a time from tape.
Figure Addendum-7. Storage Area Network for backups
Now, before you get all excited about the types of designs you've seen and what modular libraries can do for you, let me cool your heels for a second. These designs and modular libraries are great for very specific backup architectures. If you're protecting your department's server farm, pick and choose your tape format, library style, and backup architecture infrastructure, and you'll be well on your way to bliss. But if you're designing a backup plan for an organization-wide data center, a backup plan that has several (and distinct) goals, you might want to hold off on your decision to build modular, and instead consider cabinet libraries that are both large and agile.
Being big doesn't mean you can't be agile
Modular tape libraries like Quantum's M-Series are fantastic, packing a rack with tape density while ensuring flexibility in design. However, there comes a point when you need more tape drives in the library than a modular system can handle. And with most organizations today following the beck and call to consolidate anywhere possible, organizational data centers are being inundated with more and more equipment that had previously been strewn hither and yon. Add to that the regulatory requirements for maintaining X data sets over Y years (such as critical information for litigation support), and what you have is a data center with needs for short-term archival solutions, long-term backup solutions, direct-attached storage, Network Attached Storage (NAS), and Storage Area Networks, all rolled into one consolidated data center.
In short, this means that you'll need vast quantities of backup storage. The folks at Gartner's Dataquest (July 2002) agree with me, noting the expected growth in sales for tape systems with fewer than 21 cartridges to be a measly 6 percent. Growth of sales for tape systems with 21-100 cartridges is expected to be 20 percent. And the ginormous size range of 101+ cartridge systems has an expected growth rate of around 10 percent.
In an enterprise data center, it's not that hard to have 30-40 TB of data that needs protecting. If you remember your math of a few pages ago, that means that you're working with a minimum of 200 tapes and, with a backup window of eight hours, about 50 drives. The PX720 that we talked about earlier has several drive configurations, with the ability to hold up to 20 drives and 732 tapes.
If you're moving 70 to 80 TB of data through your consolidated enterprise data center, that means that you're dealing with roughly three cabinets full of drives and tape cartridges. One of the benefits of Quantum's PX-Series products is that you can scale up to five of the library chassis together, giving you up to 3,600 total cartridges and 100 total drives.
Figure Addendum-8. Three PX720s that you might find in your data center
While these systems may look daunting, the deftness and agility of their internal architecture is quite amazing.
The Quantum PX-Series storage architecture
The PX-Series offers three things you need when building a nimble data-center backup architecture: flexibility, reliability, and storage density.
Their flexibility is delivered in the form of SCSI, 1 Gb and 2 Gb Fibre Channel, and IP storage interface connectivity, and in the range of drive options you can load into them.

Reliability is achieved through their hot-swap drives and redundant power supplies and fans.

And they offer the highest TB-per-square-foot density in the market today.
The PX-Series GbE library leverages GbE for high-speed data movement and the Network Data Management Protocol (NDMP) for control. Multiple Network Attached Storage (NAS) filers can simultaneously share an NDMP-aware tape library, significantly decreasing the backup window and consolidating backup storage to the same backup device.
And speaking of canisters, each PX-Series library includes cableless, user-replaceable, hot-swap drives that don't interrupt any library operations, including data throughput on a single common SCSI bus. Additional available hot-swap components include fans and power supplies. Both power supplies and fans are standard with N, N+1, or 2N redundancy.² Should a power supply or fan fail, the extra
component shares the load, eliminating the downtime that the failure would have otherwise caused. The downed component can then be replaced at a convenient time without powering down the library, even while it's performing backups and restores. Separate power inputs allow the library to be configured on separate power circuits, so if power goes out on one line (circuit-breaker trip or failure, power cord accidentally disconnected, etc.), the library is unaffected.
Figure Addendum-9. One of the canisters in a PX720 Series
The Prism Fibre Channel router is an industry first: an integrated Fibre Channel card with two 2 Gb Fibre Channel ports and four Ultra-2 SCSI ports for high performance. One of the benefits of moving to a Fibre Channel backup network is that the PX-Series systems support XCOPY for serverless backup operations through the use of a Fibre Channel router. The router is actually embedded within the library itself³. For those applications that are XCOPY aware, serverless backup will move the data directly from the disks used by the software to the tape system over the FC pipe, thus bypassing the backup server. By eliminating the backup server from the data path, XCOPY increases the availability of the server's resources and reduces the backup window immensely.
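To make that data-path point concrete, here is a toy Python sketch. Nothing in it is a real SCSI command or Quantum API; the function names and numbers are invented purely to illustrate why a serverless copy frees the backup server:

```python
# Toy model of the two backup data paths.  Invented functions and
# numbers for illustration -- not a real SCSI or Quantum interface.

def server_mediated_backup(dataset_mb):
    """Traditional LAN/SAN backup: every byte crosses the backup server
    twice -- once coming off disk, once going out to tape."""
    return 2 * dataset_mb  # MB of traffic through the backup server

def xcopy_backup(dataset_mb, command_overhead_mb=0.001):
    """Serverless backup: the server issues one copy command and the
    embedded FC router moves the payload disk -> tape directly."""
    return command_overhead_mb  # MB of traffic through the backup server

# For a 500 GB job, the server handles ~1 TB of traffic the old way,
# and effectively nothing with XCOPY:
print(server_mediated_backup(500_000))  # 1000000
print(xcopy_backup(500_000))            # 0.001
```

The saved terabyte of server traffic is exactly the resource availability the paragraph above describes.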
2. 2N is the doubling of a component, in case you couldn't figure that out.
3. For more information about XCOPY, or the PX-Series in general, please see http://www.theanswerisx.com.
For VERITAS NetBackup environments, Port Failover offers another layer of redundancy for a PX-Series library running within a 2 Gb Fibre Channel Storage Area Network (SAN). Port Failover enables the third Fibre Channel port to be reserved as a special failover port. If one of the other two Fibre Channel ports fails, the library will automatically fail over to the reserved port, preventing a disruption of the backup operation. In this scenario, the backup will continue unabated, and an ALERT e-mail will be sent via the Prism Management Card to notify the administrator of the event.
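The failover behavior just described can be sketched in a few lines of Python. The class and method names here are invented for illustration; this is not Quantum's Prism management interface:

```python
# Minimal sketch of port failover: two active FC ports, one reserved.
# All names are hypothetical -- this is not the Prism API.

class FibreChannelPorts:
    def __init__(self, active=("fc0", "fc1"), reserve="fc2", notify=None):
        self.active = dict.fromkeys(active, "up")  # port -> status
        self.reserve = reserve
        self.notify = notify or (lambda msg: None)

    def port_failed(self, port):
        """Fail the given port over to the reserved port and alert the
        administrator; the running backup continues unabated."""
        self.active[port] = "down"
        self.active[self.reserve] = "up"
        self.notify(f"ALERT: {port} failed over to {self.reserve}")
        return self.reserve

alerts = []
ports = FibreChannelPorts(notify=alerts.append)
ports.port_failed("fc1")
print(alerts[0])   # ALERT: fc1 failed over to fc2
```

The key design point mirrors the text: the failover is automatic and the administrator learns about it from the alert, not from a failed backup job.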
Which brings me to the last stop on the PX-Series architecture tour: Quantum's embedded Prism Management, which runs off the cabinet controller. This controller drives the Web-based management interface for the library system as well as the ALERT notification engine and the SNMP integration for the library. One of the Prism system's key features is its ability to remotely manage the partitioning of the library's drives, a must-have feature for consolidating heterogeneous environments with mixed drive types, operating systems, and/or ISV applications. Because Quantum's Web-based library partitioning is integrated within the library's internal architecture (and therefore doesn't require an external server for setup), it's both cross-platform and cross-application compatible. As shown in the diagram below, each library can be partitioned into three segments, alleviating your need for new libraries as your needs change. In the last part of this paper, we'll show you why partitioning (and re-partitioning when necessary) is an essential element of a data center's backup operations.
Web-based partitioning, individual canisters for ease of management, Fibre Channel and SCSI connectivity within the same system, or a system with direct GbE connections to each drive: that's what I call nimble. There is no more dexterous a tape library system than the PX-Series from Quantum. It seems that the only problems they can't solve are those related to the throughput of some of your slower legacy systems. But while Quantum can't speed up those old systems, they do have a unique way of working around the problems these slower units cause.
Partitioning the library's drives into three segments, or virtual libraries
The last hurdle you'll need to overcome: tape shoe-shining
Let's get back to where we started this discussion: your tapes. Today's tape drives' read/write heads are positioned precisely against the tape, which streams past the heads. As the device writes to the tape, the read head is positioned so it verifies each frame of data after writing it. Because the tape continually streams past the tape drive heads, a constant flow of information must reach the drive to keep the tape streaming. When the drive runs out of information to write to the tape (1), it must stop the tape (2), which takes a foot or two of actual tape; rewind the tape to a point behind where it left off (3); stop again (4); then start the tape stream again (5) so that it's up to speed by the time it reaches the last point at which it wrote data. The process looks like the back-and-forth motion of a shoe shine.
Figure Addendum-10. Tape shoe-shine process
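The throughput cost of this under-run cycle can be sketched with a rough Python model. The buffer size and reposition time below are illustrative assumptions, not specs for any particular drive:

```python
# Rough model of why a slow data source makes a streaming drive
# "shoe-shine".  Buffer size and reposition penalty are invented.

def effective_tape_speed(drive_mbps, source_mbps, buffer_mb=64,
                         reposition_s=2.0):
    """Model a streaming drive fed from a small buffer.  When the source
    is slower than the drive, the buffer periodically runs dry and the
    drive performs the five-step stop/rewind/restart cycle described
    above, costing `reposition_s` seconds of dead time per cycle."""
    if source_mbps >= drive_mbps:
        return drive_mbps  # the source keeps the tape streaming
    # Seconds to drain a full buffer while the source refills it:
    drain_s = buffer_mb / (drive_mbps - source_mbps)
    written_mb = drive_mbps * drain_s
    return written_mb / (drain_s + reposition_s)

# A 30 MB/s drive fed at only 5 MB/s spends much of its time repositioning:
print(round(effective_tape_speed(30, 5), 1))   # 16.8 -- far below 30
print(effective_tape_speed(30, 30))            # 30 -- streaming, no penalty
```

The exact numbers don't matter; the shape does. Any source slower than the drive's streaming threshold turns wall-clock time into repositioning instead of writing.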
To overcome this problem of data under-runs, the backup software vendors invented a process called interleaving, also known as multiplexing; we'll call it interleaving here. There are two basic methods: file interleaving and block interleaving. File interleaving writes file 1 from source 1 to the tape drive, buffering incoming files from other sources while this occurs. Once the file has been written to the tape, it then writes the next sequential incoming file. The diagram at right shows three backup sources interleaving their files onto a single tape. The only difference between file interleaving and block interleaving? Block interleaving writes data to the tape in 32 K chunks (or some other fixed block size). Block interleaving is better than file interleaving in that if the drive encounters a very large file, it doesn't have to handle it all at once before accepting data from the next source. Instead, it simply splits the large file into multiple blocks, thereby accepting data more evenly from each of the incoming sources.
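A minimal Python sketch of block interleaving makes the round-robin behavior visible. The source names are placeholders, and real backup applications add headers and error handling this sketch omits:

```python
# Sketch of block interleaving: multiple sources multiplexed onto one
# "tape" in fixed 32 K chunks, round-robin.  Purely illustrative.

def block_interleave(sources, block=32 * 1024):
    """Yield (source_id, chunk) pairs the way a multiplexing backup
    application lays them on tape: one fixed-size block from each source
    in turn, so no single large file stalls the other streams."""
    streams = {sid: memoryview(data) for sid, data in sources.items()}
    tape = []
    while streams:
        for sid in list(streams):
            chunk = streams[sid][:block]
            streams[sid] = streams[sid][block:]
            tape.append((sid, bytes(chunk)))
            if not streams[sid]:       # this source is exhausted
                del streams[sid]
    return tape

tape = block_interleave({"NAS1": b"a" * 70000, "NAS2": b"b" * 40000,
                         "POS1": b"c" * 100000})
# Blocks alternate between sources instead of one whole file at a time:
print([sid for sid, _ in tape])
```

Note that each source's bytes come back intact when reassembled; what changes is only how far apart they sit on the tape, which is exactly what makes restores slow, as the next section shows.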
Interleaving is great for speeding up backups because it ensures that the tape drive is always running at its maximum speed threshold. However, it's not so great for restorations, because it spreads the same amount of data from a single source over a greater length of tape. During restoration, the drive must find the first chunk, read it, jump to the next chunk, read it, and so on. It's a nice way to optimize the backups (and cartridge capacity utilization), but it's a complete disaster for restores, because it injects a vast amount of latency that must be taken into consideration.
The biggest mistake in sizing a backup system is to look at backup requirements alone. The sole reason for doing backups is the ability to restore; therefore, the restore requirements are extremely important. You must ask yourself how much time is allowed to bring back the data during a restore. In a worst-case-scenario test we ran, using a high-performance HP/IBM LTO-2 tape drive (30 MBps native transfer speed), 16 multiplexed clients meant that each client stream had to supply only about 2 MBps. Provided the network infrastructure and backup server can handle the load, the LTO-2 will zoom along, streaming, and all clients will happily supply this data rate. Say it takes eight hours to complete all jobs.
None of the backup packages on the market today can run multiplexed restores. They all schedule restores in a queue. Each job is matched against a tape, which is loaded into the next available tape drive, read, and demultiplexed. Only the stream that is relevant to the job is used; all other streams are discarded.
When restoring all the data to all systems, each client gets its data back at a measly 1/16th of the backup rate, because only one in 16 blocks of data is part of its stream. Effective restore rate: about 2 MBps. Each tape must be read 16 times (once for each stream), so now it takes 16 × 8 = 128 hours (over five days!) to fully restore all systems. That's very bad.
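The arithmetic above is worth working through explicitly. Using the numbers from the worst-case test described in the text:

```python
# The multiplexed-restore arithmetic from the text, worked through.
# Numbers follow the worst-case test: LTO-2 at 30 MB/s native,
# 16 multiplexed clients, an 8-hour backup window.

drive_mbps = 30          # LTO-2 native transfer speed
clients = 16             # multiplexed streams sharing the same tapes
backup_hours = 8

# What each stream supplies during backup (and gets back on restore,
# since only 1 block in 16 on the tape belongs to any one job):
per_client_mbps = drive_mbps / clients

# Each tape must be re-read once per client stream:
restore_hours = backup_hours * clients

print(round(per_client_mbps, 1))  # 1.9 -- the "measly" restore rate
print(restore_hours)              # 128 hours, over five days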
How do you avoid this problem? Your choices are to find a way to turn off interleaving (some applications won't let you, though), leaving you stuck with shoe-shining; to go back and redesign your entire data flow path (we cover that in The Backup Book), which can be daunting; or to solve the problem with a staged backup process.
In a staged backup process, you simply run the first set of backups to disk. Disks don't shoe-shine. Disks don't save data in interleaved formats, even though they can accept multiplexed backups. Disks are a great intermediary for data-center backups because they're a very reliable, simple technology that's been around for quite some time. However, there's one small problem with backing up to disk: Most backup software of data-center caliber is set up for backing up to tape. Rearranging the software and changing the media-handling metaphors can be daunting at best. Leave it to the Quantum engineers to come up with an answer both technologically sound and brilliantly simple: virtual tapes. Quantum has created a system that's actually made up of a series of RAID-protected hard drives but that, to backup software, looks and acts like a tape library with multiple tape drives.
Virtual Tape Libraries
The Quantum DX-Series of backup devices are Virtual Tape Library (VTL) products that emulate a tape library but are in fact a RAID system of very fast ATA hard drives. The DX family provides high-speed disk-to-disk backups, alleviating the problem of tape shoe-shining and therefore allowing the administrator to configure up to six simultaneous multiplexed streams. And since the multiplexed streams are going to disk, even though the DX systems read the backup file linearly, multiplexed restores are 10 times faster than tape restores. In other words, it's as fast as it could be, and you never have to worry about restoration slowdowns caused by multiplexing (while gaining the shorter backup-window advantages of multiplexing).
Figure Addendum-11. DX30 Virtual Tape Library
What the backup software sees is what it would expect to see from a six-drive tape library. In Figure Addendum-12 on page 25, we show a screen from VERITAS NetBackup running a backup to a 30-tape robot. That 30-tape robot is actually the DX30. To VERITAS NetBackup (the VERITAS products, Legato's Networker, Computer Associates' products, and others currently support this drive set), it's a tape library with 30 tapes in it.
Figure Addendum-12. VERITAS NetBackup running a backup to a DX30
Think of these DX systems as a two-part component set: a controller and the disk array(s). Utilizing software that resides on the controller, the DX-Series configures the disk arrays to look like a set of tapes and tape drives in a tape library. This allows zero configuration on the part of the backup software, as it now thinks that it's writing data to the DX30 in tape format, utilizing the standard media-management schema (which is sized to fit the storage capacity of the specific DX system in use). The DX30 controller (1U in size) and disk array (3U in size, with 24 ATA drives in the array) are a combined unit and have all the features you'll find in a mid-tier or enterprise system: Fibre Channel interface, SNMP, SES, RAID 5 or 10 arrays for the drives, dual cooling and power, etc.
The usable capacity of the first drive array in the DX30 is 3.2 TB (roughly equal to about 16 LTO-1 or 10 SDLT 320 tapes, given compression ratios). Two additional arrays (of 4.3 TB each, because they can hold more data) can be added to the system to bring the total expanded storage capacity up to 12.4 TB.
That's about where the comparison between the RAID 5 capacity of a disk array and the DX system's virtual tape capacity stops. But remember, kids, this is a printed update to a book, and printed materials are frozen in time the moment the press slaps ink onto paper; by the time you read this, the capacity will probably have grown faster than the weeds in my back yard.
For one thing, using ATA-based drives usually isn't a good thing for fast backups. When reading and writing the small chunks of data that incremental backups create, ATA drives are much slower than the speedy SCSI-based drives in high-end RAID arrays. However, ATA drives are lightning fast when they're used for sequential, large-block I/O writes. Because the DX family emulates a tape write rather than a drive write, the write's personality is changed to the large-block format that works great for the ATA drive base. A regular RAID array just can't do that, which is precisely why the Quantum controller technology in the DX-Series family is so important to the process.
When a backup software application (or an operating system, for that matter) writes to a RAID stripe, it has no idea of the optimal method to write the data, so the stripe size can be mismatched with the size of the data block being written. This mismatching causes more CPU overhead and uses more space than is necessary. In the DX-Series, the controller running the tape-emulation software is also tuned to match the disk array, so it can more precisely match the stripe sizes of the array and use fewer CPU cycles during the operation. That's one smart controller!
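The stripe-mismatch point can be illustrated with a toy calculation. The block and stripe sizes below are invented for the example; they are not DX-Series internals:

```python
# Toy illustration of stripe alignment: when a write doesn't end on a
# stripe boundary, the tail lands in a partial stripe, which on RAID 5
# forces extra read-modify-write work.  Numbers are invented.

def stripe_overhead(write_kb, stripe_kb):
    """Fraction of each write that lands in a partial stripe."""
    tail = write_kb % stripe_kb
    return tail / write_kb if tail else 0.0

# 100 K application writes onto 64 K stripes: over a third of every
# write is partial-stripe work.  128 K writes align perfectly -- which
# is the kind of matching a tuned controller can guarantee.
print(stripe_overhead(100, 64))   # 0.36
print(stripe_overhead(128, 64))   # 0.0
```

A controller that sizes its emulated tape blocks to the array's stripe width keeps every write on the right-hand, zero-overhead side of this calculation.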
Where does the DX30 virtual tape library belong in the backup network architecture?
Because the DX family virtual tape systems write their data to hard drives, you still need to pair them with actual tape drives so that you can rotate your tapes offsite. Therefore, within your architecture, they belong as a staging point for data. Their high throughput makes them excellent primary targets for windowed nightly incremental backups as well as weekly backups of huge quantities of data, because they can move that data from the source to the virtual tape system much faster than backing up to real tapes. Whether you're dealing with a backup window you have to live within or moving massive amounts of weekly backups, speed is your best friend, and the DX-Series provides plenty of it. Once the backup window has closed, you can move your data off to real tape.
In an enterprise data-center environment, pairing a DX30 virtual tape system with a Fibre Channel/SCSI-based PX720 tape library makes perfect sense. Let's say that you're running a data center that has evolved over the last couple of years. Some of the equipment in there is legacy equipment, running at 500-700 MHz speeds with drives that are a generation or so old. Along with that equipment, you have newer, faster servers and faster hard drives. You've also added a GbE sub-network so that you don't have to run your backups over the production LAN or SAN. And to top it off, you also have the ERP and supply-chain database systems running on a Fibre Channel SAN. All of this is in the same data center, under one roof, one budget, and feeding into one PX720 library system. How do you take advantage of the library while avoiding the inherent bottlenecks and problems that shoe-shining or interleaving will cause? Simple: teach the system to dance the foxtrot, tango, and waltz at the same time, by partitioning the drives and adding a virtual tape library where necessary.
Figure Addendum-13 on page 28 shows an oversimplified diagram of just such a data center, utilizing the PX720's multiple data paths and virtual library capabilities. In the diagram, we have three different data movers that are all sending their payloads directly to the PX720.
1. Our first data mover is the XCOPY (SCSI EXTENDED COPY) process going directly into the PX720. To the left of the diagram is a SAN with clusters of servers accessing and sharing data on a series of RAID and JBOD storage arrays. The first partition within the PX-Series virtual library is set for the direct XCOPY operation that the SAN and the on-board FC router can process without the need for a backup server.
2. Our second data mover is a backup server that is directly attached to the PX720 through a SCSI connection (to the right of the SAN is Backup Server 1). In our scenario, this backup server will handle most of the load for the backups and would probably have most of the drives within the PX720 assigned to its partition.
3. Our third data mover is at the top right of the diagram. The PX720, Backup Server 2, and a DX30 virtual tape system are all interconnected on their own Fibre Channel SAN. In reality, you wouldn't need a dedicated Fibre Channel fabric just for connecting a DX30; we separated it in this diagram merely to show that there can be three distinct data-movement paths to any of Quantum's libraries. For our scenario's purposes, this backup server and the DX30 will own the final partition within the PX-Series library system. It will be used to back up the slower devices, eliminating both the shoe-shining (data under-run) and interleaving problems that slower legacy systems create. In this type of scenario, after the backup window has closed, data will be moved to the tape library over the FC connection from the backup server, using the backup software's inherent tape-copy or net-vaulting operations.
Figure Addendum-13. Integrating the DX30 into the enterprise data-center backup environment
In an architecture like this with a virtual tape library in the mix, the library canalso hold a duplicate of a tape backup so that the duplicate is poised and ready fora very speedy restore in the event of an emergency.
The Main Thing
The Main Thing to understand here, the point that I hope you get after reading this addendum, is that you have a lot of choices. Those choices begin with a tape format selection, but they don't end there. You can build a backup plan that isn't limited to a single set of dance steps. You can use the same equipment for different functions and different purposes, running at different speeds, as long as your system, like those shown here from Quantum, offers you the flexibility, scalability, and manageability you need to create a consolidated data-center backup plan that covers most of the problems you'll face. You really can teach these Quantum systems to dance like Fred Astaire. And with technologies like the virtual tape system, you won't even need to stop for a shoe shine.
For more information
Hopefully, throughout this addendum you've been spurred on to think about your own data-center backup architecture: what you're doing versus what you could be doing better, what you could be tweaking or adding. This isn't all there is on the subject, especially when it comes to great products like the DX30 and tape libraries. I suggest that you use what's written here as a starting point and then go directly to the source: Quantum. We've worked closely with them on this addendum, and they're more than happy to talk to you and provide what you need.
For more information on Quantum's tape libraries, visit them at:
http://www.quantum.com/storagesolutions
or call 1 (866) 827-1500, or 1 (410) 421-8999, ext. 14, to speak to a Quantum Govt. specialist.
INDEX
ATA, 24-26
The Backup Book, 5, 24
Computer Associates, 25
DLT, 6, 10
DLTtape, 5-6, 15
DX-Series, 24-26
ERP, 27
Fibre Channel
  SAN, 16, 27
  Storage Area Network, 14, 21
Gartner's Dataquest, 18
Gigabytes, 7-8
IP, 19
Legato's Networker, 25
LTO, 5-7, 10-12
Modular Libraries, 9-10, 17
Modularity, 11, 14
NAS, 4, 18-19
NDMP, 4, 19
Network
  Attached Storage, 18-19
  Data Management Protocol, 19
Port Failover, 21
Prism Management Card, 21
Quantum's
  Libraries, 27
  M-Series, 13-14, 18
  PX-Series, 11, 18-21, 26-28
SCSI, 4, 19-21, 27
SDLT, 6-7, 9, 11-12, 25
SES, 25
SNMP, 21, 25
Storage Area Network, 16-17
Terabytes, 8
VERITAS NetBackup, 21, 25
Virtual Tape Library, 24, 26-28
XCOPY, 4, 20-21, 27