IMPLEMENTING COLLOCATION GROUPS #1

tsm-symposium.oucs.ox.ac.uk/2005/papers/Implementing...



About Draper Lab

• An independent, not-for-profit corporation dedicated to applied research, engineering development, education, and technology transfer
  – Spun off from the Massachusetts Institute of Technology in 1973
  – Expertise in guidance, navigation and control systems
  – Early applications: U.S. Navy's Fleet Ballistic Missile Program and NASA's Apollo Program


Agenda

• Why collocation groups?
• ITSM code components
• Additional tools
• A process to move 40TB
• Conclusions


Why do I want Collocation groups?

• Number of nodes vs. number of slots
  1. Nodes < slots; collocate by node or filespace
  2. Nodes > slots; can't collocate
• If collocate is on, no control of node mixing, still 1 mount per node during migration
• Node size vs. tape capacity
  1. Size > tape cap.; collocation fills tape
  2. Size < tape cap.; collocation wastes tape
• Collocation by group makes "supernodes" which work for both case 1's


Server Configuration

Acronym  Function         DB size (pages)  Number of files  Physical TB
LM       Library manager            5,500           65,000
SS       For servers            8,500,000       47,500,000           18
SD       For desktops          45,400,000      230,000,000           40
SD2      For desktops           4,300,000       33,000,000            2

• Sun v480 4 processors, Solaris 9
• Raw disk for db, log, backuppool, no raid
• TSM server code at 5.3.1.3


The starting SD server mess

• Volumes
  – 417 to process
  – Average nodes / volume is 188
  – Max is 713
  – 25 are over 500
• Nodes
  – 1635 nodes
  – Average 48 volumes / node
  – Max is 132
  – 25 are over 100


New server commands

• Def, del, upd, query collocgroup
  – names and describes the group
• Def, del collocmember
  – adds a node to a group
• Query nodedata
  – Very fast!!
  – Lists tapes which have files for a node or group, no separation by filespace
• Upd stgpool colloc=group
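
The commands above could be strung together roughly as follows. This is a minimal Python sketch that only builds the administrative command text; the group, node, and pool names are hypothetical, and the keyword spellings (description=, collocgroup=, stgpool=, collocate=) are written from memory of the 5.3 administrative syntax, so check them against the Administrator's Reference before use.

```python
def collocgroup_commands(group, description, nodes, stgpool):
    """Build the admin commands that create one group and switch a pool to it."""
    return [
        # Name and describe the group.
        f'define collocgroup {group} description="{description}"',
        # Add the member nodes (comma-separated list).
        f"define collocmember {group} {','.join(nodes)}",
        # Turn on collocation by group for the tape pool.
        f"update stgpool {stgpool} collocate=group",
        # List the tapes that hold data for the whole group.
        f"query nodedata collocgroup={group} stgpool={stgpool}",
    ]


# Hypothetical names, for illustration only.
commands = collocgroup_commands(
    "GROUP001", "desktops, batch 1", ["NODE_A", "NODE_B"], "TAPEPOOL"
)
for line in commands:
    print(line)
```

The output can be saved to a file and run as a macro from the administrative client.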


The secret perl scripts

• 4 scripts in the bin directory, not documented
• Used only defgroups.pl
  – Analyzes ‘q occ’ data, creates define statements to build the groups
  – Execution
    • ./defgroups.pl id pwd domain size [execute]
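
The slides do not show how defgroups.pl decides which nodes go together, so the following is only an illustration of the general idea: pack nodes into groups up to a target size. It uses first-fit decreasing, which is an assumption and not necessarily what the script does; node names and occupancy figures are made up.

```python
def build_groups(occupancy_mb, size_mb):
    """First-fit decreasing: place each node in the first group with room left."""
    groups = []  # each entry: [total_mb, [node names]]
    ordered = sorted(occupancy_mb.items(), key=lambda kv: kv[1], reverse=True)
    for node, mb in ordered:
        for group in groups:
            if group[0] + mb <= size_mb:
                group[0] += mb
                group[1].append(node)
                break
        else:
            # No group has room (or the node alone exceeds the target):
            # start a new group for it.
            groups.append([mb, [node]])
    return groups


# Hypothetical occupancy per node, in MB, as 'q occ' data might total up.
occupancy_mb = {
    "NODE_A": 500_000,
    "NODE_B": 350_000,
    "NODE_C": 300_000,
    "NODE_D": 900_000,
}
groups = build_groups(occupancy_mb, 800_000)
for number, (total, members) in enumerate(groups, start=1):
    print(f"define collocgroup GROUP{number:03d}")
    print(f"define collocmember GROUP{number:03d} {','.join(members)}")
```

A node larger than the target size ends up alone in its own group, which matches the "size > tape capacity" case where collocation fills tapes anyway.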


Fix the defgroups.pl SQL

• Eliminate stgpool subselect
  – Change stgpool subselect to in list, name your tape stgpool
• Eliminate join between nodes & occupancy
  – check domain_name with a subselect
• Eliminate check for a collocgroup
  – It is always null while implementing
• Runtime drops from “beyond the limits of my patience” to 5 minutes
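
The original defgroups.pl SQL is not reproduced in the slides, so the query below only sketches the shape of the rewritten select: an in list for the tape pool, a subselect on domain_name instead of a join, and no collocgroup test. SQLite stands in for the TSM server so the example runs anywhere; the table and column names follow the TSM OCCUPANCY and NODES tables as I recall them, and the pool, domain, and node names are hypothetical.

```python
import sqlite3

QUERY = """
select node_name, sum(physical_mb) as total_mb
  from occupancy
 where stgpool_name in ('TAPEPOOL')
   and node_name in (select node_name from nodes where domain_name = 'DESKTOPS')
 group by node_name
 order by total_mb desc
"""

# In-memory stand-in for the server database, with made-up rows.
db = sqlite3.connect(":memory:")
db.executescript("""
create table nodes (node_name text, domain_name text);
create table occupancy (node_name text, stgpool_name text, physical_mb real);
insert into nodes values
  ('NODE_A', 'DESKTOPS'), ('NODE_B', 'DESKTOPS'), ('NODE_C', 'SERVERS');
insert into occupancy values
  ('NODE_A', 'TAPEPOOL', 1000), ('NODE_A', 'TAPEPOOL', 500),
  ('NODE_A', 'BACKUPPOOL', 50), ('NODE_B', 'TAPEPOOL', 2000),
  ('NODE_C', 'TAPEPOOL', 9000);
""")

rows = db.execute(QUERY).fetchall()
for node_name, total_mb in rows:
    print(node_name, total_mb)
```

Only desktop-domain nodes with data in the named tape pool come back, with their occupancy summed across filespaces.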


Using Query Nodedata

• SQL generates a command for each node
  – Also 'q nodedata * stg=pool_name'
• Run file from step1, direct output to 2nd file
  – ‘q nodedata’ doesn’t have a corresponding SQL table (the very expensive volumeusage table is close)
• Edit output to get only node name and volume name
• Load into MySQL
• Analyze
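
The edit, load, and analyze steps can be approximated without a database for small samples. This sketch assumes the 'q nodedata' output was captured in comma-delimited form with node name and volume name as the first two fields; that layout and the sample rows are assumptions. It produces the same kind of figures quoted elsewhere in the slides: nodes per volume and volumes per node.

```python
import csv
import io
from collections import defaultdict

# Hypothetical comma-delimited capture: node, volume, pool, MB.
SAMPLE = """\
NODE_A,VOL001,TAPEPOOL,120.5
NODE_A,VOL002,TAPEPOOL,33.0
NODE_B,VOL001,TAPEPOOL,8.2
NODE_C,VOL001,TAPEPOOL,0.4
"""


def load_pairs(text):
    """Keep only the node name and volume name from each output row."""
    reader = csv.reader(io.StringIO(text))
    return [(row[0], row[1]) for row in reader if len(row) >= 2]


def summarize(pairs):
    """Count distinct nodes per volume and distinct volumes per node."""
    nodes_on_volume = defaultdict(set)
    volumes_for_node = defaultdict(set)
    for node, volume in pairs:
        nodes_on_volume[volume].add(node)
        volumes_for_node[node].add(volume)
    per_volume = [len(s) for s in nodes_on_volume.values()]
    per_node = [len(s) for s in volumes_for_node.values()]
    return {
        "volumes": len(per_volume),
        "avg_nodes_per_volume": sum(per_volume) / len(per_volume),
        "max_nodes_per_volume": max(per_volume),
        "nodes": len(per_node),
        "avg_volumes_per_node": sum(per_node) / len(per_node),
        "max_volumes_per_node": max(per_node),
    }


summary = summarize(load_pairs(SAMPLE))
print(summary)
```

For the full data set a desktop database is still the better fit, since it allows joins against group membership.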


Tools

• MySQL desktop development server
  – Very handy to have!
  – No select for nodedata
  – Do complex joins without killing the server
  – http://www.mysql.com/
• UltraEdit editor
  – Sorting, column editing, hex editing
  – http://www.ultraedit.com/


Preliminaries

• Decide target number of tapes in each group
  – Convert it to 'size in megs' for defgroups.pl
  – Goal is 4 tapes
  – We compress at the client, so lto2 capacity is 200G, 'size' is 800,000
• Run defgroups.pl on domain(s)
• Execute the commands from defgroups.pl
• 'Update stgpool <name> colloc=group'
• Mark all current tapes readonly
  – Stops migrating to uncollocated filling tapes
  – Makes SQL easier
• Have as many scratch tapes as groups
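
The conversion from "tapes per group" to the 'size' argument is simple arithmetic. The sketch below assumes decimal megabytes (1 GB = 1000 MB), which is the reading that reproduces the 800,000 figure.

```python
TAPE_CAPACITY_GB = 200  # LTO2 capacity with no drive compression gain
TAPES_PER_GROUP = 4     # the stated goal
MB_PER_GB = 1000        # assumption: decimal megabytes

# 'size' argument for defgroups.pl, in megabytes.
size_mb = TAPES_PER_GROUP * TAPE_CAPACITY_GB * MB_PER_GB
print(size_mb)
```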


A process to minimize tape mounts

• By turning on collocation by group, a move or reclaim within the tapepool will need an output tape mount for each collocgroup on the input tape.
  – Potentially very slow, stressful for the tape drives
• Solution is to move data from tape to devt=file pools on disk where files are put into groups, then migrate back to tape.
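
To see why staging on disk helps, compare output mounts under a deliberately simple model. The direct count follows the rule stated above (one output mount per group per input tape). The via-disk count assumes the best case, where every group is written to tape in a single mount after staging; real runs with limited disk will need more. Volume and group names are hypothetical.

```python
def output_mounts_direct(groups_on_tape):
    """One output mount per collocation group found on each input tape."""
    return sum(len(groups) for groups in groups_on_tape.values())


def output_mounts_via_disk(groups_on_tape):
    """Best-case assumption: after staging on disk, one mount per group."""
    return len(set().union(*groups_on_tape.values()))


# Which groups have data on which input tape (made-up example).
groups_on_tape = {
    "VOL001": {"G1", "G2", "G3"},
    "VOL002": {"G2", "G3"},
    "VOL003": {"G1", "G3"},
}
direct = output_mounts_direct(groups_on_tape)
via_disk = output_mounts_via_disk(groups_on_tape)
print(direct, via_disk)
```

The gap widens quickly as the number of groups per input tape grows.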


Storage pools

• 3 sequential pools on disk
  – seqdisk3, seqdisk4, seqdisk5
• 2 pools receive data from tapes
  – Seqdisk3 & 4 each have 2 69G volumes
  – Not collocated, moves don’t reconstruct
• Seqdisk5 receives data from seqdisk3 & 4
  – 170 8GB volumes on 10 146GB drives, each with its own file system.
  – Collocated by group, moves reconstruct
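
A quick capacity check on the seqdisk5 layout, assuming the 170 volumes are spread evenly across the 10 drives (the slides do not say how they are distributed):

```python
VOLUMES = 170
VOLUME_GB = 8
DRIVES = 10
DRIVE_GB = 146

# Assumption: volumes are divided evenly among the drives.
volumes_per_drive = VOLUMES // DRIVES
used_per_drive_gb = volumes_per_drive * VOLUME_GB
pool_gb = VOLUMES * VOLUME_GB

print(volumes_per_drive, used_per_drive_gb, pool_gb)
```

Under that assumption each drive carries 17 volumes, which fits inside a 146GB file system with some room left for overhead.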


The schedules and scripts

• Each script is executed every 10 minutes by a schedule
  – 6 similar schedules for each script
    • For script a, run at 00:00, 00:10, 00:20, etc.
• T4_VOLUMES_ODD, moves odd numbered volumes to seqdisk3
• T4_VOLUMES_EVEN, moves even numbered volumes to seqdisk4
• T4_MOVES, moves seqdisk3 & 4 volumes to seqdisk5
• T4_MIGRATES, starts migration of seqdisk5 to tape
• T4_VOLUMES_DIRECT, moves some tapes direct to tape
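
One way to get a ten-minute cadence is six hourly administrative schedules per script, offset by ten minutes each. The sketch below generates the define statements; the schedule naming (script name plus a slot number) is hypothetical, and the parameter spellings are written from memory of the define schedule syntax, so verify them before running.

```python
def ten_minute_schedules(script_name):
    """Six hourly administrative schedules, offset by ten minutes each."""
    commands = []
    for slot in range(6):
        start = f"00:{slot * 10:02d}"
        commands.append(
            f"define schedule {script_name}_{slot} type=administrative "
            f'cmd="run {script_name}" active=yes '
            f"starttime={start} period=1 perunits=hours"
        )
    return commands


schedules = ten_minute_schedules("T4_MOVES")
for line in schedules:
    print(line)
```

Repeating the call for each of the five scripts gives the full set of schedules.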


SQL to make the scripts

• Use a file as a macro to create the script
• The T4_VOLUMES* script has a prolog with logic
  – checks if backuppool migration is running, exit if yes
  – checks if SEQDISK3 is being used, exit if yes
  – checks for space in SEQDISK3, if yes then run
• Run SQL to select odd/even volumes ordered by pct_utilized and append it to the file
• For each volume, need 4 lines in the script
  – test if the volume is still full or filling
  – goto script lines to issue a move command
  – issue the move command
  – exit
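
The four lines per volume might be generated along these lines. This is one reading of the slide, not the author's actual script: the prolog checks are omitted, the label names are hypothetical, and the select, "if (rc_ok) goto", and "label: command" forms are written from memory of server script syntax. The first volume that is still full or filling gets moved, then the script exits until its next scheduled run.

```python
def move_script_lines(volumes, target_pool):
    """Emit a test and goto per volume, then a labelled move and exit per volume."""
    tests, moves = [], []
    for volume in volumes:
        label = f"mv_{volume}"
        # Line 1: does the volume still hold data (full or filling)?
        tests.append(
            "select volume_name from volumes "
            f"where volume_name='{volume}' and status in ('FULL','FILLING')"
        )
        # Line 2: if a row came back, jump to this volume's move command.
        tests.append(f"if (rc_ok) goto {label}")
        # Line 3: the move itself; line 4: stop after starting one move.
        moves.append(f"{label}: move data {volume} stgpool={target_pool}")
        moves.append("exit")
    # Exit between the two sections so nothing runs when every test fails.
    return tests + ["exit"] + moves


# Hypothetical odd-numbered volumes headed for SEQDISK3.
lines = move_script_lines(["A00001", "A00003"], "SEQDISK3")
for line in lines:
    print(line)
```

Because volumes are appended in pct_utilized order, the script always picks the next candidate in that order that has not yet been emptied.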


Other methods to move all that data

• Direct tape to tape within the pool
  – Not as bad as I had feared!
  – Analyzed which tapes had the fewest groups on them and moved them tape to tape.
    • Of 278 tapes, 219 have 30 or more (42 max)
• Move nodedata direct tapes to tapes
  – Move nodedata list-of-all-the-nodes-in-group
  – Need extra scratch tapes because source tapes aren't emptied quickly
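
Both methods can be driven from the node-to-volume pairs plus the group membership. The sketch below ranks tapes by how many groups they hold and builds a move nodedata command for one group; the node, volume, group, and pool names are hypothetical, and the fromstgpool= keyword is written from memory of the command syntax.

```python
from collections import defaultdict


def groups_per_volume(pairs, group_of):
    """Return (group count, volume) tuples, fewest groups first."""
    seen = defaultdict(set)
    for node, volume in pairs:
        seen[volume].add(group_of[node])
    return sorted((len(groups), volume) for volume, groups in seen.items())


def move_nodedata_command(group, group_of, stgpool):
    """List every node in the group on one move nodedata command."""
    nodes = sorted(n for n, g in group_of.items() if g == group)
    return f"move nodedata {','.join(nodes)} fromstgpool={stgpool}"


# Made-up membership and node/volume pairs.
group_of = {"NODE_A": "G1", "NODE_B": "G1", "NODE_C": "G2"}
pairs = [
    ("NODE_A", "VOL001"), ("NODE_B", "VOL001"),
    ("NODE_C", "VOL002"), ("NODE_A", "VOL002"),
    ("NODE_C", "VOL003"),
]
ranking = groups_per_volume(pairs, group_of)
command = move_nodedata_command("G1", group_of, "TAPEPOOL")
print(ranking)
print(command)
```

Tapes at the top of the ranking are the cheapest candidates for a direct tape to tape move.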


The results so far

• Started on Aug-5, results as of Sep-8
• Volumes
  – 160 to process
  – Average nodes / volume is 188
  – Max is 485
  – 10 are over 400
• Nodes
  – 1629 nodes
  – Average 22 volumes / node
  – Max is 63
  – 4 are over 50


Summary

• Match your process to your resources
  – Does your disk write speed match your tape read speed?
• The more groups you have, the longer a tape to tape move or reclaim will take.
• Do 2 processes?
  – Few cg's on a tape, do tape to tape.
  – Lots of cg's on a tape, do tape to file.