Protecting Your Data: Backups, Archives & Data Preservation DataONE Community Engagement & Outreach Working Group

Lesson Topics
  • Key Digital Preservation Concepts
  • Backups: Things to Consider
  • Data Preservation
  • Recommended Practices

Learning Objectives
After completing this lesson, the participant will be able to:
  • Define the differences between backups and archiving data
  • Identify significant issues related to data backups
  • Identify why backup plans are important and how they can fit into larger backup procedures
  • Discuss what data preservation covers
  • List several recommended practices

The DataONE Data Life Cycle

Data Protection, Backups, Archiving, Preservation
Differences at a Glance
  • Data Protection
      • Includes topics such as: backups, archives, & preservation; also includes physical security, encryption, and others not addressed here
      • More information about these topics can be found in the “References” section

Data Protection, Backups, Archiving, Preservation (continued)
  • Terms “backups” and “archives” are often used interchangeably, but do have different meanings
      • Backups: copies of the original file are made before the original is overwritten
      • Archives: preservation of the file
  • Data Preservation
      • Includes archiving in addition to processes such as data rescue, data reformatting, data conversion, metadata

A Closer Look: Backups vs. Archiving
  • Backups
      • Used to take periodic snapshots of data in case the current version is destroyed or lost
      • Backups are copies of files stored for short or near-long-term
      • Often performed on a somewhat frequent schedule

Why Perform Backups?
  • Limit loss of data, some of which may not be reproducible
  • Save time, money, productivity
  • Help prepare for disasters
      • Accidental deletions
      • Fires, natural disasters
      • Software bugs, hardware failures
  • Reproduce results of past procedures (if they were based on older files)
  • Respond to data requests
  • Limit liability

Backups: Things to Consider
  • What are the existing policies that might affect how and when you do data backups?
      • May be separate project, office, department, funding source, or organizational policies
      • Policies may differ between groups; which has precedence?
      • Are backups already part of a larger data management or contingency plan for your group?
  • Who is responsible for performing backups? Users? System administrators? Both?
  • Do these various policies fit your needs?

Backups: Things to Consider (continued)
  • How often should you do backups to capture significant change?
      • Cost versus benefit
      • Continually? Daily? Weekly? Monthly?
  • What kind of backups should you perform?
      • Partial: backing up only those files that have changed since the last backup
      • Full: backing up all files
      • How often and what kind will depend upon what kind of data you have and how unique it is
  • What about non-digital files (such as papers)?
      • Consider digitizing files
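
A partial backup needs some way to decide which files have changed since the last run. The sketch below shows one simple approach in Python, assuming that file modification times can be trusted and that the time of the previous backup was recorded as a Unix timestamp. The function name `files_changed_since` is hypothetical and not part of the lesson; real backup tools often use more robust change detection.

```python
import os
from pathlib import Path


def files_changed_since(root, last_backup_time):
    """Return files under `root` modified after `last_backup_time`.

    `last_backup_time` is seconds since the epoch (an assumption of this
    sketch; a real tool would persist it between runs).
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            # st_mtime is the last modification time reported by the file system
            if path.stat().st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Passing `0` as the timestamp selects every file, which corresponds to a full backup.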

Backups: Things to Consider (continued)
  • Where will you back up your files?
      • May depend upon project requirements, etc.
      • Personal external disk, centralized computer storage (Dropbox), “cloud” storage (Amazon, Google)
  • CDs and DVDs, while cheap and convenient, are not good media for backups
  • What metadata is needed when using these systems?
  • Are the files backed up individually or as one large file?
      • Consider that not all backups may be immediately available, depending on how the files are packaged
  • Good practice to keep backups in a different location than the source data
      • If both are kept together, a single disaster can destroy both versions of the data

Considerations
  • How are backups carried out?
      • Manual backups may work for single files, but they require the user to remember to perform them regularly and can be time-consuming
      • Automated backups can be run on a set schedule that doesn’t require the user to remember
  • What do I do if I need to get a file from backups?
      • Backup mode may determine how the file can be retrieved
      • You should know how to obtain files from backups, where they are located, and who to contact
      • You need to know this information beforehand, as often you need a file from a backup in an emergency!
  • Understanding the backup process is part of creating good data management practices
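
As a rough illustration of what an automated backup job might run, the Python sketch below packs a directory into a timestamped zip archive. It is a minimal sketch under stated assumptions: the function name `make_backup` and the `backup_YYYYMMDD_HHMMSS` naming pattern are hypothetical, the backup folder is assumed to sit outside the source folder, and the scheduling itself (for example cron or Windows Task Scheduler) is left to the operating system.

```python
import shutil
from datetime import datetime
from pathlib import Path


def make_backup(source_dir, backup_root, now=None):
    """Pack `source_dir` into a timestamped zip file under `backup_root`."""
    now = now or datetime.now()
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming pattern: one archive per run, sortable by name
    base_name = backup_root / f"backup_{now:%Y%m%d_%H%M%S}"
    # make_archive appends ".zip" and returns the path of the new archive
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(source_dir))
    return Path(archive)
```

Because each run writes a new archive instead of overwriting the previous one, older snapshots stay available until they are deliberately removed.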

Considerations
  • How do you verify a backup has been successfully performed?
      • Most backup software will have a log file that contains details of the backup (which files, when the backup was created)
      • However, don’t rely solely on the log file
      • Even if a log file states the backup was successful, you still need to check the backup to make sure the files are there and accessible
      • Test by trying to pull a file from the backup and restore it to another location
      • Hardware and software failures can happen after backups and log files are made
      • Make sure your system is backing up the correct files
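
The restore test described above can be scripted. The following Python sketch assumes the backup is a zip archive (an assumption; many backup systems use other formats): it extracts one file to a separate scratch location and compares it byte for byte with the original. The function name `restore_matches_original` is hypothetical.

```python
import filecmp
import zipfile
from pathlib import Path


def restore_matches_original(archive_path, member, original_path, restore_dir):
    """Restore `member` from a zip backup into `restore_dir` and compare it
    with the original file. Raises KeyError if `member` is not in the archive."""
    with zipfile.ZipFile(archive_path) as zf:
        # extract() returns the path of the restored file
        restored = Path(zf.extract(member, path=restore_dir))
    # shallow=False forces a comparison of the file contents, not just metadata
    return filecmp.cmp(restored, original_path, shallow=False)
```

A result of `False` does not always mean the backup is bad, since the original may have been edited after the backup was taken; it does mean the difference should be explained.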

Considerations
  • How do you verify a backup has been successfully performed?
      • If you are working with someone, such as an IT group, who helps manage and perform backups, confirm and verify that the backup process has been successfully completed
      • Since manual checks of all of the files in your backup are probably not possible, you should use other methods such as checking file sizes, date stamps, and checksum values. A checksum is a value calculated from the contents of a specific file. If the calculated checksums match between the backup copy and the original file, chances are the file is the same and was not modified when copied or stored.
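
A checksum comparison of this kind takes only a few lines of Python. The sketch below uses SHA-256 from the standard library; the choice of algorithm and the function names `sha256_of` and `same_size_and_checksum` are assumptions made for illustration, since the lesson does not prescribe a particular checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_size_and_checksum(original, backup):
    """Cheap size check first, then the more expensive checksum comparison."""
    original, backup = Path(original), Path(backup)
    if original.stat().st_size != backup.stat().st_size:
        return False
    return sha256_of(original) == sha256_of(backup)
```

Recording the checksum at backup time, rather than recomputing it from the original later, also makes it possible to detect changes in the backup copy after the original is gone.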

Considerations
  • Are there backups of the backups?
      • Necessary for high-value data
      • Usually different copies of backups are kept in different locations
  • How long do you keep your backups?
      • Depends upon specific situation, and should be determined in concert with stakeholders and resource managers
      • Understand relevant guidelines, policies and rules for retention of data
  • What are the long term storage and access solutions that are relevant for the project?
      • What to do when funding ends or key staff depart?
      • Changes in the status of the project, funding, or key staff are important reasons to have a full understanding of related options and requirements for storage and access
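
Once a retention period has been agreed with stakeholders, applying it can be automated. The Python sketch below only sorts backup dates into those inside and outside the retention window; the function name `split_by_retention` and the `keep_days` parameter are hypothetical, and the actual period must come from the relevant policies, not from the code.

```python
from datetime import date, timedelta


def split_by_retention(backup_dates, today, keep_days):
    """Split backup dates into (keep, expired) for a retention period in days."""
    cutoff = today - timedelta(days=keep_days)
    keep = sorted(d for d in backup_dates if d >= cutoff)
    expired = sorted(d for d in backup_dates if d < cutoff)
    return keep, expired
```

The sketch deliberately does not delete anything: expired backups should be reviewed, and copied or archived where needed, before disposal.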

Data in Real Life
A design firm was handling their own backups. The system was working and the backup software was reporting that the data was successfully backed up. The administrator checked the backups immediately after they were done and confirmed they were good.

Data in Real Life
After a computer virus erased most of their files, they went back to their backups. Unfortunately they found that the backups were all blank and all of the data was gone. Only after some investigation did they discover that the computer tapes (which contained the backups) were placed against a wall that had an elevator on the other side of it. When the elevator went past, the magnets inside erased all of the tapes.

Had they checked their backups properly, they probably would have noticed this before there was an emergency.

Final Considerations
  • Can you read data from older backups?
      • Media changes. You may no longer be able to read older versions and formats such as floppy disks, Jaz and Zip drives, WordPerfect files, etc.
      • Media can degrade quickly, unexpectedly, inconsistently
      • Even if you can open a file today, that doesn’t mean you can a month from now
  • How will you dispose of outdated data?
      • Make decision to copy, archive
  • Remember: back up the data you can’t afford to lose!

Data Preservation
  • By managing and preserving your data well, data rescue may not be necessary. Why?
      • Through the addition of relevant metadata, proper file naming (helps keep the file from getting lost in the system), use of proper file formats (lets you open the file without having to convert it), backups (limit loss of files), and suitable media types (limit degradation of files), you may limit or prevent the need for data rescue.
  • A good data management plan is another tool to help limit the need for data rescue.

Processes Related to Data Preservation
  • Includes backups and archiving in addition to processes such as data conversion, data reformatting, and data rescue
      • Older files may no longer be in a usable format and may require conversion or “rescue” before the data can be used.
      • Data reformatting, conversion, and backup become even more important as projects finish up and/or are no longer funded.
      • Data may have been kept at the end of the project, but if no one is managing the data, data may be left in formats that are no longer usable or in locations that are no longer accessible.
  • Additionally, data preservation requires planning, structure, and ongoing management and assessment

Preservation Formats and Version Strategies
  • Create useful, relevant metadata
  • Data Conversions and Formats
      • Use non-proprietary, standard formats
      • Convert text files from .doc or .xls to .txt, image files to .tiff or .pdf
      • Be sure to check files after converting them, as data, metadata, and formatting loss can occur
  • Versioning
      • Use consecutive numbers and letters to help keep track of changes to a file throughout various edits and revisions. This will help you quickly differentiate between files with similar names.
  • File Naming
      • Use file names that are consistent, descriptive, and concise so that you can find and quickly identify the file at a later time.
      • Rename files that have a default filename when exported such as “image.jpg” or “archive.zip”
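
One way to keep file names consistent is to generate them from the same few pieces of information every time. The Python sketch below is only an example convention, assumed for illustration: project, short description, date, and a zero-padded version number, joined with underscores. Neither the pattern nor the function name `build_filename` comes from the lesson; any scheme works as long as it is applied consistently.

```python
import re
from datetime import date


def build_filename(project, description, day, version, extension):
    """Build a name like 'lake-survey_water-temp_20160921_v03.csv'."""

    def slug(text):
        # Lowercase, and collapse anything that is not a letter or digit into "-"
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".")
    # Zero-padded version numbers keep files in order when sorted by name
    return f"{slug(project)}_{slug(description)}_{day:%Y%m%d}_v{version:02d}.{ext}"
```

For example, `build_filename("Lake Survey", "Water Temp", date(2016, 9, 21), 3, ".csv")` returns `lake-survey_water-temp_20160921_v03.csv`.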

Recommended Practices
  • Create a preservation policy that clearly identifies:
      • roles
      • responsibilities
      • where the data is backed up
      • how often the files are backed up
      • how to access the files
      • recommended file formats to be used
      • policies for migrating data to assure data are not lost due to media degradation or changing formats or programs
  • Review your preservation policy and plan periodically to ensure it is still valid and applicable

Recommended Practices (continued)
  • Minimize or remove reliance on users to perform own manual backups (if possible)
      • Implement standardized and automatic backups
      • If possible, put experts in charge of this task (computer staff) as they are more likely to keep up-to-date regarding software updates, hardware issues, best practices, etc.
  • Don’t assume backups are being performed for you
      • You don’t want to find out after the fact that no backups have been performed
      • If you are using third-party software (like Yahoo or Google Mail), what happens if they lose your files?
  • Use non-proprietary, standard formats
      • Convert text files from .doc or .xls to .txt, image files to .tiff or .pdf

Recommended Practices (continued)
  • Check your backups manually
      • Start with log files, as they may tell you the backup was unsuccessful
      • Do not rely solely on the log files: they may be incorrect, or the data may have become corrupted after the file was transferred
      • Look at file dates and file sizes to see if they match; calculate a checksum on the original and archived file and make sure they match
      • Ensure you can read files from older backups and archives
  • Have multiple versions of backups on multiple formats in multiple places
  • Good data management will limit the amount of data rescue that needs to be performed on older data
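
Checking sizes and checksums file by file quickly becomes tedious, so the comparison can be run over a whole directory tree. The Python sketch below assumes the backup is an ordinary directory that mirrors the original (not a packed archive) and that files are small enough to read into memory. The function names `build_manifest` and `compare_trees` are hypothetical.

```python
import hashlib
from pathlib import Path


def build_manifest(root):
    """Map each file's relative path to its (size, SHA-256 checksum)."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = (path.stat().st_size, digest)
    return manifest


def compare_trees(original_root, backup_root):
    """Return (missing, changed): files absent from the backup, and files
    whose size or checksum differs between original and backup."""
    original = build_manifest(original_root)
    backup = build_manifest(backup_root)
    missing = sorted(set(original) - set(backup))
    changed = sorted(
        name for name in original.keys() & backup.keys()
        if original[name] != backup[name]
    )
    return missing, changed
```

An empty `missing` list also answers a separate question from the checksum test: whether the system is backing up all of the files it is supposed to.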

Data in Real Life
In 2011, a software bug caused some Gmail users to lose access to their email. Fortunately, Google had backups!

Summary
  • Data preservation is more than just backing up and archiving your files
      • Organizational infrastructure, technological situation, resources
  • When devising a preservation strategy, one needs to consider how often to perform backups, where to back up, accessibility to backups and how long to keep the files
  • There are many reasons we need to perform backups, primarily to prevent data loss
  • Check for backups on outdated media and test your backups often!

References
1. Stanford University Libraries, Data Management Plans, (Stanford University Libraries), https://library.stanford.edu/research/data-management-services (accessed 09/21/2016)
2. Albanesius, Chloe, Google: Storage software update led to e-mail bug, http://www.pcmag.com/article2/0,2817,2381168,00.asp (accessed 09/21/2016)
3. Van den Eynden, Veerle, Corti, Louise, Woollard, Matthew, Bishop, Libby and Horton, Laurence, Managing and Sharing Data, http://www.data-archive.ac.uk/media/2894/managingsharing.pdf, and companion materials, https://www.ukdataservice.ac.uk/manage-data/handbook (accessed 09/21/2016)

For more information about physical security, encryption, and data disposal, visit: http://www.data-archive.ac.uk/media/2894/managingsharing.pdf

About
  • Participate in our GitHub repo: https://dataoneorg.github.io/dataone_lessons/
  • The full slide deck (in PowerPoint) may be downloaded from: http://www.dataone.org/education-modules
  • Suggested citation: DataONE Education Module: Data Management. DataONE. Retrieved November 12, 2016. From http://www.dataone.org/sites/all/documents/L01_DataManagement.pptx
  • Copyright license information: No rights reserved; you may enhance and reuse for your own purposes. We do ask that you provide appropriate citation and attribution to DataONE.