Lj Article

  • 8/2/2019 Lj Article


The Subversion Project: Building a Better CVS
==============================================

Ben Collins-Sussman
Written in August 2001
Published in Linux Journal, January 2002

Abstract
--------

This article discusses the history, goals, features and design of Subversion (http://subversion.tigris.org), an open-source project that aims to produce a compelling replacement for CVS.

Introduction
------------

If you work on any kind of open-source project, you've probably worked with CVS. You probably remember the first time you learned to do an anonymous checkout of a source tree over the net -- or your first commit, or learning how to look at CVS diffs. And then the fateful day came: you asked your friend how to rename a file.

"You can't", was the reply.

What? What do you mean?

"Well, you can delete the file from the repository and then re-add it under a new name."

Yes, but then nobody would know it had been renamed...

"Let's call the CVS administrator. She can hand-edit the repository's RCS files for us and possibly make things work."

What?

"And by the way, don't try to delete a directory either."

You rolled your eyes and groaned. How could such simple tasks be difficult?

The Legacy of CVS
-----------------

No doubt about it, CVS has evolved into the standard Software Configuration Management (SCM) system of the open source community. And rightly so! CVS itself is Free software, and its wonderful "nonlocking" development model -- whereby dozens of far-flung programmers collaborate -- fits the open-source world very well. In fact, one might argue that without CVS, it's doubtful whether sites like Freshmeat or Sourceforge would ever have flourished as they do now. CVS and its semi-chaotic development model have become an essential part of open source culture.

So what's wrong with CVS?

Because it uses the RCS storage-system under the hood, CVS can only track file contents, not tree structures. As a result, the user has no way to copy, move, or rename items without losing history. Tree rearrangements are always ugly server-side tweaks.

The RCS back-end cannot store binary files efficiently, and branching and tagging operations can grow to be very slow. CVS also uses the network inefficiently; many users are annoyed by long waits, because file differences are sent in only one direction (from server to client, but not from client to server), and binary files are always transmitted in their entirety.

From a developer's standpoint, the CVS codebase is the result of layers upon layers of historical "hacks". (Remember that CVS began life as a collection of shell-scripts to drive RCS.) This makes the code difficult to understand, maintain, or extend. For example: CVS's networking ability was essentially "stapled on". It was never designed to be a native client-server system.

Rectifying CVS's problems is a huge task -- and we've listed only a few of the many common complaints here.

Enter Subversion
----------------

In 1995, Karl Fogel and Jim Blandy founded Cyclic Software, a company for commercially supporting and improving CVS. Cyclic made the first public release of a network-enabled CVS (contributed by Cygnus Software). In 1999, Karl Fogel published a book about CVS and the open-source development model it enables (cvsbook.red-bean.com). Karl and Jim had long talked about writing a replacement for CVS; Jim had even drafted a new, theoretical repository design. Finally, in February of 2000, Brian Behlendorf of CollabNet (www.collab.net) offered Karl a full-time job to write a CVS replacement. Karl gathered a team together and work began in May.

The team settled on a few simple goals: it was decided that Subversion would be designed as a functional replacement for CVS. It would do everything that CVS does -- preserving the same development model while fixing the flaws in CVS's (lack-of) design. Existing CVS users would be the target audience: any CVS user should be able to start using Subversion with little effort. Any other SCM "bonus features" were decided to be of secondary importance (at least before a 1.0 release).

At the time of writing, the original team has been coding for a little over a year, and we have a number of excellent volunteer contributors. (Subversion, like CVS, is an open-source project!)

Subversion's Features
----------------------

Here's a quick run-down of some of the reasons you should be excited about Subversion:

* Real copies and renames. The Subversion repository doesn't use RCS files at all; instead, it implements a 'virtual' versioned filesystem that tracks tree-structures over time (described below). Files *and* directories are versioned. At last, there are real client-side `mv' and `cp' commands that behave just as you think.

* Atomic commits. A commit either goes into the repository completely, or not at all.

* Advanced network layer. The Subversion network server is Apache, and client and server speak WebDAV(2) to one another. (See the 'design' section below.)

* Faster network access. A binary diffing algorithm is used to store and transmit deltas in *both* directions, regardless of whether a file is of text or binary type.

* Filesystem "properties". Each file or directory has an invisible hashtable attached. You can invent and store any arbitrary key/value pairs you wish: owner, perms, icons, app-creator, mime-type, personal notes, etc. This is a general-purpose feature for users. Properties are versioned, just like file contents. And some properties are auto-detected, like the mime-type of a file (no more remembering to use the '-kb' switch!)

* Extensible and hackable. Subversion has no historical baggage; it was designed and then implemented as a collection of shared C libraries with well-defined APIs. This makes Subversion extremely maintainable and usable by other applications and languages.

* Easy migration. The Subversion command-line client is very similar to CVS; the development model is the same, so CVS users should have little trouble making the switch. Development of a 'cvs2svn' repository converter is in progress.

* It's Free. Subversion is released under an Apache/BSD-style open-source license.
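To make the "properties" idea concrete, here is a minimal sketch of a key/value table that is versioned alongside the item it belongs to. It is only an illustration: the class `VersionedNode` and its methods are hypothetical names invented for this example and are not part of Subversion's API.

```python
class VersionedNode:
    """Hypothetical file or directory whose properties are versioned."""

    def __init__(self):
        # One property table per version of this node; version 0 is empty.
        self._history = [{}]

    def set_props(self, changes):
        """Record a new version with the given keys added or overwritten."""
        new_props = dict(self._history[-1])  # copy, so old versions stay intact
        new_props.update(changes)
        self._history.append(new_props)
        return len(self._history) - 1  # index of the version just created

    def props(self, version=-1):
        """Return a copy of the property table at a version (default: latest)."""
        return dict(self._history[version])


node = VersionedNode()
v1 = node.set_props({"owner": "alice"})
v2 = node.set_props({"owner": "bob", "mime-type": "image/png"})
```

Asking for `node.props(v1)` still returns the table as it stood at that version, which is the sense in which properties are versioned "just like file contents".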

Subversion's Design
-------------------

Subversion has a modular design; it's implemented as a collection of C libraries. Each layer has a well-defined purpose and interface. In general, code flow begins at the top of the diagram and flows "downward" -- each layer provides an interface to the layer above it.

Let's take a short tour of these layers, starting at the bottom.

--> The Subversion filesystem.

The Subversion Filesystem is *not* a kernel-level filesystem that one would install in an operating system (like the Linux ext2 fs.) Instead, it refers to the design of Subversion's repository. The repository is built on top of a database -- currently Berkeley DB -- and thus is a collection of .db files. However, a library accesses these files and exports a C API that simulates a filesystem -- specifically, a "versioned" filesystem.

This means that writing a program to access the repository is like writing against other filesystem APIs: you can open files and directories for reading and writing as usual. The main difference is that this particular filesystem never loses data when written to; old versions of files and directories are always saved as historical artifacts.

Whereas CVS's backend (RCS) stores revision numbers on a per-file basis, Subversion numbers entire trees. Each atomic 'commit' to the repository creates a completely new filesystem tree, and is individually labeled with a single, global revision number. Files and directories which have changed are rewritten (and older versions are backed up and stored as differences against the latest version), while unchanged entries are pointed to via a shared-storage mechanism. This is how the repository is able to version tree structures, not just file contents.
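The tree-per-revision idea can be sketched in a few lines. What follows is a toy model, not Subversion's implementation: the `Repository` class is a hypothetical name, directories are plain dictionaries, files are strings, and the delta storage of older versions is left out entirely. It shows only how each commit yields a new root labeled with a global revision number while unchanged subtrees are shared rather than copied.

```python
class Repository:
    """Toy versioned tree: dicts are directories, strings are files."""

    def __init__(self):
        self.revisions = [{}]  # revision 0 is an empty root directory

    def commit(self, path, content):
        """Write one file and return the new global revision number."""
        parts = path.strip("/").split("/")
        new_root = self._write(self.revisions[-1], parts, content)
        self.revisions.append(new_root)
        return len(self.revisions) - 1

    def _write(self, directory, parts, content):
        # Shallow-copy only the directories on the changed path; every
        # other entry keeps pointing at the object from the old revision.
        new_dir = dict(directory)
        name = parts[0]
        if len(parts) == 1:
            new_dir[name] = content
        else:
            child = directory.get(name, {})
            if not isinstance(child, dict):
                raise ValueError(f"{name!r} is a file, not a directory")
            new_dir[name] = self._write(child, parts[1:], content)
        return new_dir

    def read(self, revision, path):
        """Return the contents of a file as of the given revision."""
        node = self.revisions[revision]
        for name in path.strip("/").split("/"):
            node = node[name]
        return node


repo = Repository()
r1 = repo.commit("trunk/README", "hello")
r2 = repo.commit("docs/guide.txt", "guide")
r3 = repo.commit("trunk/README", "hello, world")
```

After these three commits, `repo.read(r1, "trunk/README")` still returns the original text, and the `trunk` directory in revision 2 is the very same object as in revision 1, because the second commit touched only `docs`.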

Finally, it should be mentioned that using a database like Berkeley DB immediately provides other nice features that Subversion needs: data integrity, atomic writes, recoverability, and hot backups. (See www.sleepycat.com for more information.)

--> The network layer.

Subversion has the mark of Apache all over it. At its very core, the client uses the Apache Portable Runtime (APR) library. (In fact, this means that the Subversion client should compile and run anywhere Apache httpd does -- right now, this list includes all flavors of Unix, Win32, BeOS, OS/2, Mac OS X, and possibly Netware.)

However, Subversion depends on more than just APR -- the Subversion "server" is Apache httpd itself.

Why was Apache chosen? Ultimately, the decision was about not reinventing the wheel. Apache is a time-tested, open-source server process that is ready for serious use, yet is still extensible. It can sustain a high network load. It runs on many platforms and can operate through firewalls. It's able to use a number of different authentication protocols. It can do network pipelining and caching. By using Apache as a server, Subversion gets all these features for free. Why start from scratch?

Subversion uses WebDAV as its network protocol. DAV (Distributed Authoring and Versioning) is a whole discussion in itself (see www.webdav.org) -- but in short, it's an extension to HTTP that allows reads/writes and "versioning" of files over the web. The Subversion project is hoping to ride a slowly rising tide of support for this protocol: all of the latest file-browsers for Win32, MacOS, and GNOME speak this protocol already. Interoperability will (hopefully) become more and more of a bonus over time.

For users who simply wish to access Subversion repositories on local disk, the client can do this too; no network is required. The "Repository Access" layer (RA) is an abstract API implemented by both the DAV and local-access RA libraries. This is a specific benefit of writing a "librarized" version control system; it's a big win over CVS, which has two very different, difficult-to-maintain codepaths for local vs. network repository-access. Feel like writing a new network protocol for Subversion? Just write a new library that implements the RA API!
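As a rough illustration of why an abstract access layer helps, the sketch below defines one interface and two interchangeable implementations. Everything in it is an assumption made for the example: the names `RepositoryAccess`, `LocalAccess`, `LoopbackAccess`, `make_server` and `cat` are hypothetical, the "protocol" is a made-up text line rather than WebDAV, and no real network I/O takes place.

```python
from abc import ABC, abstractmethod


class RepositoryAccess(ABC):
    """Hypothetical stand-in for an abstract repository-access API."""

    @abstractmethod
    def get_file(self, path, revision):
        """Return the contents of `path` as of `revision`."""


class LocalAccess(RepositoryAccess):
    """Reads straight from an in-memory 'repository'; no network involved."""

    def __init__(self, revisions):
        self._revisions = revisions  # list of {path: contents}, one per revision

    def get_file(self, path, revision):
        return self._revisions[revision][path]


class LoopbackAccess(RepositoryAccess):
    """Pretends to use a wire protocol: each request travels as a text line."""

    def __init__(self, server):
        self._server = server  # callable: request line in, file contents out

    def get_file(self, path, revision):
        return self._server(f"GET {revision} {path}")


def make_server(revisions):
    """Build a toy 'server' that answers the text requests above."""
    backend = LocalAccess(revisions)

    def handle(request):
        verb, revision, path = request.split(" ", 2)
        if verb != "GET":
            raise ValueError(f"unsupported request: {request!r}")
        return backend.get_file(path, int(revision))

    return handle


def cat(ra, path, revision):
    """Client code depends only on the abstract interface."""
    return ra.get_file(path, revision)


revisions = [{}, {"README": "hello"}, {"README": "hello, world"}]
local = LocalAccess(revisions)
remote = LoopbackAccess(make_server(revisions))
```

The `cat` function never learns which implementation it was handed, so `cat(local, "README", 2)` and `cat(remote, "README", 2)` return the same text. Supporting another transport means adding one more subclass, with no change to the callers.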

--> The client libraries.

On the client side, the Subversion "working copy" library maintains administrative information within special SVN/ subdirectories, similar in purpose to the CVS/ administrative directories found in CVS working copies.

A glance inside the typical SVN/ directory turns up a bit more than usual, however. The `entries' file contains XML which describes the current state of the working copy directory (and which basically serves the purposes of CVS's Entries, Root, and Repository files combined). But other items present (and not found in CVS/) include storage locations for the versioned "properties" (the metadata mentioned in 'Subversion's Features' above) and private caches of pristine versions of each file. This latter feature provides the ability to report local modifications -- and do reversions -- *without* network access. Authentication data is also stored within SVN/, rather than in a single .cvspass-like file.
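The pristine-copy idea is easy to sketch. The code below is an illustration under stated assumptions, not the working-copy library itself: the helper names `checkout_file`, `is_modified` and `revert` are hypothetical, and the `SVN/pristine/` layout is invented for the example (the article does not say how the cache is organized).

```python
import tempfile
from pathlib import Path

ADMIN_DIR = "SVN"          # directory name taken from the article
PRISTINE_DIR = "pristine"  # hypothetical subdirectory, assumed for this sketch


def _pristine_path(workdir, name):
    return Path(workdir) / ADMIN_DIR / PRISTINE_DIR / name


def checkout_file(workdir, name, contents):
    """Write a working file plus an untouched copy in the admin area."""
    pristine = _pristine_path(workdir, name)
    pristine.parent.mkdir(parents=True, exist_ok=True)
    pristine.write_text(contents)
    (Path(workdir) / name).write_text(contents)


def is_modified(workdir, name):
    """Detect local edits by comparing against the pristine copy only."""
    working = (Path(workdir) / name).read_text()
    return working != _pristine_path(workdir, name).read_text()


def revert(workdir, name):
    """Discard local edits by restoring the pristine copy."""
    (Path(workdir) / name).write_text(_pristine_path(workdir, name).read_text())


with tempfile.TemporaryDirectory() as tmp:
    checkout_file(tmp, "hello.c", "int main(void) { return 0; }\n")
    clean_before = is_modified(tmp, "hello.c")
    (Path(tmp) / "hello.c").write_text("int main(void) { return 1; }\n")
    dirty = is_modified(tmp, "hello.c")
    revert(tmp, "hello.c")
    clean_after = is_modified(tmp, "hello.c")
```

Neither `is_modified` nor `revert` contacts a server; both consult only the cached copy. The cost of this design is roughly double the disk space for the working copy.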

The Subversion "client" library has the broadest responsibility; its job is to mingle the functionality of the working-copy library with that of the repository-access library, and then to provide a highest-level API to any application that wishes to perform general version control actions.

For example: the C routine `svn_client_checkout()' takes a URL as an argument. It passes this URL to the repository-access library and opens an authenticated session with a particular repository. It then asks the repository for a certain tree, and sends this tree into the working-copy library, which then writes a full working copy to disk (SVN/ directories and all.)

The client library is designed to be used by any application. While the Subversion source code includes a standard command-line client, it should be very easy to write any number of GUI clients on top of the client library. Hopefully, these GUIs should someday prove to be much better than the current crop of CVS GUI applications (the majority of which are no more than fragile "wrappers" around the CVS command-line client.)

In addition, proper SWIG bindings (www.swig.org) should make the Subversion API available to any number of languages: Java, Perl, Python, Guile, and so on. In order to Subvert CVS, it helps to be ubiquitous!

Subversion's Future
-------------------

The release of Subversion 1.0 is currently planned for early 2002. After the release of 1.0, Subversion is slated for additions such as i18n support, "intelligent" merging, better "changeset" manipulation, client-side plugins, and improved features for server administration. (Also on the wishlist is an eclectic collection of ideas, such as distributed, replicating repositories.)

A final thought from Subversion's FAQ:

"We aren't (yet) attempting to break new ground in SCM systems, nor are we attempting to imitate all the best features of every SCM system out there. We're trying to replace CVS."

If, in three years, Subversion is widely presumed to be the "standard" SCM system in the open-source community, then the project will have succeeded. But the future is still hazy: ultimately, Subversion will have to win this position on its own technical merits.

Patches are welcome.

For More Information
--------------------

Please visit the Subversion project website at http://subversion.tigris.org. There are discussion lists to join, and the source code is available via anonymous CVS -- and soon through Subversion itself.