From SVN to Git

Note / disclaimer: these slides may be shared as a PDF for the sake of interoperability, but they are intended to be seen as an animation. It is recommended to disable the “Continuous View” feature of your PDF reader so you can see one slide at a time.

So you are familiar with SVN (or maybe even CVS)

and want to learn about Git?

Very good!

First thing you need to know is that

SVN is a centralised version control system

(so are CVS and others)

while

Git is a distributed version control system

(so are Mercurial and others)

The second thing you need to know is that

● centralised is
  – older
  – more common (for now)
● distributed is
  – newer
  – better

It is easy to verify that centralised is more common

and that distributed is more modern, so we will not talk about that.

We will explain why distributed is also better in a second,

but there is a third thing you must know

just in case you are looking at these slides trying to figure out whether to learn one or the other.

Learning one is as easy as learning the other as long as

you do not know anything of either.

If you already know about distributed version control systems, changing to centralised is easy. The opposite is not true.

So if you have never used a version control system, close these slides, learn about Git, and move on.

But you, reader, who are already familiar with SVN,

please continue.

Let me convince you to change to Git

(or Mercurial, or any distributed VCS)

and never look back.

Ready?

Let's start by looking at how SVN works

In SVN there is a server...

...and several clients

(in between, the network)

The server is the centralised source of truth

The source code is initialised at the server

and then clients can get it.

svn checkout

After making a change, clients commit to the server

svn commit

Other clients can check out too...

svn checkout

If the server has newer changes that a client does not have yet,

that client must update before committing.

svn commit: FAIL

svn update

svn commit: SUCCESS

At update time, if the code on the server cannot be merged automatically with the client code,

there is a conflict

svn update: CONFLICT

that must be resolved before committing.

svn commit: FAIL because CONFLICT

This apparently simple explanation of SVN is enough to highlight its three main limitations.

● SVN...
  – ...is slow
  – ...cannot work offline
  – ...does not scale
● And on top of all that
  – ...branches are a nightmare

SLOW: evident, every commit goes through the network

svn commit

● Being slow is not only a nuisance
● It is a powerful, unconscious incentive to use version control sparingly instead of profusely (as one should)
● “I just changed a line, there is no need to commit yet”
● “I will fix this other thing first and then commit all”
● Wrong: commits should be atomic
● Remember that one of the main uses of version control is finding bugs retroactively. The smaller the commits, the better.

ONLY ONLINE: evident, everything goes through the network

svn commit
svn update -r 32144

● But the network is not always there!
  – Planes, mountain retreats, 3G/4G failures, airports, bad hotels...
● What do you do when you do not have network connectivity?
  – Do you stop working?
  – Or do you stop using version control and then make one huge 200-line commit fixing five bugs and adding two features?
  – Both options are wrong!

NO SCALABILITY: this is not evident...

svn commit
svn update

...but we have seen that every time there is a commit on the system...

...there is a chance of creating a conflict with someone else.

svn update: CONFLICT

This is quite annoying and breaks the flow of work.

In a project of 3 programmers, conflicts happen sometimes.

In a project of 30 programmers, conflicts happen often.

In a project of 300 programmers, conflicts happen all the time to everybody.

Additionally, creating and merging branches in SVN is a nightmare.

Branches are important in projects where not all the code should be visible all the time:

● experimental features
● code that is not mature yet
● different features for different clients

All of that is quite difficult in SVN.

Let's see how Git works to solve all these problems!

Git is not centralised but distributed

This means that there is not one server...

...but many

Actually, there is one per computer.

In Git, every computer is a server

and version control happens locally...

● ...which means version control...
  – is fast,
  – works offline,
  – and scales!
● It also means that setting up a new Git repo is trivial!
● To start working with SVN you need to set up a server or have access to one. In Git you just say 'git init' and you are ready to go!
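As a rough illustration of the 'git init' point, the following sketch creates a repository in a scratch directory and records two commits without contacting any server. The directory name `demo`, the file name and the identity values are placeholders chosen for this example.

```shell
set -e
work=$(mktemp -d)          # scratch directory (placeholder location)
cd "$work"

git init -q demo           # create the repository; no server is involved
cd demo
git config user.name  "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo "hello" > README
git add README
git commit -q -m "Add README"              # local commit, no network access

echo "world" >> README
git commit -q -a -m "Extend README"        # -a stages tracked files that changed

git log --oneline          # lists the two commits stored in the local repo
```

Everything above lives inside the `demo` directory; deleting that directory removes the whole repository.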

FAST: commits are local, so they are practically instantaneous and cheap

git commit

OFFLINE: operations are local, no network needed

(only for sharing with others... we will come to that in a second)

git commit
git branch
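The two local commands named above, 'git commit' and 'git branch', can be tried with the network unplugged. This is only a sketch: the branch name `experiment`, the file name and the identity values are made up for the example, and the initial branch name is read from Git rather than assumed.

```shell
set -e
work=$(mktemp -d)          # scratch directory (placeholder location)
cd "$work"
git init -q project
cd project
git config user.name  "Demo User"          # placeholder identity for the demo
git config user.email "demo@example.com"

echo "stable" > feature.txt
git add feature.txt
git commit -q -m "Stable baseline"

base=$(git symbolic-ref --short HEAD)      # name of the initial branch

git branch experiment                      # create a branch, purely locally
git checkout -q experiment                 # switch to it
echo "experimental" >> feature.txt
git commit -q -a -m "Try an experimental change"

git checkout -q "$base"                    # go back to the initial branch
git merge -q experiment                    # bring the experiment in (fast-forward)

cat feature.txt
```

None of these steps touches a remote machine, which is why they keep working on a plane or in a bad hotel.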

Let's talk about sharing before we talk about scalability

How do you share your code with your project mates in Git

...if commits are local?

You pull their commits from them (and so do they with yours)

git pull

When you pull from someone, you do two things: you fetch their commits, and then you merge them with your code.

git pull = git fetch + git merge

Beware! “git fetch” only means “get all the new commits from that machine”; it does not create a working copy as “svn checkout” does. Git also has a “git checkout” command, but it does not mean the same as “svn checkout”: it switches your working files to another branch or commit and downloads nothing.

Most of the time, most people just pull (instead of first fetching and then merging)
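To make the two steps behind a pull concrete, here is a small sketch with two repositories on the same disk. In Git's own command names the first step is 'git fetch' and the second is 'git merge'. The names `alice` and `bob`, the file name and the identity values are placeholders for this example.

```shell
set -e
work=$(mktemp -d)          # scratch directory (placeholder location)
cd "$work"
# Placeholder identity, passed through the environment for every commit below.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q alice
(cd alice && echo one > notes.txt && git add notes.txt && git commit -q -m "First note")

git clone -q alice bob     # bob starts from a full copy of alice's history

# alice keeps working after bob has cloned
(cd alice && echo two >> notes.txt && git commit -q -a -m "Second note")

cd bob
git fetch -q origin        # step 1: download alice's new commits
git merge -q '@{u}'        # step 2: merge the branch bob's branch tracks
cat notes.txt              # now contains both lines
```

Running 'git pull' in `bob` instead of the last two Git commands would have the same effect in this simple case.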

This distributed approach works well, and scales perfectly, but

there is still one small problem...

...how do I know where your machine is? Do I need to remember your IP address or DNS name (if you have one)?

That is why most people use a public repository where they publish their changes.

(GitHub is just one of the best-known hosts for public repositories)

In the same way that you can pull from someone else's repository...

...you can push your local changes into a remote repository (as long as it is yours, i.e. you have permission)

and then others can pull from that public repository of yours, because it is at a well-known location.
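The push-then-pull arrangement can be sketched locally by letting a bare repository stand in for the public one. Here `hub.git` plays the role of a hosted repository such as one on GitHub, and `dev1` and `dev2` are two programmers; all of these names, the file name and the identity values are assumptions made for the example.

```shell
set -e
work=$(mktemp -d)          # scratch directory (placeholder location)
cd "$work"
# Placeholder identity, passed through the environment for every commit below.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q --bare hub.git # the "well-known location" (no working files)

git clone -q hub.git dev1  # warns that the repository is empty; that is fine
(cd dev1 && echo "v1" > app.txt && git add app.txt \
  && git commit -q -m "Initial version" \
  && git push -q origin HEAD)             # publish the current branch

git clone -q hub.git dev2  # a second programmer gets the published history

(cd dev1 && echo "v2" > app.txt \
  && git commit -q -a -m "Second version" \
  && git push -q origin HEAD)

(cd dev2 && git pull -q --ff-only)        # dev2 picks up dev1's published work
cat dev2/app.txt
```

Note that `dev2` never needs to know where `dev1` lives; both only know the location of `hub.git`.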

So the usual picture looks like this:

[Diagram: several programmers around GitHub; each one runs git commit locally, git push to publish, and git pull to get the work of the people they trust]

Note that the first programmer does not pull from the fourth;

maybe they do not know each other, or they do not trust each other,

but maybe programmer 2 trusts programmer 4

and programmer 1 trusts programmer 2 to get things right,

solving all conflicts that programmers untrusted by programmer 1 may cause.

Its distributed nature is what makes Git so scalable.

In SVN you must trust everybody

because everybody can mess things up for everybody else (e.g. by committing code but forgetting to add a new file, and then nothing compiles)

In Git you only need to trust your “selected few”: the few people you pull from.

There are thousands of people working on the Linux kernel, but Linus Torvalds only pulls from a small group of them.

If someone messes up, only those in their circle are affected, and the issue can be fixed before it spreads to the whole community.

● The only way of doing this in SVN is by not using SVN

– Some people have access to the server and some do not

– Code reviews are performed out-of-the-system before committing is allowed

– Political battles for access to the server

– etc

In Git, everything is under version control all the time.

Nothing is ever lost.

I will repeat this because it is important: “Nothing is ever lost”

In Git every single machine (potentially) has the whole history of the project.

This is very strong protection against data loss

Your machine burnt in a fire? Stolen? Eaten by your dog?

No problem! Just clone from somebody and you have everything again! (except those things you had not pushed today, of course)
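The recovery scenario can be rehearsed safely in a scratch directory. In this sketch `mine` is the repository that gets lost and `colleague` is any other clone of it; both names, the file name and the identity values are placeholders for the example.

```shell
set -e
work=$(mktemp -d)          # scratch directory (placeholder location)
cd "$work"
# Placeholder identity, passed through the environment for every commit below.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

git init -q mine
(
  cd mine
  for n in 1 2 3; do
    echo "change $n" >> log.txt
    git add log.txt
    git commit -q -m "Change $n"
  done
)

git clone -q mine colleague    # somebody else holds a full copy

rm -rf mine                    # the "fire": the original repository is gone

git clone -q colleague mine    # recover by cloning from the colleague
git -C mine rev-list --count HEAD   # prints 3: the full history is back
```

Commits made in `mine` after the colleague's last clone or pull would not be recovered, which is the exception the slide mentions.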

In centralised systems like SVN the server is a single point of failure. If you lose the server (and have no backup), you lose all your history, your branches, all your version control.

Sure, having the whole history on every single repo uses a lot of disk space!

But when was the last time you filled up a hard disk by writing code?

(not downloading movies) ;-)

In summary...

● Distributed version control is great because:
  – It is fast
  – It is local (no network)
  – It scales seamlessly
  – It is safer
● And if that were not enough to never look at SVN again, Git also:
  – Makes creating, managing, and merging branches very easy
  – Makes repository creation trivial

Quick SVN->Git translation

Let's quickly see the commands for the most common operations in SVN and Git

● Getting a copy of a repository
  – In SVN, 'svn checkout'
  – In Git, 'git clone'
  – No big differences here

● Committing new code
  – In SVN, 'svn commit'
  – In Git, 'git commit'
  – But we have seen that Git's commit is
    ● local
    ● very fast

● Committing new code publicly
  – In SVN, 'svn commit'
  – In Git, 'git push'
  – Commits are local in Git, and that (usually) means not visible to others
  – In order to publish them to other team members, they have to be pushed to a well-known location
  – Pushing is slow, like 'svn commit', but it typically only happens once or twice per day.
  – In SVN, your commit reaches everyone. In Git it reaches only people who trust you enough to pull from you.

● Getting code written by others
  – In SVN, 'svn update'
  – In Git, 'git pull'
  – No big differences here either, but
    ● in SVN, you always update from the server
    ● in Git, you can pull from different people (from their machines, from their GitHub accounts, or similar)

● Most other commands have similar syntax to SVN's
  – There are also a lot of new commands, but clone + commit + push + pull will be 95% of your use of Git
  – Remember: 'git help <command>' is always your friend

Happy hacking!
