Introduction to git

These are the slides used in http://vimeo.com/35778382

Git: a brief introduction

Randal L. Schwartz, merlyn@stonehenge.com
Version 4.0.6 on 5 Jan 2012

This document is copyright 2011, 2012 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
This work is licensed under Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License

http://creativecommons.org/licenses/by-nc-sa/3.0/

1Monday, February 6, 12

About me

• Been tracking git since it was created
• Used git on small projects
• Used other systems on small and large projects
• Read a lot of people talk about git on the mailing list
• Provided some patches to git, and suggestions for user interface changes
• Worked on small and medium teams with git
  • But not large ones

What is git?

• Git manages changes to a tree of files over time
• Git is optimized for:
  • Distributed development
  • Large file counts
  • Complex merges
  • Making trial branches
  • Being very fast
  • Being robust

But not for...

• Tracking file permissions and ownership
• Tracking individual files with separate history
• Making things painful

Why git?

• Essential to Linux kernel development
• Created as a replacement when BitKeeper suddenly became “unavailable”
• Now used by thousands of projects
• Everybody has a “commit bit”

Everyone can...

• Clone the tree
• Make and test local changes
• Submit the changes as patches via mail
• OR submit them as a published repository
• Track the upstream to revise if needed

How does git do it?

• Universal public identifiers
  • None of the SVK “my @245 is your @992”
• Multi-protocol transport: HTTP, SSH, GIT
• Efficient object storage
  • Everyone has entire repo (disk is cheap)
• Easy branching and merging
  • Common ancestors are computable
• Patches (and repo updates) can be transported or mailed
  • Binary “patches” are supported

The SHA1 is King

• Every “object” has a SHA1 to uniquely identify it
• “objects” consist of:
  • Blobs (the contents of a file)
  • Trees (directories of blobs or other trees)
  • Commits:
    • A tree
    • Plus zero or more parent commits
    • Plus a message about why
  • And tags
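
To make the "every object has a SHA1" point concrete, here is a minimal sketch that stores one blob in a scratch repository and reads it back. The scratch directory and the variable names are only for illustration, and it assumes the default SHA1 object format.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q

# Store a blob. Its name is the SHA1 of a short header plus the contents.
blob=$(echo "hello" | git hash-object -w --stdin)
echo "$blob"

# Identical contents always get the identical name (nothing is written here).
again=$(echo "hello" | git hash-object --stdin)

# Ask git what kind of object this is, then print its contents back.
kind=$(git cat-file -t "$blob")
body=$(git cat-file -p "$blob")
echo "$kind: $body"
```

Because the name depends only on the contents, two repositories that hold the same file agree on its identifier without ever talking to each other.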

Tags

• An object (usually a commit)
• Plus an optional subject (if anything else is given)
• Plus an optional payload to sign it off
• Plus an optional gpg signature
• Designed to be immobile
  • Changes not tracked during cloning
  • Use a branch if you want to move around

Objects live in the repo

• Git efficiently creates new objects
• Objects are generally added, not destroyed
• Unreferenced objects will garbage collect
• Objects start “loose”, but can be “packed”
• “Packs” represent objects as deltas
• “Packs” are also created for repo transfer

Commits rule the repo

• One or more commits form the head of object chains
• Typically one head called “master”
• Others can be made at will (“branches”)
• Usually one commit in the repo that has no parent commit (“root” commit)

Reaching out

• From a commit, reaching the components:
  • Chase down the tree object to get to directories and files as they existed at this commit time
  • Chase down the parent objects to get to earlier commits and their respective trees
• Do this recursively, and you have all of history
  • And the SHA1 depends on all of that!
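
The chase from a commit to its tree and its parent can be followed by hand. This is a rough sketch in a scratch repository with two commits; the file name, user identity, and messages are made up for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo one >file;  git add file; git commit -q -m first
echo two >>file; git commit -q -a -m second

# A commit object names one tree and its parent commit(s), then the message.
git cat-file -p HEAD

tree=$(git rev-parse 'HEAD^{tree}')   # the tree this commit points at
parent=$(git rev-parse 'HEAD^')       # the first parent commit

git ls-tree "$tree"          # files and directories as of this commit
git cat-file -p "$parent"    # the earlier commit; a root commit has no parent line
```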

The git repo

• A “working tree” has a “.git” dir at the top level
• Unlike CVS, SVN: no pollution of deeper directories
• This makes it friendly to recursive greps

The .git dir contains:

• config – Configuration file (.ini style)
• objects/* – The object repository
• refs/heads/* – branches (like “master”)
• refs/tags/* - tags
• logs/* - logs
• refs/remotes/* - tracking others
• index – the “index cache” (described shortly)
• HEAD – points to one of the branches (the “current branch”, where commits go)

The index (or “cache”)

• A directory of blob objects
• Represents the “next commit”
• “Add files” to put current contents in
• “Commit” takes the current index and makes it a real commit object
• Diff between HEAD and index:
  • changed things not yet committed
• Diff between index and working dir:
  • changed things not yet added
  • untracked things

What’s in a name?

• Git doesn’t record explicit renaming
  • Nor expect you to declare it
• Exact renaming determined by SHA1
• Copy-paste-edits detected by similarity
• Computer better than you at that
• Explicit tracking will be wrong sometimes
• Being wrong breaks merges
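
A small sketch of rename detection at work: the file is moved with plain mv, so git is never told about the rename, yet diff pairs the paths up by content. The file names and identity are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

printf 'line %s\n' 1 2 3 4 5 >old.txt
git add old.txt
git commit -q -m "add old.txt"

# Rename behind git's back, then stage the removal and the new path.
mv old.txt new.txt
git add -A

# -M asks diff to match removed and added paths by content similarity.
status=$(git diff --cached -M --name-status)
echo "$status"
```

The leading R and its score in the output come from comparing contents at diff time, not from anything recorded in the repository.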

Git speaks and listens

• Many protocols to transfer between repos
  • rsync, http, https, git, ssh, local files
• In the core, git also has:
  • import/export with CVS, SVN
• I use CVS/SVN import to have entire history of a project at 30K feet
• Third party solutions handle others
• Git core also includes cvs-server
  • A git repository can act like a CVS repository for legacy clients or humans

Getting git

• Get the latest “git-*.tar.gz” from code.google.com/p/git-core
• RPMs and Debian packages also exist
• Track the git-developer archive:
  • git clone git://git.kernel.org/pub/scm/git/git.git
• Maintenance releases are very stable
• I install mine “prefix=/opt/git”
  • add /opt/git/bin to PATH

Git commands

• All git commands start with “git”
• “git MUMBLE-FOO bar” has also been written as “git-MUMBLE-FOO bar”
• This allows a single entry “git” to be added to the /usr/local/bin path
• This works for internal calls as well
• Manpages are still under “git-MUMBLE-FOO”
• Unless you use “git help MUMBLE-FOO”
• Or “git MUMBLE-FOO --help”

Porcelain and plumbing

• Low-level git operations are called “plumbing”
• Higher level actions are called “porcelain”
• The git distro includes both
• Use porcelain from command line
  • But don’t script with it
  • Future releases might change things
• Use plumbing for scripts
  • Intended to be upward compatible
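
As one possible illustration of scripting with plumbing, this sketch answers "which commit, which branch, and is the tree dirty?" without parsing any porcelain output. The choice of these three plumbing commands is mine, not something the slides prescribe.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo content >file; git add file; git commit -q -m initial

# Resolve the current commit to a full object name.
head=$(git rev-parse --verify HEAD)

# Read the current branch from the symbolic ref rather than from "git status".
branch=$(git symbolic-ref --short HEAD)

# Refresh cached stat info, then let the exit status say clean or dirty.
git update-index -q --refresh
if git diff-index --quiet HEAD --; then state=clean; else state=dirty; fi
echo "$branch $head $state"

# Change a tracked file and ask again.
echo more >>file
if git diff-index --quiet HEAD --; then state2=clean; else state2=dirty; fi
echo "$state2"
```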

Creating a repo

• git init
  • Creates a .git in the current dir
• Optional: edit .gitignore
• “git add .” to add all files (except .git!)
• Then “git commit” for the initial commit
  • Creates current branch named “master”
• Could also do this on a tarball
  • tar xvfz some-tarball.tgz; cd some-tarball
  • git init
  • git add .

Cloning

• Creates a git repo from an existing repo
• Generally creates a subdirectory
  • Your workfiles and .git are in there
• Remote branches are “tracked”
• Remote “HEAD” branch checked out as your initial “master” branch as well
• Clone repo identified as “origin”
  • But the name is otherwise unspecial

Committing

• Your work product is more commits
• These are always on a “branch”
• A branch is just a named commit
• When you commit, the former branch head becomes the parent
• The branch head moves to be the new commit
• Thus, you’re creating a directed acyclic graph
  • ... rooted in branch heads
• A merge is just a commit with multiple parents

Typical work flow

• Edit edit edit
• git add files/you/have changed/now
  • This adds the files to the index
  • “git add .” for adding all interesting files
• git status
  • Tells you differences between HEAD, index, and working directory

Making the commit

• “git commit”
• Popped into a text editor (or “-m msg”)
• First text line used for “short logs”
• Current branch is moved forward
• And you’re back to more editing

But which branch?

• Git encourages branching
  • A branch is just 41 text bytes!
• Typical work flow:
  • Think of something to do
  • git checkout -b topic-name master
  • work work work, commit to topic-name
• When your thing is done:
  • git checkout master
  • git merge topic-name
  • git branch -d topic-name
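
The topic-branch cycle above can be sketched end to end in a scratch repository. This assumes the default SHA1 object format and loose ref files under .git/refs, which is where the 41 bytes come from (40 hex digits plus a newline); the symbolic-ref line only forces the branch name used on the slides.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"             # scratch directory, illustration only
git init -q
git symbolic-ref HEAD refs/heads/master     # use the slides' branch name
git config user.name "Example User"
git config user.email "user@example.com"
echo base >file; git add file; git commit -q -m base

# A loose branch ref is a tiny text file naming one commit.
ref_bytes=$(wc -c <.git/refs/heads/master)
echo "$ref_bytes"

# Think of something to do, and do it on a topic branch.
git checkout -q -b topic-name master
echo work >>file; git commit -q -a -m "work on topic"

# When the thing is done, merge it and drop the branch name.
git checkout -q master
git merge -q topic-name        # fast forward, since master has not moved
git branch -q -d topic-name
branches=$(git for-each-ref --format='%(refname:short)' refs/heads)
echo "$branches"
```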

Working in parallel

• You can have multiple topics active:
  • git checkout -b topic1 master
  • work work; commit; work work; commit
  • git checkout -b topic2 master
  • work work work; commit
  • git checkout topic1; work work; commit
• Decide how to bring them together
  • Merge: parallel histories
  • Rebase: serial histories
  • Each has pros and cons

The merge

• git checkout master
• git merge topic1; git branch -d topic1
  • This should be a trivial (“fast forward”) merge
• git merge topic2
• Conflicts may arise:
  • overlapping changes in text edits
  • files renamed two different ways
• You need to resolve, and continue:
  • git commit -a (describe the merge fix here)

The rebase

• Rewrites commits
  • Breaks SHA1s: commits are lost!
  • Don’t rebase if you’ve published commits!
• git checkout topic2; git rebase master
  • topic2’s commits rewritten on top of master
• May result in merge conflicts:
  • git rebase --continue or --abort or --skip
• git rebase -i (interactive) is helpful
• When rebased, merge is a fast forward:
  • git checkout master; git merge topic2
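
Here is a minimal sketch of why a rebase "breaks SHA1s": the same change gets a new parent, so it becomes a new commit with a new name. File names, identity, and messages are made up; the conflict-free case is chosen on purpose.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"             # scratch directory, illustration only
git init -q
git symbolic-ref HEAD refs/heads/master     # use the slides' branch name
git config user.name "Example User"
git config user.email "user@example.com"
echo base >base.txt; git add base.txt; git commit -q -m base

# One commit on topic2...
git checkout -q -b topic2 master
echo topic >topic.txt; git add topic.txt; git commit -q -m "topic work"
before=$(git rev-parse topic2)

# ...while master moves on independently.
git checkout -q master
echo more >>base.txt; git commit -q -a -m "master moves on"
tip=$(git rev-parse master)

# Rewrite topic2's commit on top of the new master.
git checkout -q topic2
git rebase -q master
after=$(git rev-parse topic2)
echo "$before -> $after"

# The merge back is now a plain fast forward.
git checkout -q master
git merge -q topic2
```

Anyone who had already fetched the old commit would still hold the name in $before, which is the reason for the warning about published commits.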

Read the history

• git log
  • print the changes
• git log -p
  • print the changes, including a diff between revisions
• git log --stat
  • Summarize the changes with a diffstat
• git log -- file1 file2 dir3
  • Show changes only for listed files or subdirs

What’s the difference?

• git diff
  • Diff between index and working tree
  • These are things you should “git add”
  • “git commit -a” will also make this list empty
• git diff HEAD
  • Difference between HEAD and working tree
  • “git commit -a” will make this empty
• git diff --cached
  • between HEAD and index
  • “git commit” (without -a) makes this empty
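
The three comparisons are easiest to tell apart when one change is staged and another is not. A small sketch, with made-up file contents; the grep only pulls the added lines out of each diff.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo one >hello; git add hello; git commit -q -m initial

echo two >>hello; git add hello    # a change that is staged in the index
echo three >>hello                 # a further change left unstaged

wt=$(git diff | grep '^+[a-z]')             # index vs working tree
idx=$(git diff --cached | grep '^+[a-z]')   # HEAD vs index
all=$(git diff HEAD | grep '^+[a-z]')       # HEAD vs working tree

echo "git diff:          $wt"
echo "git diff --cached: $idx"
echo "git diff HEAD:     $all"
```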

Other diffs

• git diff OTHERBRANCH
  • Other branch and working tree
• git diff BRANCH1 BRANCH2
  • Difference between two branch heads
• git diff BRANCH1...BRANCH2
  • changes only on branch2 relative to common ancestor
• git diff --stat (other options)
  • Nice summary of changes
• git diff --dirstat (other options)
  • Summarize directory changes

Barking up the tree

• Most commands take “tree-ish” args
• SHA1 picks something absolutely
  • Can be abbreviated if not ambiguous
• HEAD, some-branch-name, some-tag-name, some-origin-name
• Optionally followed by @{historical}
• “historical” can be:
  • yesterday, 2011-11-22, etc (date ref)
  • 1, 2, 3, etc (prior version of this ref)
  • “upstream” (upstream version of local)

Meet the parents

• Any of those on the prior slide, followed by:
  • ^n - “the n-th parent of an item” (default 1)
  • ~n - n ^1’s (so ~3 is ^1^1^1)
  • :path - pick the object from the tree

Tree Examples

• git diff HEAD^ HEAD
  • most recent change on current branch
  • Also: git diff HEAD~ HEAD
• git diff HEAD~3 HEAD
  • What damage did last three edits do?
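
To check that the ~n and ^n spellings really name the same commit, one can build a short linear history and resolve each form. This sketch uses four throwaway commits; the loop and file name are only for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Four commits, each appending one line to the same file.
for n in 1 2 3 4; do
  echo "$n" >>file
  git add file
  git commit -q -m "edit $n"
done

# Three ways to say "three first-parents back".
a=$(git rev-parse 'HEAD~3')
b=$(git rev-parse 'HEAD^^^')
c=$(git rev-parse 'HEAD^1^1^1')
echo "$a $b $c"

# What did the last three edits do?
git diff --stat 'HEAD~3' HEAD

# :path picks the file as it was in that commit's tree.
first=$(git show 'HEAD~3:file')
echo "$first"
```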

Seeing the changes

• gitk mytopic origin
  • Tk widget display of history
  • Shows changes back to common ancestor
• gitk --all
  • show everything
• gitk from..to
  • Just the changes in “to” that aren’t in “from”
• git show-branch from..to
  • Same thing for the Tk-challenged

Playing well with others

• git clone creates “tracking” branches
  • Typically named “origin/master” etc
• To share your work, first get up to date:
  • git fetch origin
• Now rebase your changes on upstream:
  • git rebase origin/master
• Or fetch/rebase in one step
  • git pull --rebase
• To push upstream:
  • git push

Resetting

• git reset --soft
  • Makes all files “updated but not checked in”
• git reset --hard # DANGER
  • Forces working dir to look like last commit
• git reset --hard HEAD~3
  • Tosses most recent 3 commits
  • use “git revert” instead if you’ve published
• git checkout HEAD some/lost/file
  • Recover the version of some/lost/file from the last commit
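
A short sketch of two of these in a scratch repository: recovering a deleted file from the last commit, then tossing the most recent commit with a hard reset. It is safe here only because nothing has been published; file names and messages are invented.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo v1 >file; git add file; git commit -q -m v1
echo v2 >file; git commit -q -a -m v2

# Lose the file, then recover the version from the last commit.
rm file
git checkout -q HEAD file
recovered=$(cat file)
echo "$recovered"

# DANGER: toss the most recent commit and force the working dir to match.
git reset -q --hard 'HEAD~1'
after=$(cat file)
count=$(git rev-list --count HEAD)
echo "$after ($count commit left)"
```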

Ignoring things

• Every directory can contain a .gitignore
  • lines starting with “!” mean “not”
  • lines without “/” are checked against basename
  • otherwise, shell glob via fnmatch(3)
    • Leading / means “the current directory”
  • Checked into the repository and tracked
• Every repository can contain a .git/info/exclude
• Both of these work together
• But .git/info/exclude won’t be cloned
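
The pattern rules can be tried out with a three-line .gitignore. In this sketch the file and directory names are made up; the listing shows which untracked paths git still sees.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q

cat >.gitignore <<'EOF'
*.log
!keep.log
/build
EOF

mkdir sub build
touch debug.log keep.log sub/trace.log   # *.log has no slash: matched by basename
touch build/out                          # /build is anchored to this directory
touch sub/build                          # so a deeper "build" is not ignored

# Untracked paths that are NOT ignored.
listing=$(git status --porcelain --untracked-files=all)
echo "$listing"
```

Only keep.log (re-included by the “!” line), sub/build, and .gitignore itself should show up as untracked.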

Configuration

• Many commands have configurations
• git config name value
  • set name to value
  • name can contain periods for sub-items
• git config name
  • get current value
• git config --global name [value]
  • Same, but with ~/.gitconfig
  • This applies to all git repos from a user
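
A quick sketch of setting and reading values, kept to the repository-level config so that ~/.gitconfig is left alone. The values themselves are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q

# Set name to value (this repository only, stored in .git/config).
git config user.name "Example User"

# Periods separate the section, an optional sub-item, and the key.
git config branch.master.remote origin

# Get the current value back.
name=$(git config user.name)
remote=$(git config branch.master.remote)
echo "$name / $remote"

# The .ini-style file now has a [user] and a [branch "master"] section.
user_sections=$(grep -c '^\[user\]' .git/config)
branch_sections=$(grep -c '^\[branch "master"\]' .git/config)
cat .git/config
```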

The stash

• Creates temporary commits to represent:
  • current index (git add ...)
  • current working directory (git add .)
• Can rebase those onto new index later
• Many uses, such as pull into dirty workdir:
  • git stash; git pull ...; git stash pop
  • Might result in conflicts, of course
• Multiple stashes can be in play
  • “git stash list” to show them
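
A minimal stash round trip, without the pull in the middle: set the work aside, confirm the working directory is clean, then bring the work back. File contents are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo base >file; git add file; git commit -q -m base

# Uncommitted work in a tracked file.
echo "work in progress" >>file

# Set it aside; the working directory matches HEAD again.
git stash -q
clean=$(git status --porcelain)
stashes=$(git stash list | wc -l)
git stash list

# Later, reapply the saved change and drop the stash entry.
git stash pop -q
restored=$(tail -n 1 file)
echo "$restored"
```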

Other useful porcelain

• git archive: export a tree as a tar/zip
• git bisect: find the offensive commit
• git cherry-pick: selective merging
• git mv: rename a file/dir with the right index manipulations
• git rm: ditto for delete
• git push: write to an upstream
• git revert: add a commit that undoes a previous commit
• git blame: who wrote this?

Commit Advice

• Split changes into small logical steps
  • Ideally ones that pass the test suite again
• This helps for “blame” and “bisect”.
• Easier to squash commits later than to break up
  • “git rebase -i” can squash, omit, reorder

Picking from branches

• Two main tools: “merge” and “cherry-pick”
• Merge brings in all commits
  • Scales well for large workflows
• Cherry-pick brings in one or more
  • Great when a single patch is needed

git.git’s workflow

• Four branches:
  • maint: fixes to existing releases
  • master: next release
  • next: testing for next master
  • pu: experimental features
• Each one is a descendant of the one above
• Commit to the oldest branch needing patch
• Then merge it upward:
  • maint to master to next to pu

Topic branches

• Most features require several iterations
• Commit these to topic branches during design
  • Easier to rehack or abandon this way
• Fork topic from the oldest main branch
  • Refresh-merge from that branch if needed
  • But don’t do that routinely
• Rebase topic branch if forked from wrong branch
• More details at “man 7 gitworkflows”

Testing integration

• Merge from base branch to topic branch
  • ... on a new throw-away branch
• This branch is never merged back in
  • Just for testing
• Can be published publicly, if you make that clear
  • Otherwise, typically used only locally
• If integration fails, fix, and cherry-pick those back to the topic branch before final merge

Time to “git” dirty

• Make a git repository:
  • mkdir git-tutorial
  • cd git-tutorial
  • git init
  • git config user.name "Randal Schwartz"
  • git config user.email merlyn@stonehenge.com
• Add some content:
  • echo "Hello World" >hello
  • echo "Silly example" >example

What’s up?

• git status
• git add example hello
• git status
• git diff --cached

“git add” timing

• Change the content of “hello”
  • echo "It's a new day for git" >>hello
  • git status
  • git diff
• Now commit the index (with old hello)
  • git commit -m initial
  • git status
  • git diff
  • git diff HEAD

git commit -a

• Note that we committed the version of “hello” at the time we added it!
• Fix this by adding -a nearly always:
  • git commit -a -m update
  • git status

What happened?

• Ask for logs:
  • git log
  • git log -p
  • git log --stat --summary
• Tag, you’re it:
  • git tag my-first-tag
• Now we can always get back to that version later

Sharing the work

• Create the clone:
  • cd ..
  • git clone git-tutorial my-git
  • cd my-git
• The git clone will often have some sort of transport path, like git: or rsync: or http:
• See what we’ve got:
  • git log -p
• Note that we have the entire history
  • And that the SHA1s are identical

Branching out

• Create branch “my-branch”
  • git checkout -b my-branch
  • git status
• Make some changes:
  • echo "Work, work, work" >>hello
  • git commit -a -m 'Some work.'

Conflicts

• Switch back, and make other changes:
  • git checkout master
  • echo "Play, play, play" >>hello
  • echo "Lots of fun" >>example
  • git commit -a -m 'Some fun.'
• We now have conflicting commits

Seeing the damage

• In an X11 display:
  • gitk --all
• The --all means “all heads, branches, tags”
• For the X11 challenged:
  • git show-branch --all
  • git log --pretty=oneline --abbrev-commit --graph --decorate --all
  • Handy for a mail message

Merging

• We’re on “master”, and we want to merge in the changes from my-branch
• Select the merge:
  • git merge my-branch
• This fails, because we have a conflict in “hello”
• See this with:
  • git status
• Edit “hello”, and commit:
  • git commit -a -m "Merge work in my-branch"

Did it work?

• Verify the merge with:
  • gitk --all
  • git show-branch --all
• See changes back to the common ancestor:
  • gitk master my-branch
  • git show-branch master my-branch
• Note that master is only one edit from my-branch now (the merge patch-up)
• “git show” handy with merges:
  • git show HEAD

Merging the upstream

• Master is now updated with my-branch changes
• But my-branch is now lagging
• We can merge back the other way:
  • git checkout my-branch
  • git merge master
• This will succeed as a “fast forward”
• This means that the merge-from branch already has all of our change history
  • So it’s just adding linear history to the end

Upstream changes

• Let’s change origin a bit
  • cd ../git-tutorial
  • echo "some upstream change" >>other
  • git add other
  • git commit -a -m "upstream change"
• And now fetch it downstream
  • cd ../my-git
  • git fetch
  • gitk --all
  • git diff master..origin/master

Merge it in

• Explicit merging
  • git checkout master
  • git merge origin/master
• Implicit fetch/merge
  • git pull
• Eliminating the bushy tree
  • git pull --rebase
  • (Fails in our example.. sigh.)

Splitting up a patch

• Sometimes, your changes are logically separate
  • echo "this change" >>hello
  • echo "unrelated change" >>example
• Now make two commits:
  • git add -p # interactively select hello change
  • git commit -m "fixed hello" # not -a!
  • git commit -a -m "fixed example"

Fixing a commit

• Oops, left out something on that last one
  • echo "another unrelated" >>example
• Now “amend” the patch:
  • git commit -a --amend
• This replaces the commit
  • Be careful that you haven’t pushed it!
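
To see that amending really replaces the commit rather than editing it in place, compare the commit name before and after. This sketch passes -m so that no editor opens, which differs slightly from the bare command on the slide; names and messages are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"    # scratch directory, illustration only
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
echo "unrelated change" >example
git add example
git commit -q -m "fixed example"
old=$(git rev-parse HEAD)

# Oops, left something out of that last commit.
echo "another unrelated" >>example
git commit -q -a --amend -m "fixed example"
new=$(git rev-parse HEAD)

# Still one commit in history, but under a different SHA1.
count=$(git rev-list --count HEAD)
echo "$old -> $new ($count commit)"
```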

For further info

• See “Git (software)” in Wikipedia
• And the git homepage http://git-scm.com/
• Git wiki at https://git.wiki.kernel.org/
• Wonderful Pro Git book: http://progit.org/book/
• Get on the mailing list
  • Helpful people there
  • You can submit bugs, patches, ideas
• And the #git IRC channel (on Freenode)
• Now “git” to it!
