Git. From the thorns to the stars.
Сергей Моренец (Sergey Morenets), April 25, 2013
Agenda
• Versioning and revision systems overview
• Git under the microscope
• Examples
• Q & A
Glossary
• VCS (version control system)
• SCM (source code management)
• RCS (revision control system)
Requirements
• Storing content
• Tracking changes to the content
• Distributing the content and history with collaborators
Lost in selection
Magic pill
SCCS
• First VCS available on any Unix system
• Developed in SNOBOL at Bell Labs in 1972
• Originally written for IBM System/370 computers running OS/360
• Its file format is used in BitKeeper and other VCS
• Introduced repositories and a locking mechanism
CVS
• Ancestor of many later revision control systems
• First released in 1986 by Dick Grune
• Simple technology with a gentle learning curve
• Useful for sharing and backing up files
• TortoiseCVS is the de facto client for CVS on Windows
• Introduced merging
• Last stable release in 2008
Apache Subversion
• Created in 2000
• Used to host Apache software products, as well as Mono, SourceForge, and Google Code
• Most widely adopted SCM
• Atomic commits
• Maintains versioning for directories, renames, and file metadata
• Better support for branches and tagging
Centralized VCS
Distributed VCS
Distributed workflow
Git
• Distributed revision control and source code management system
• Designed and developed by Linus Torvalds for Linux kernel development
• Created as a replacement for the BitKeeper system
• Development began in April 2005
• Current version: 1.8.2
Linus Torvalds
• Swedish-speaking Finnish American
• Chief architect and project coordinator of the Linux kernel
• Named after Linus Pauling and Linus van Pelt
• Second lieutenant in the Finnish Army
• Winner of the Millennium Technology Prize in 2012
• Calls himself an egotistical bastard
Git
The information manager from hell
Git
Global Information Tracker
Junio Hamano
• Graduated from Tokyo University
• Git coordinator (maintainer) since 2005
• Participated in Linux development
• Currently a developer at Google
Design Principles
• Take CVS as an example of what not to do
• Support a distributed workflow
• Scale to thousands of developers
• Strong consistency and integrity guarantees
• Free
Features
• Rapid branching and merging
• Distributed development
• Compatibility and emulation
• Performance breakthrough
• Revision hashing
• Garbage collector
• Packed data storage
Git Repository
• Database containing the revisions and history of the project
• Retains a complete copy of the entire project
• Maintains an object store and an index
• Object store contains data files, log messages, and audit information
Git Repository
Git Object Types
• Blobs
• Trees
• Commits
• Tags
Blobs
• Each version of a file is represented as a blob.
• Git ignores a blob's internal structure.
• A blob holds a file’s data but does not contain any metadata about the file, not even its name.
• The git show command displays the contents of a blob.
Trees
• A tree object represents one level of directory information.
• It records blob identifiers and path names for all the files in one directory.
• It can also recursively reference other sub-tree objects.
• Can be examined with the git show or git ls-tree commands.
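
The tree description above can be made concrete with a small sketch. It assumes the commonly documented tree layout (one `<mode> <name>`, a NUL byte, and the 20 raw bytes of the referenced object's SHA-1 per entry) and the usual `<type> <size>` object header; the function names `object_id` and `tree_id` are made up for this illustration and are not Git APIs.

```python
import hashlib

def object_id(obj_type: str, body: bytes) -> str:
    """Hash a Git-style object: '<type> <size>' + NUL byte + body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

def tree_id(entries):
    """entries: (mode, name, hex_object_id) tuples for ONE directory level.

    Mode "100644" marks a regular file, "40000" a sub-tree.
    """
    def sort_key(entry):
        mode, name, _ = entry
        # Sub-trees are ordered as if their name ended with "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for mode, name, hex_id in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode("utf-8") + b"\x00" + bytes.fromhex(hex_id)
    return object_id("tree", body)

# Blob name for the content "hello world\n" (assumed sample content).
hello_blob = "3b18e512dba79e4c8300dd08aeb37f8e728b8dad"
print(tree_id([("100644", "hello.txt", hello_blob)]))
```

Renaming `hello.txt` changes only the tree's name, not the blob's, which is how a tree keeps path names separate from file contents.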
Commits
• A commit object holds metadata for each change, including the author, commit date, and log message.
• Each commit points to a tree object that captures the state of the repository at the time the commit was performed.
• git tag stable-1 1b2e1d63ff
Tags
• A tag object assigns an arbitrary yet presumably human-readable name to a specific object, usually a commit.
• Contains the tag type, tag message, author, and object name.
• Can be examined with the git cat-file command.
Git Repository
Git Object Model
• The object store is organized and implemented as a content-addressable storage system.
• Each object has a unique name produced by applying SHA-1 to the contents of the object.
• The SHA-1 hash is a sufficient index or name for that object in the object database.
• SHA-1 values are 160-bit values, represented as 40-digit hexadecimal numbers.
• 9da581d910c9c4ac93557ca4859e767f5caf5169
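
As a rough illustration of how such a name is produced, the sketch below hashes a blob the way Git is generally documented to: a short header (`blob`, the size, a NUL byte) is prepended to the file content before SHA-1 is applied. The helper names are hypothetical.

```python
import hashlib

def git_object_id(obj_type: str, content: bytes) -> str:
    """Return the 40-digit hex name for an object of the given type."""
    # The header is part of what gets hashed, not just the raw content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

def blob_id(content: bytes) -> str:
    return git_object_id("blob", content)

# Should print the same name as: echo 'hello world' | git hash-object --stdin
print(blob_id(b"hello world\n"))
```

The file name never enters the computation, so two files with identical content share one blob.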
Advantages
• Git can determine equality of objects by comparing their names.
• The same content stored in two repositories will always be stored under the same name.
• Corruption errors can be detected by checking that the object's name is still the SHA-1 hash of its contents.
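
These three properties can be sketched with a toy in-memory store. This is not Git's implementation; the class `ObjectStore` and its methods are invented for the example, and only blobs are modelled.

```python
import hashlib

class ObjectStore:
    """Toy content-addressable store: the key is the SHA-1 of the stored bytes."""

    def __init__(self):
        self.objects = {}  # object name -> raw bytes (header + content)

    def put(self, content: bytes) -> str:
        raw = b"blob %d\x00" % len(content) + content
        name = hashlib.sha1(raw).hexdigest()
        self.objects[name] = raw  # identical content lands under the same key
        return name

    def corrupt_names(self):
        """Names whose stored bytes no longer hash to the name itself."""
        return [name for name, raw in self.objects.items()
                if hashlib.sha1(raw).hexdigest() != name]

store = ObjectStore()
first = store.put(b"same content\n")
second = store.put(b"same content\n")
print(first == second, len(store.objects))  # equal names, one stored object
```

Flipping any byte of a stored value makes `corrupt_names()` report it, which is the integrity check described in the last bullet.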
Name vs. Content
• Git stores each version of a file in full, not as differences
• The path name is separated from the file contents
• The object store is based on a hash computed from the file contents, not from the name
System     Index mechanism                    Data store
Database   Indexed Sequential Access Method   Data records
Unix FS    Directories (/path)                Blocks of data
Git        .git/objects/hash                  Blob/tree objects
Git Directory
• Stores all of Git's history, configuration, and meta information for your project
• There is only one Git directory per project
• By default it’s '.git' in the root of your project
Git Directory
• Configuration:
- config
- description
- info/exclude
• Helps configure the local repository
Git Directory
• Hooks:
- hooks
• Scripts that are run on certain lifecycle events of the repository
Git Directory
• Object Database:
- objects
• Default Git object database
• Contains all content or pointers to local content.
• All objects are immutable
Git Directory
• References:
- refs
• Stores reference pointers for branches, tags, and heads.
• A reference is a pointer to an object, usually of type tag or commit.
• References change as the repository evolves
Working Directory
• Holds the current checkout of the files
• Files can be removed or replaced by Git when you switch branches
• The working directory is a temporary checkout place
Index
• The index is a temporary and dynamic binary file that captures a version of the project’s overall structure
• The project’s state could be represented by a commit and a tree from any point in the project’s history
• The index allows a separation between incremental development steps and the committal of those changes.
Index
• Staging area between your working directory and your repository
• A commit records the files as staged in the index, not as they are in the working directory
• Staged and unstaged changes can be viewed with the git status command.
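
The relationship between the working directory, the index, and a commit can be modelled in a few lines. This is a deliberately simplified sketch with plain dictionaries; `TinyRepo` and its methods are hypothetical and mirror `git add`, `git commit`, and `git status` only loosely.

```python
class TinyRepo:
    """Toy model: files live in working_dir, 'add' copies them to the index."""

    def __init__(self):
        self.working_dir = {}  # path -> current content on disk
        self.index = {}        # path -> staged content
        self.commits = []      # list of (message, snapshot) pairs

    def add(self, path):
        # Stage the content as it is right now.
        self.index[path] = self.working_dir[path]

    def commit(self, message):
        # The snapshot is taken from the index, never from the working dir.
        snapshot = dict(self.index)
        self.commits.append((message, snapshot))
        return snapshot

    def status(self):
        """Paths whose working copy differs from what is staged."""
        return sorted(path for path, content in self.working_dir.items()
                      if self.index.get(path) != content)

repo = TinyRepo()
repo.working_dir["notes.txt"] = "first draft"
repo.add("notes.txt")
repo.working_dir["notes.txt"] = "second draft"  # edited after staging
print(repo.commit("add notes"))                 # contains "first draft"
print(repo.status())                            # notes.txt is still modified
```

The edit made after staging is not part of the commit, which is the separation between development steps and committal that the index provides.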
Data flow
Git Usage
• Command-line tool (Git Bash)
• Git GUI
• IDE plugin (JGit-based)
Git Bash
• Command-line tool
• UNIX-style utility
• Last straw
Git GUI
• MinGW-based
• Formerly WinGit
• No support
JGit
• Lightweight, pure Java library implementing Git
• EGit - Eclipse team provider for Git
• NBGit - Git support for NetBeans
Git Commands
• init
• checkout
• fetch
• pull
• reset
• merge
• log
Git Commands
• add
• commit
• push
• branch
• tag
First steps
• Clone repository
• Initialize repository
Clone Repository
• git clone git://git.kernel.org/pub/scm/git/git.git
• git clone http://www.kernel.org/pub/scm/git/git.git
Branching
• A branch is a line of commits in the commit graph
• The master branch is created by default
• HEAD is a pointer to the current branch
• “git branch test” creates the branch test.
• “git checkout master” switches to the branch master.
• “git merge test” merges changes from test into the current branch (here, master).
• Merges are done automatically when there are no conflicts.
Conflicts
• If a conflict cannot be resolved automatically, the index and working tree are left in a special state
• “git status” lists the unmerged files, which contain conflict markers
• git add file.txt
• git commit
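
To show what the conflict markers look like in practice, here is a small sketch that pulls the two competing versions out of an unmerged file. It assumes the default marker style (`<<<<<<<`, `=======`, `>>>>>>>`); the function name and the sample text are made up.

```python
def conflict_regions(text):
    """Return a list of (ours_lines, theirs_lines) for each conflict in text."""
    regions = []
    ours, theirs, state = [], [], None
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            ours, theirs, state = [], [], "ours"      # start of our side
        elif line == "=======" and state == "ours":
            state = "theirs"                           # switch to their side
        elif line.startswith(">>>>>>> ") and state == "theirs":
            regions.append((ours, theirs))             # conflict block closed
            state = None
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    return regions

sample = (
    "unchanged line\n"
    "<<<<<<< HEAD\n"
    "greeting = 'hello'\n"
    "=======\n"
    "greeting = 'hi'\n"
    ">>>>>>> test\n"
)
print(conflict_regions(sample))
```

Once the file has been edited so that no markers remain, `git add file.txt` followed by `git commit` records the resolution.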
Roll Back
• Reset
• Checkout
• Revert
Reset
• git reset --hard HEAD
• git reset --hard ORIG_HEAD
Checkout
• git checkout HEAD MyClass.java
Revert
• Rolls back commit(s) by creating a new commit that undoes their changes
• git revert HEAD
• git revert HEAD~1 -m 2 (-m selects the mainline parent when reverting a merge commit)
Git References
• All references are named with a slash-separated path name starting with "refs".
• The branch "test" is short for "refs/heads/test".
• The tag "v1.0" is short for "refs/tags/v1.0".
• "origin/master" is short for "refs/remotes/origin/master".
Git References
• The HEAD file is a symbolic reference to the branch we are currently using
• git symbolic-ref HEAD prints refs/heads/master; the .git/HEAD file itself contains:
• ref: refs/heads/master
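
The short-name rules and the HEAD file format can be sketched as follows. The candidate order follows the lookup order described in Git's revision documentation; the function names and the sample set of refs are assumptions for the example.

```python
def candidate_refs(short_name):
    """Full names a short ref name may stand for, in lookup order."""
    return [
        short_name,
        f"refs/{short_name}",
        f"refs/tags/{short_name}",
        f"refs/heads/{short_name}",
        f"refs/remotes/{short_name}",
        f"refs/remotes/{short_name}/HEAD",
    ]

def resolve(short_name, existing_refs):
    """Return the first candidate that exists, or None."""
    for candidate in candidate_refs(short_name):
        if candidate in existing_refs:
            return candidate
    return None

def parse_head(head_file_text):
    """Interpret the content of the HEAD file."""
    text = head_file_text.strip()
    if text.startswith("ref: "):
        return ("branch", text[len("ref: "):])  # symbolic reference
    return ("detached", text)                   # a raw commit name

refs = {"refs/heads/master", "refs/heads/test",
        "refs/tags/v1.0", "refs/remotes/origin/master"}
print(resolve("test", refs))
print(resolve("origin/master", refs))
print(parse_head("ref: refs/heads/master\n"))
```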
Advanced Git
Branching strategy
• master
• develop
Branching strategy
• origin/master contains production-ready code
• origin/develop contains development changes
Branching strategy
• Feature branches
• Release branches
• Hotfix branches
Feature Branches
• Feature branches (or topic branches) are used to develop new features
• Can be merged into develop or discarded
• git checkout -b newfeature develop
Release Branches
• Release branches support preparation of a new production release
Hotfix Branches
• Hotfix branches also prepare a new production release, albeit an unplanned one.
• Created in response to critical bugs in a production environment.
• Separate development of the current version from work on the hotfix.
Branching strategy
Rebasing
• git checkout -b mywork origin
• git commit
• git commit
Rebasing
Rebasing
• git merge origin
Rebasing
• git checkout mywork
• git rebase origin
Rebasing
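
What the rebase in the slides above does to the commits on mywork can be sketched with a toy model in which a commit's name is a hash over its parent's name and its change. This is a simplification (real commits also hash the tree, author, and dates), and every name in the sketch is hypothetical.

```python
import hashlib

def commit_id(parent_id, change):
    """Toy commit name: depends on the parent's name and on the change."""
    return hashlib.sha1(f"{parent_id}\n{change}".encode("utf-8")).hexdigest()

def replay(base_id, changes):
    """Apply changes one by one on top of base_id; return the new commit ids."""
    ids, parent = [], base_id
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids

fork_point = commit_id("", "initial import")
mywork = ["add parser", "fix parser bug"]

original = replay(fork_point, mywork)                     # commits made on mywork
origin_tip = replay(fork_point, ["upstream change"])[-1]  # origin moved on meanwhile
rebased = replay(origin_tip, mywork)                      # same changes, new parent

print(original != rebased)  # the commits get new names after the rebase
```

The changes are the same, but because each name depends on its parent, every replayed commit is a new object. This is why rebasing rewrites history while merging does not.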
Stashing
• git stash save "Stashing reason"
• …
• git stash apply
Treeishes
• 980e3ccdaac54a0d4de358f3fe5d718027d96aae
• 980e3ccdaac54a0d4
• 980e3cc
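
The three spellings above name the same object because a prefix is accepted as long as it is unambiguous. A minimal sketch of that lookup, assuming a minimum prefix length of four hex digits and a hypothetical list of known object names:

```python
def resolve_prefix(prefix, object_names, min_length=4):
    """Return the single object name starting with prefix, or None if absent."""
    if len(prefix) < min_length:
        raise ValueError("prefix too short")
    matches = [name for name in object_names if name.startswith(prefix.lower())]
    if len(matches) > 1:
        raise ValueError(f"ambiguous prefix {prefix!r}: {len(matches)} candidates")
    return matches[0] if matches else None

known = [
    "980e3ccdaac54a0d4de358f3fe5d718027d96aae",
    "9da581d910c9c4ac93557ca4859e767f5caf5169",
]
print(resolve_prefix("980e3cc", known))
```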
Treeishes
• 980e3ccdaac54a0d4de358f3fe5d718027d96aae
• origin/master
• refs/remotes/origin/master
• master
• refs/heads/master
• v1.0
• refs/tags/v1.0
Issues search
• git bisect start
• git bisect good v1.0
• git bisect bad master
• git bisect bad
• git show
• git bisect reset
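
The idea behind the bisect session above is a binary search between a known-good and a known-bad commit. The sketch below assumes a linear history and a hypothetical `is_bad` test function; real histories with merges need more care.

```python
def first_bad(commits, is_bad):
    """Find the first bad commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. Returns (first_bad_commit, tests_run).
    """
    good, bad = 0, len(commits) - 1
    tests_run = 0
    while bad - good > 1:
        middle = (good + bad) // 2
        tests_run += 1
        if is_bad(commits[middle]):
            bad = middle    # the bug was introduced at or before middle
        else:
            good = middle   # the bug was introduced after middle
    return commits[bad], tests_run

history = list(range(100))  # stand-in for 100 commits
culprit, tests = first_bad(history, lambda commit: commit >= 37)
print(culprit, tests)
```

Each test halves the remaining range, so about seven tests are enough for a hundred commits.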
Blamestorming
• git blame sha1_file.c
• 0fcfd160 (Linus Torvalds 2005-04-18 8) */
• 0fcfd160 (Linus Torvalds 2005-04-18 9) #include "cache.h"
• 1f688557 (Junio C Hamano 2005-06-27 10) #include "delta.h"
• a733cb60 (Linus Torvalds 2005-06-28 11) #include "pack.h"
Git Hooks
• Scripts placed in the $GIT_DIR/hooks directory to trigger actions at certain points
• pre-commit
• commit-msg
• post-commit
• post-checkout
• post-merge
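
As an example of one of these hooks, a commit-msg hook receives the path of the file holding the proposed message and can reject the commit by exiting with a non-zero status. The sketch below shows the checking logic in Python; the 72-character subject limit is an invented policy, and the function names are hypothetical.

```python
import sys

def check_message(text, max_subject=72):
    """Return a list of problems found in a commit message."""
    # Lines starting with "#" are comments that Git strips from the message.
    lines = [line for line in text.splitlines() if not line.startswith("#")]
    problems = []
    if not lines or not lines[0].strip():
        problems.append("empty subject line")
    elif len(lines[0]) > max_subject:
        problems.append(f"subject longer than {max_subject} characters")
    return problems

def main(argv):
    """argv[1] is the path to the message file that Git passes to the hook."""
    with open(argv[1], encoding="utf-8") as handle:
        problems = check_message(handle.read())
    for problem in problems:
        print(f"commit-msg: {problem}", file=sys.stderr)
    return 1 if problems else 0  # non-zero makes Git abort the commit

# In an executable file named hooks/commit-msg, the last line would be:
#     sys.exit(main(sys.argv))
```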
Object Store
• All objects are stored as compressed contents, named by their SHA-1 values.
• They contain the object type, size, and contents in a zlib-compressed format.
• Objects are stored either as loose objects or as packed objects.
Loose Objects
• Compressed data stored in a single file on disk
• Every object is written to a separate file
• SHA1 ab04d884140f7b0cf8bbf86d6883869f16a46f65
• $GIT_DIR/objects/ab/04d884140f7b0cf8bbf86d6883869f16a46f65
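
The path layout above (first two hex digits as the directory, the remaining 38 as the file name) can be reproduced with a short sketch. It assumes the usual loose-object encoding, a `<type> <size>` header plus content compressed with zlib, and it writes into a scratch directory rather than a real repository; the function names are hypothetical.

```python
import hashlib
import os
import tempfile
import zlib

def write_loose_object(git_dir, content):
    """Store content as a loose blob under git_dir; return (name, path)."""
    raw = b"blob %d\x00" % len(content) + content
    name = hashlib.sha1(raw).hexdigest()
    path = os.path.join(git_dir, "objects", name[:2], name[2:])
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(zlib.compress(raw))
    return name, path

def read_loose_object(path):
    """Return (object_type, content) of a loose object file."""
    with open(path, "rb") as handle:
        raw = zlib.decompress(handle.read())
    header, _, content = raw.partition(b"\x00")
    obj_type, size = header.split(b" ")
    if int(size) != len(content):
        raise ValueError("size in header does not match content length")
    return obj_type.decode("ascii"), content

scratch = tempfile.mkdtemp()  # stands in for $GIT_DIR
name, path = write_loose_object(scratch, b"hello world\n")
print(name, read_loose_object(path))
```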
Packed Objects
• A packfile stores one version of a file in full and, for similar versions, only the part that has changed
• Uses heuristics to choose which objects to pack together
• git gc packs the data
• git unpack-objects converts packed data back into loose format
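
The saving from storing only the changed part can be illustrated with a toy delta made of copy and insert instructions. This is not Git's pack format or its delta algorithm; it borrows Python's `difflib` to find the common parts, and all names are hypothetical.

```python
from difflib import SequenceMatcher

def make_delta(base: bytes, target: bytes):
    """Describe target as 'copy a range of base' and 'insert new bytes' steps."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))           # reuse bytes from base
        elif j2 > j1:
            ops.append(("insert", target[j1:j2]))  # carry new bytes literally
        # a pure deletion needs no instruction: those bytes are just not copied
    return ops

def apply_delta(base: bytes, ops):
    """Rebuild the target from base and the delta instructions."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += base[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)

def literal_size(ops):
    """Number of bytes the delta has to store literally."""
    return sum(len(op[1]) for op in ops if op[0] == "insert")

version1 = b"".join(b"line %d\n" % i for i in range(50))
version2 = version1 + b"one more line\n"
delta = make_delta(version1, version2)
print(apply_delta(version1, delta) == version2, literal_size(delta), len(version2))
```

Only the appended line has to be stored for the second version; the rest is a reference back into the first.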
Ignoring files
• # Ignore any file named sample.txt.
• sample.txt
• # Ignore Eclipse files
• *.project
• # except my.project with manual setting.
• !my.project
• # Ignore objects and archives.
• *.[oa]
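
How these rules combine can be sketched with Python's `fnmatch`: patterns are tried in order, a leading `!` re-includes a file, and the last matching rule wins. This is a simplification that only looks at the file name (no directory-specific patterns, no `**`), and the function names are hypothetical.

```python
from fnmatch import fnmatchcase
import posixpath

RULES_TEXT = """\
# Ignore any file named sample.txt.
sample.txt
# Ignore Eclipse files
*.project
# except my.project with manual setting.
!my.project
# Ignore objects and archives.
*.[oa]
"""

def parse_rules(text):
    """Return (pattern, negated) pairs, skipping comments and blank lines."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        negated = line.startswith("!")
        rules.append((line[1:] if negated else line, negated))
    return rules

def is_ignored(path, rules):
    """Apply the rules in order; the last rule that matches decides."""
    name = posixpath.basename(path)
    ignored = False
    for pattern, negated in rules:
        if fnmatchcase(name, pattern):
            ignored = not negated
    return ignored

rules = parse_rules(RULES_TEXT)
for candidate in ["sample.txt", "ui.project", "my.project", "lib/util.o", "main.c"]:
    print(candidate, is_ignored(candidate, rules))
```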
Scripting
• Ruby
• PHP
• Python
• Perl
Migration
• Script support
• CVS
• SVN
• Perforce
• Mercurial
• fast-import tool
Migration
• git svn clone http://my-project.googlecode.com/svn/trunk new-project
• ~/git.git/contrib/fast-import/git-p4 clone //depot/project/main@all myproject
GitHub
GitHub
• Web-based hosting service
• Launched in April 2008
• Git repository hosting, paid for private projects and free for open-source projects
• Built with Ruby on Rails and Erlang
• Provides feeds and followers
Growth
Period   State
2009     100,000 users and 50,000 repositories
2011     1 million users
2012     2 million users and 4 million repositories
2013     3 million users and 5 million repositories
Octocat
• Introduced by Tom Preston-Werner, cofounder of GitHub
• The name combines the words "octopus" and "cat"
Octocat
Resources
• Version Control with Git, 2nd Edition, 2012
• Pro Git, 2009
Pros
• Painless branching
• Separation between the local repository and upstream
• Simplifies work in distributed teams
• Dramatic increase in performance
• Integration with major VCS
Cons
• Repository security risks
• Latest revision question
• Pessimistic locks
• Steep learning curve
• Commit identifiers
• Not optimal for single developers