Git Tutorial
Published on Saturday, 21 February 2015 12:28
Description
Git is a distributed version control system (VCS) designed with speed as a primary goal, among other improvements over earlier VCSs. Linus Torvalds developed it to provide a better VCS for Linux kernel development, moving the project away from the proprietary BitKeeper. Git is harder to learn than SVN but offers a number of improvements, especially when working with distributed repositories. The major differences include:
- Git is generally faster than SVN, because most operations run against the local repository;
- Git repositories are typically smaller than SVN repositories (data in Git repositories is stored in a highly compressed binary structure);
- Git metadata is stored in a single location in the root of the repository (the .git folder), whereas SVN versions before 1.7 scatter metadata across a .svn folder in every directory;
- branching in Git is simpler than branching in SVN.
Git also has drawbacks compared to SVN beyond the steep learning curve. For example, IDE and tooling support for SVN is more mature and in most cases easier to use than what is available for Git. One of the first things that surprises new Git users (especially those coming from SVN) is that switching between branches happens in the same directory structure: only the contents of the files change.
Git was initially developed on Linux but has been ported to multiple operating systems. The following picture gives a general overview of what a Git infrastructure looks like compared to an SVN infrastructure:
Assume the rectangles are branches. The red area holds files that are not under version control ("untracked files") and, in Git only, files that are tracked but modified. The green area holds files ready to be sent to the server (SVN) or to another repository (Git). The yellow area, also called the "staging area", holds files ready to be added to a single commit that goes into the green area.
SVN has no staging area. To send files to the server you follow a two-step process: add the files from the red area to the green area, then send the files from the green area to the server as a single commit. In Git you first add changes from the red area (untracked and modified files) to the yellow area (the staging area), then create a commit from the files in the yellow area, which adds it to the green area, and finally push all commits from the green area to another Git repository.
Retrieving modified files in SVN is also simple: you update your current branch with the latest changes from the SVN server. In Git, changes retrieved from a remote repository branch are first stored in a separate "remote tracking branch"; they are then applied (merged or rebased, as clarified in the section with examples) from that remote tracking branch to the local branch. In other words, this is a one-step process in SVN and a two-step process in Git. Moreover, in Git you initially have to clone the full repository, while in SVN you can check out a single folder from the server.
Many tools have been created to simplify working with Git, such as:
- gitk - a GUI client for viewing the commit log history and changes (can be started directly from the command line by typing gitk);
- git gui - a GUI client for working with files in the current repository (adding files to the staging area, creating commits, pushing to a remote repository and others);
- JGit - a Java library that implements Git;
- EGit (Eclipse Git) - the Eclipse integration for Git (IntelliJ IDEA and NetBeans also provide Git integration);
- Gerrit - a source code review tool that is built on JGit;
- GitHub - a hosting platform for working with Git repositories;
- GitLab - another hosting platform for working with Git repositories;
- Stash - a commercial product that provides code review capabilities and integration with JIRA.
Usage & Configuration
Installing in Linux
Depending on your Linux distribution and its package manager, you can install Git directly from the command line. For example, on Ubuntu it could be as simple as:
    sudo apt-get install git
Installing in Windows
On Windows-based operating systems one of the most widely used options is to install Msysgit, which provides native support for Git on Windows along with a bash shell emulator (Git Bash). Another option is to use the Git package that ships with a Unix-like environment for Windows such as Cygwin.
Basic configuration
Git configuration is stored at three levels: system-wide for all users and all repositories, per user for all repositories of that user, or in the .git/config file of a single repository (the location of the first two files depends on your OS). You can either edit any of these files manually or use the git config command. The following example configures the user name that will be used when creating commits (user.email is set the same way); the --global option stores the value at the user level, so it applies to all repositories of the current user:
    git config --global user.name "John Douglas"
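As a minimal sketch of the repository level, the commands below set both the name and the email in a scratch repository, so no user-level settings are touched. The temporary directory and the email address are placeholders chosen for illustration.

```shell
# Scratch repository (hypothetical location) so real settings stay untouched.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Without --global the values are written to this repository's .git/config.
git config user.name "John Douglas"
git config user.email "john.douglas@example.com"   # placeholder address

# Read the values back.
git config --get user.name
git config --get user.email
```

Adding --global to the two git config commands would store the same values at the user level instead.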
You can also configure particular tools for two-way or three-way merges or for diffs, for example KDiff3 as both the merge and the diff tool, so that the mergetool and difftool commands use it.
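A minimal sketch of a KDiff3 configuration might look like the following. It assumes KDiff3 is installed and on the PATH, and it writes to a scratch repository (a hypothetical location) rather than to the user-level configuration; the exact settings the original setup used are not known.

```shell
# Scratch repository so the sketch does not change any real configuration.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# kdiff3 is one of the tool names Git recognises out of the box.
git config merge.tool kdiff3
git config diff.tool kdiff3

# Only needed when the kdiff3 binary is not on the PATH.
# The path below is a placeholder, not a real location.
# git config mergetool.kdiff3.path "/path/to/kdiff3"

git config --get merge.tool
git config --get diff.tool
```

With --global added, the same two settings would apply to all repositories of the current user.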
Basic Examples
Creating a new repository is simple. The following example creates the 'sample' Git repository:
    mkdir sample && cd sample
    git init
Cloning a remote repository (thus getting all the files and remote repository metadata to a local repository) is also pretty straightforward - the following example clones a remote repository indicated by a URL into the equinox-rt folder:
    git clone git://git.eclipse.org/gitroot/equinox/rt.equinox.bundles.git equinox-rt
Once the remote repository is cloned into the equinox-rt folder, an alias called origin is created that points to the URL of the remote repository. You can use the origin alias whenever you fetch changes from or push changes to that repository. By default, each branch in the local repository uses the origin alias to determine the location of its fetch/push target branches.
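The origin alias can be observed without network access by cloning a local stand-in for the remote. In the sketch below, remote.git is a hypothetical bare repository on the local disk, used only for illustration.

```shell
# Throwaway directory; remote.git is a local stand-in for a real server.
work="$(mktemp -d)"
cd "$work"
git init -q --bare remote.git

# Clone it into the equinox-rt folder (Git warns that the repository is empty).
git clone -q remote.git equinox-rt
cd equinox-rt

git remote                            # lists the aliases; prints: origin
git config --get remote.origin.url    # the URL (here a path) behind origin
```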
Fetching changes from a remote repository can be done in several steps - the following example is a four-step process for updating the current branch with changes from the remote origin branch:
    git fetch
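One possible sequence for the whole update is sketched below in a self-contained playground. The repository names (seed, remote.git, theirs, mine), the file names and the branch name master are assumptions made for illustration, and the choice of git rebase origin/master and git stash pop for the last two steps is likewise an assumption.

```shell
# Throwaway playground; every path and name below is hypothetical.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Seed repository with commit C1, published through a bare "remote".
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
echo "v1" > hello.txt
git add hello.txt
git commit -q -m "C1"
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git theirs
git clone -q remote.git mine

# Someone else pushes commit C2 to the remote.
cd theirs
echo "other" > other.txt
git add other.txt
git commit -q -m "C2"
git push -q origin master
cd ../mine

# Local state: one unpushed commit (C4) plus an uncommitted edit.
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "C4"
echo "work in progress" >> hello.txt

git fetch -q origin            # step 1: update the remote tracking branch
git stash -q                   # step 2: park the uncommitted changes
git rebase -q origin/master    # step 3: replay C4 on top of C2
git stash pop -q               # step 4: bring the uncommitted edit back

git log --format=%s            # newest first: C4, C2, C1
git status --short             # hello.txt is still modified and unstaged
```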
When you fetch changes from the remote repository that is tracked by default (the one referenced by the origin alias), a remote tracking branch holds the fetched changes; this is the first step. To apply the changes from the remote repository to the local branch you should not have any uncommitted changes (whether or not they are in the staging area), so you can place them in a temporary area called a stash; this is the second step. In the third step a rebase applies the changes from the remote repository to your local branch before your local commits, which are then reapplied one by one at the end of the rebase in the order in which they were committed in the local branch. The fourth step restores the stashed changes to the working tree. In the third step you can also do a merge instead of a rebase. The following picture describes the difference between the two:
During a rebase the commits from the remote repository (in this case C2 and C3) are applied from the remote tracking branch to the local branch after commit C1, the last commit previously fetched from the remote branch. After that the local commits (in this case C4 and C5) are reapplied after commits C2 and C3. Conflicts might occur while C4 and C5 are being applied (e.g. commit C4 conflicts with commit C2 from the remote branch). In that case Git stops the rebase and gives you the opportunity to resolve the conflicting changes before it continues applying the local commits. You can then either continue or abort the rebase (using the --continue or --abort flag of the rebase command). If you do a merge instead, both chains of commits stay on top of commit C1 and a so-called "merge" commit at the end combines the changes from both chains. You can also start resolving conflicting changes with the following command (note that a merge tool must be configured as described in the previous section):
    git mergetool
You can combine the fetching and rebasing from a remote repository by using the following command:
    git pull --rebase
Pushing changes to a remote repository can be done in several steps:
    git add .
First we add all modified and untracked files to the staging area. Instead of the dot (.), which denotes all files, you can also provide a list of files (to remove a file, use the git rm command instead of git add). After that we create a commit, with a proper commit message, from all files in the staging area; you can also list particular files in the commit command so that only those files are included in the commit. At the end we push that commit (and possibly any other commits not already pushed) to the origin repository (the default one) and to the remote branch that corresponds to the local one. If you omit the repository and the branch reference, the result depends on the push.default setting: before Git 2.0 all local branches were pushed to the remote branches with matching names, while since Git 2.0 only the current branch is pushed by default.
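The three steps described above (stage, commit, push) can be sketched as follows. The playground layout, the file names, the commit message and the branch name master are assumptions made for illustration; remote.git is a local stand-in for a real server.

```shell
# Throwaway playground; every path and name below is hypothetical.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Seed repository, published through a bare "remote", then cloned.
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git local
cd local

echo "notes" > notes.txt    # new, untracked file
echo "v2" >> hello.txt      # change to a tracked file

git add .                                      # stage everything
git commit -q -m "Add notes and update hello"  # commit the staged files
git push -q origin master                      # send the commit to origin

git status --short --branch    # only the branch line remains: nothing left to commit or push
```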
How you revert changes in Git depends on the state of your files in the version control system:
- when the file is untracked - it is outside of Git's control and there is nothing to revert;
- when the changes are staged - the following example removes the changes from the hello.java file:
    git reset HEAD hello.java
    git checkout -- hello.java
The first command removes the file from the staging area and the second one reverts the changes made to the file.
- when the changes are committed locally - you have different options depending on what you want to achieve: remove the commit and return the files to the staging area, remove the commit and leave the changes unstaged, or remove the commit and discard the changes made to the files. The three options correspond to the --soft, --mixed (the default) and --hard modes of git reset; the first of them looks like this:
    git reset --soft HEAD^
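The staged and locally committed cases from the list above can be tried in a scratch repository. In this sketch the file content and commit messages are made up for illustration, and the commit is recreated between the reset calls so that each mode can be shown in turn.

```shell
# Scratch repository; file content and commit messages are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
echo "class Hello {}" > hello.java
git add hello.java
git commit -q -m "Add hello.java"

# Staged change: unstage it, then discard it.
echo "// edit" >> hello.java
git add hello.java
git reset -q HEAD hello.java    # remove the file from the staging area
git checkout -- hello.java      # revert the change in the working tree

# Local commit, undone in three different ways.
echo "// second" >> hello.java
git commit -q -am "Second commit"
git reset --soft HEAD^          # commit removed, change stays staged
git status --short              # "M  hello.java"

git commit -q -m "Second commit"
git reset -q --mixed HEAD^      # commit removed, change left unstaged
git status --short              # " M hello.java"

git commit -q -am "Second commit"
git reset -q --hard HEAD^       # commit removed, change discarded
git status --short              # prints nothing
```

Note that --hard throws away the changes in the working tree, so it is worth double-checking git status before using it.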
You can view the branches using the following commands (the second one also displays remote tracking branches):
    git branch
    git branch -a
The following example creates a new branch called test that starts from the latest commit of the current branch (HEAD) and then sets the branch in a remote repository that it will track (for pushing/fetching changes), in this case origin/master:
    git checkout -b test
    git branch --set-upstream-to=origin/master
The above two steps can be combined with the following command:
    git checkout -b test origin/master
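To confirm which remote branch a local branch tracks, the upstream can be queried directly. The sketch below sets tracking in both ways inside a playground; the repository layout and the second branch name test2 are assumptions made for illustration.

```shell
# Throwaway playground; every path and name below is hypothetical.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
cd ..
git clone -q --bare seed remote.git
git clone -q remote.git local
cd local

# Two steps: create the branch, then set what it tracks.
git checkout -q -b test
git branch --set-upstream-to=origin/master

# One step: create the branch from the remote tracking branch.
git checkout -q master
git checkout -q -b test2 origin/master

git rev-parse --abbrev-ref "test@{upstream}"     # origin/master
git rev-parse --abbrev-ref "test2@{upstream}"    # origin/master
git branch -vv                                   # shows the upstream next to each branch
```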
Deleting a branch is done with the following command:
    git branch -d test
If the branch contains commits that have not been merged, Git refuses to delete it and you have to force the deletion:
    git branch -D test
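The difference between -d and -D can be seen in a scratch repository. In this sketch the branch merged-work and the file names are hypothetical; test carries one commit that master does not have.

```shell
# Scratch repository; branch and file names are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"

git branch merged-work         # no extra commits, so nothing would be lost
git checkout -q -b test
echo "x" > x.txt
git add x.txt
git commit -q -m "Unmerged work"
git checkout -q master

git branch -d merged-work      # succeeds
git branch -d test || echo "refused: test is not fully merged"
git branch -D test             # forces the deletion

git branch                     # only master is left
```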
If you want to switch to another branch in your local repository (let's say master), you can use the following command:
    git checkout master
An even shorter alternative works if you have defined co as an alias for checkout (co is not a built-in Git command):
    git co master
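A minimal sketch of defining such an alias is shown below. It sets the alias at the repository level inside a scratch repository (a hypothetical location); with --global it would be available in all repositories of the current user.

```shell
# Scratch repository with two branches; names are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the branch is called master
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
echo "v1" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
git checkout -q -b test

# Define co as an alias for checkout (repository level).
git config alias.co checkout

git co master                  # now behaves like: git checkout master
git symbolic-ref --short HEAD  # prints: master
```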
Git stores references to remote repositories in the form of aliases - when you clone a repo a default alias called origin is created that points to the URL of the repo. In order to see all remotes in a repository you can use the following command:
    git remote show
In order to see the URL of the origin repository you can use the following command:
    git remote show origin
The following example adds a new remote repository with alias equinox-rt that points to the git://git.eclipse.org/gitroot/equinox/rt.equinox.bundles.git repository:
    git remote add equinox-rt git://git.eclipse.org/gitroot/equinox/rt.equinox.bundles.git
The following command removes the equinox-rt alias:
    git remote remove equinox-rt
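Adding, listing and removing a remote can be tried offline. In the sketch below a local bare repository (other.git, a hypothetical name) stands in for the Eclipse URL used above.

```shell
# Throwaway playground; other.git is a local stand-in for a remote URL.
work="$(mktemp -d)"
cd "$work"
git init -q --bare other.git
git init -q local
cd local

git remote add equinox-rt ../other.git
added="$(git remote)"          # remember the alias list for later inspection
echo "$added"                  # prints: equinox-rt
git remote -v                  # alias plus fetch/push URLs

git remote remove equinox-rt
git remote                     # prints nothing: no remotes are left
```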
You can inspect the state of the files in the working tree and the staging area of the current branch with the following command:
    git status
You can list all tracked files in the current branch (those recorded in the Git index):
    git ls-files
The following command displays the commit log history - each commit is identified by a unique hash:
    git log
The following command displays the commit log history along with the list of changed files for each commit:
    git log --stat
The following command displays the changes made to the sample.txt file in the current branch:
    git log -p sample.txt
The following command shows the changes made in the last 10 commits that touched the test/ directory:
    git log -p -10 test/
The following command shows the files changed in the commit whose hash starts with bd61ad98:
    git show --pretty="format:" --name-only bd61ad98
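The log commands above can be tried on a small history. This sketch builds two commits in a scratch repository; the file names and commit messages are made up, and the abbreviated hash is read from the repository rather than typed by hand.

```shell
# Scratch repository with two commits; names are hypothetical.
repo="$(mktemp -d)"
cd "$repo"
git init -q
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"
echo "first" > sample.txt
git add sample.txt
git commit -q -m "Add sample.txt"
mkdir test
echo "t" > test/case.txt
git add test
git commit -q -m "Add test case"

git log --stat             # history with the files changed per commit
git log -p sample.txt      # changes made to sample.txt
git log -p -10 test/       # last 10 commits touching test/

# Abbreviated hash of the latest commit, playing the role of bd61ad98.
commit="$(git rev-parse --short HEAD)"
git show --pretty="format:" --name-only "$commit"
```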
The following command allows you to get help for the push command (help is available for basically all of the git commands):
    git help push
The following command creates a patch (diff) with the changes made in the last commit and saves it to the file patch.diff (the older commit comes first, otherwise the patch would be reversed):
    git diff HEAD^ HEAD > D:/patch.diff
The following command displays the uncommitted changes in the working tree using the configured diff tool (pass HEAD^ HEAD as arguments to view the changes made in the last commit instead):
    git difftool
The following example applies the patch.diff patch to the current branch (with --reject, the hunks that fit are applied and the ones that do not are written to .rej files):
    git apply --reject D:/patch.diff
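Creating a patch from the last commit and applying it elsewhere can be sketched as a round trip. The repository names (src, copy), the file content and the patch location are assumptions made for illustration.

```shell
# Throwaway playground; every path and name below is hypothetical.
work="$(mktemp -d)"
cd "$work"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# Source repository with two commits.
git init -q src
cd src
printf 'one\n' > sample.txt
git add sample.txt
git commit -q -m "First commit"
printf 'one\ntwo\n' > sample.txt
git commit -q -am "Second commit"

# Patch with the changes of the last commit (older commit first).
git diff HEAD^ HEAD > ../patch.diff

# A second working copy that is still at the first commit.
cd ..
git clone -q src copy
cd copy
git reset -q --hard HEAD^

git apply --reject ../patch.diff    # apply the patch to the working tree
cat sample.txt                      # now contains both lines
git status --short                  # " M sample.txt": applied but not committed
```

git apply only changes the working tree; the result still has to be staged and committed.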
Once you have committed your changes in your feature branch and pushed them to a remote repository (e.g. one managed by Stash or GitHub), you can create a pull request to have those changes added to the master branch or to another branch, possibly in another repository.
References
Wikipedia's entry on Git
http://en.wikipedia.org/wiki/Git_%28software%29
Git tutorial (vogella.de)
http://www.vogella.com/tutorials/Git/article.html
Git book
http://git-scm.com/book/en/v2
Git/SVN comparison
https://git.wiki.kernel.org/index.php/GitSvnComparison
MsysGit
https://msysgit.github.io/
Cygwin
https://www.cygwin.com/
Using pull requests in Git Stash
https://confluence.atlassian.com/display/STASH/Using+pull+requests+in+Stash