Have you memorised a few Git commands, without actually understanding what's going on? Then you've come to the right place! This how-to will help you level up in Git, going from being able to use Git to properly understanding it, getting a better grip on what is arguably one of the most important tools in software development at the moment.
This interactive tutorial visualises what is happening when you're using Git. As you scroll down the page, you will be guided through the most important basic concepts of Git, while applying them to a visualised example repository.
Imagine we're starting a new project that we want to manage using Git. This could be any type of project, but for the sake of this tutorial, let's say we're writing a book. The first order of business is to make Git aware of it. Inside the folder that is to house our book we initialise a new Git repository:
git init
Git has created a new hidden folder called .git: everything that Git knows about our book will live in there, separate from our actual project.
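To see that hidden folder for yourself, here is a minimal sketch that initialises a repository in a throwaway directory. The temporary directory is only an assumption made for this example; in practice you would run the command inside the folder that houses your book.

```shell
# Throwaway folder standing in for the book's folder (an assumption for this sketch)
book="$(mktemp -d)"
cd "$book"

git init -q               # -q only suppresses the informational message

ls -a                     # the listing now includes the hidden .git folder
git rev-parse --git-dir   # asks Git where it keeps its data; prints .git here
```

Nothing in the project itself has changed at this point: deleting the .git folder would turn the directory back into a plain folder.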
Now let's say that we've already created two files: thanks.txt, in which we will keep track of who we need to include in the word of thanks, and outline.txt, which contains a general outline of what our book will be about.
Let's give Git an actual version to control (it is, after all, a version control system). You have to make a conscious choice about what each version (or in Git speak: commit) includes; for now, we only run git add thanks.txt. This is called staging the file thanks.txt: marking it for inclusion in the next commit.
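As a rough illustration of staging, the sketch below creates both files in a throwaway repository and stages only one of them. The file contents and the temporary directory are placeholders invented for the example.

```shell
# Throwaway repository; file contents are placeholders
book="$(mktemp -d)"
cd "$book"
git init -q

echo "Alice" > thanks.txt
echo "Part one, part two" > outline.txt

git add thanks.txt    # stage only thanks.txt for the next commit

# thanks.txt is reported as staged (A), outline.txt as untracked (??)
git status --short
```

Only what is staged ends up in the next commit, so outline.txt would be left out if we committed now.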
This is what our Git history looks like after our first commit.
The circle represents this first commit; please disregard the labels next to it for now.
If, later on in our project, we revert back to this commit, all we would be left with is thanks.txt in its current state.
So why did we not include outline.txt in our first commit? Basically, a good rule-of-thumb is that a commit should only contain changes that can be undone together. For example, if we later want to overhaul the structure of our book, we might want to ditch the outline we just wrote without also throwing away our list of people to thank.
Therefore, let us tell Git about outline.txt in a separate commit. We stage it using git add outline.txt, then create another commit:
As you can see, our new commit is connected to the previous one (its parent). You can think of a commit as containing a list of the changes needed to turn the files as they were at the previous commit into the way they are now.
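The parent relationship can be checked from the command line. The following is a minimal sketch under a few assumptions: the commit messages, file contents and the author identity are made up for the example.

```shell
book="$(mktemp -d)"
cd "$book"
git init -q
git config user.name "Author"                # placeholder identity, needed to commit
git config user.email "author@example.com"

echo "Alice" > thanks.txt
echo "Part one, part two" > outline.txt

git add thanks.txt
git commit -q -m "Add list of people to thank"
first="$(git rev-parse HEAD)"                # remember the first commit

git add outline.txt
git commit -q -m "Add outline"

parent="$(git rev-parse HEAD^)"              # HEAD^ means "the parent of HEAD"
echo "first commit:  $first"
echo "parent of new: $parent"                # the same identifier as above

git diff --name-only HEAD^ HEAD              # what changed between the two: outline.txt
```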
Now consider the label reading HEAD next to our newly created commit. HEAD can be seen as a pointer: when a new commit is made, it will be a child of the commit HEAD is pointing to.
As an example, let's say we spelled someone's name wrong. After correcting it, we run git add thanks.txt again, and create a new commit:
The new commit is based on the commit previously labeled HEAD, and the label is then updated to refer to the new commit.
Now let's turn to the label dubbed main. This is called a branch. Like HEAD, branches point to a commit. Unlike HEAD, you can name them yourself. Furthermore, you can have multiple branches.
We can easily create a new branch. Let's call it myBranch:
git branch myBranch
As you can see, there are now two branch labels pointing to the same commit. The current branch is still main though. You can see what this means when we make some more changes and create another commit:
The main branch moved along with HEAD, while myBranch is still pointing to the previous commit. We will see why this is useful in a bit.
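The behaviour described here, with the current branch moving and the other one staying put, can be reproduced with a small sketch. The file contents, commit messages and identity are assumptions; the symbolic-ref line is only there so that the first branch is called main regardless of local defaults.

```shell
book="$(mktemp -d)"
cd "$book"
git init -q
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"

echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"

git branch myBranch                          # a second label on the same commit
git branch -v                                # both branches show the same commit

echo "Bob" >> thanks.txt
git add thanks.txt
git commit -q -m "Thank Bob as well"

git branch -v                                # main has moved on, myBranch has not
```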
We can switch to myBranch, meaning roughly that we will make HEAD point to the same commit as myBranch:
git switch myBranch
As said, a new commit will be the child of the commit HEAD is currently pointing to.
You can now see why they are called branches: with myBranch pointing to a different commit than main, our simple timeline has suddenly evolved into a tree structure.
Since we are still on myBranch, new commits will move it along with HEAD.
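To make the tree structure concrete, here is a sketch in which both branches receive one commit of their own. It assumes a Git version that has git switch; chapter1.txt, the commit messages and the identity are invented for the example.

```shell
book="$(mktemp -d)"
cd "$book"
git init -q
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"

echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"
git branch myBranch

echo "Bob" >> thanks.txt                     # one more commit on main
git add thanks.txt
git commit -q -m "Thank Bob as well"

git switch -q myBranch                       # HEAD now follows myBranch
echo "Once upon a time" > chapter1.txt       # hypothetical new chapter
git add chapter1.txt
git commit -q -m "Start chapter 1"

git log --oneline --graph --all              # draws the two diverging lines of history
fork="$(git merge-base main myBranch)"       # the commit where the branches split
```

The merge-base is the first commit: from there on, each branch has exactly one commit the other lacks.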
Branches are useful because they allow us to work on multiple things in parallel, without those things interfering with each other. For example, you could start work on a new chapter in a branch that only contains changes to that chapter. Now, when your editor is bugging you to "finally submit a manuscript already darnit!" (hypothetically), you can simply switch back to the main branch, perhaps fix up a few typos, and then send it to your editor. All this without having to include a half-finished chapter that wasn't that interesting anyway.
Anyway, let's say we did finish that chapter. That's nice and all, but now it would be nice if we didn't have to redo the small changes we performed earlier in the main branch. Of course Git can help us here: we can merge the state of the files at main's commit with the state of the files in the current branch's latest commit:
git merge main
As you can see, Git has created a new merge commit for us in myBranch. As opposed to regular commits, merge commits have multiple parents. This means that a merge commit contains multiple sets of changes: for each parent, the changes that need to be applied to it to incorporate the changes from the other parent(s).
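The two parents of a merge commit can be inspected directly. The sketch below rebuilds a small diverged history and merges main into myBranch; file names, commit messages and the identity are assumptions, and the -m option is only used so that Git does not open an editor for the merge message.

```shell
book="$(mktemp -d)"
cd "$book"
git init -q
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"

echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"
git branch myBranch

echo "Bob" >> thanks.txt                     # small fix on main
git add thanks.txt
git commit -q -m "Thank Bob as well"

git switch -q myBranch
echo "Once upon a time" > chapter1.txt       # hypothetical chapter on myBranch
git add chapter1.txt
git commit -q -m "Write chapter 1"
before="$(git rev-parse HEAD)"               # tip of myBranch before merging

git merge -q -m "Merge main into myBranch" main

# Prints the merge commit followed by its two parents
git rev-list --parents -n 1 HEAD
```

The first parent (HEAD^1) is the previous tip of myBranch, the second (HEAD^2) is the commit main points to.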
Now is a good time to check whether both sets of changes work well together. In the case of our book, we could check whether our edits to our earlier chapters didn't mess up the layout of our new one.
If everything looks as expected, myBranch now contains the latest complete version of our book. But now imagine that we'd have had another branch with another half-finished chapter. If we were to merge myBranch into this other branch when that chapter is finished, and into another one for another chapter, and so on, then after a while we would lose track of which branch contains the latest version. It is therefore common practice to assign one branch as containing the latest complete version of a project. This branch is commonly called main, so let's stick to that here.
However, if we switch to that branch now:
git switch main
…the chapter we wrote in myBranch is gone. The reason for this is, of course, that while we have merged main into myBranch, we haven't merged myBranch into main yet. This might feel cumbersome, but by first copying everything into myBranch, we had the opportunity to check whether everything still looked fine and dandy after the merge. Since we now know that it did, we can merge it back to main without risking a book with a messed-up layout.
git merge myBranch
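Once again, our merge is shown as another commit. (Strictly speaking, because myBranch already contains everything in main, Git would by default fast-forward here, simply moving main to myBranch's latest commit; a separate merge commit is only created when the merge is run with the --no-ff option.) With such a merge commit, main is now ahead of myBranch, since it includes a commit that is not in myBranch (i.e. the merge commit).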
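One detail worth trying out: when the branch being merged already contains the current branch, Git's default is to fast-forward rather than create a commit. The sketch below therefore passes --no-ff to obtain a merge commit like the one described. File names, messages and the identity are assumptions made for the example.

```shell
book="$(mktemp -d)"
cd "$book"
git init -q
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"

echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"
git branch myBranch

echo "Bob" >> thanks.txt
git add thanks.txt
git commit -q -m "Thank Bob as well"

git switch -q myBranch
echo "Once upon a time" > chapter1.txt       # hypothetical chapter
git add chapter1.txt
git commit -q -m "Write chapter 1"
git merge -q -m "Merge main into myBranch" main

git switch -q main
# main is an ancestor of myBranch, so a plain "git merge myBranch" would fast-forward
git merge-base --is-ancestor main myBranch && echo "plain merge would fast-forward"

git merge -q --no-ff -m "Merge myBranch into main" myBranch   # force a merge commit

git rev-list --count myBranch..main          # commits in main that myBranch lacks
```

Without --no-ff, main and myBranch would simply end up pointing to the same commit, and neither would be ahead of the other.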
The features we've touched upon up to now are already incredibly powerful in allowing you to work on multiple versions of the same project. They enable you to confidently make sweeping changes to your project without fear of losing anything. After all, if you're unsatisfied with your changes, you can always start anew from a commit that does not include those changes.
However, Git has more tricks up its sleeve. A lot more, in fact, but for now we will only focus on how it enables effective collaboration.
As we have seen, you have your own repository, consisting of the project files and a tree representing their history. The same holds true of other team members: each has a separate repository, including the project files in a certain state. None of these is special in itself; however, a team usually decides by convention to have a single central repository that contains the code that will be distributed to the user later on. Much like the main branch should have the latest version of those parts of the project you've completed, the central repository should include the latest completed work by every team member. Let's look at how we can get our completed work into the central repository.
Let's assume that our editor has initialised an empty Git repository at https://publisher.com/book.git. Our first step is to make our local repository aware of this other repository:
git remote add publisher https://publisher.com/book.git
From our local repository's point of view, the other repositories are called remotes. Each remote has a name; we called the one we just added publisher.
In this example, there's no code in the remote yet. Let's change that by pushing our main branch to publisher:
git push publisher main
Several things have now happened. First, all the commits needed to go from an empty repository to the state of our local repository at main have been sent to the publisher remote. Secondly, we have a local reference to the currently known state of the remote's main branch as publisher/main. Finally, our local main branch is matched to publisher/main, roughly meaning that Git knows they are related. (Strictly speaking, Git only records this relation when the push includes the --set-upstream option, or -u for short.)
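The effect of a push can be tried without a real server. In the sketch below, a local bare repository stands in for https://publisher.com/book.git; that substitution, the paths, the file contents and the identity are all assumptions made so the example can run offline.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/publisher.git"     # stand-in for the publisher's empty repository

git init -q "$work/book"
cd "$work/book"
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"

echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"

git remote add publisher "$work/publisher.git"
git push -q publisher main

git branch -r                                # lists the local reference publisher/main
git rev-parse main publisher/main            # both names resolve to the same commit
```

The publisher/main reference only reflects what we learned during our last contact with the remote; it does not update by itself when others push.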
We can simply continue working on our book and create a new commit as usual:
As long as we do not push, our local changes will not be shared with others. While we were making our local changes, however, it is very well possible that our copy editor has proofread our first draft and submitted some corrections to the publisher remote.
To check whether this is the case, let's ask the remote what has happened since the last time we contacted it (i.e. during our push):
git fetch publisher
And indeed, a new commit was added to the remote's main branch. As you can see, while we weren't paying attention, our version of the branch sneakily diverged from the one at our publisher! This means we would not have been able to push the new version of our branch to our remote. This can be solved by merging the remote's work into our own, making the history of the remote branch part of our local branch's history.
Like with regular branches, we can merge the remote branch into our current local branch:
git merge publisher/main
Once again, we can see that a merge commit has been created.
(Note that the above process of a fetch followed by a merge is fairly common. So common, in fact, that there is a single command, git pull, that immediately executes both.)
Since our local branch is now ahead of the remote's, we can push it back to the remote.
git push publisher main
Our copy editor can now fetch the latest version that includes both the corrections he made, and the changes we made.
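The whole round trip of fetch, merge and push can be rehearsed locally. This sketch assumes a local bare repository in place of the publisher's URL and a second clone playing the copy editor; all paths, file contents, messages and identities are invented for the example.

```shell
work="$(mktemp -d)"
git init -q --bare "$work/publisher.git"     # stand-in for the publisher's repository
git --git-dir="$work/publisher.git" symbolic-ref HEAD refs/heads/main

git init -q "$work/book"
cd "$work/book"
git symbolic-ref HEAD refs/heads/main        # make sure the first branch is named main
git config user.name "Author"                # placeholder identity
git config user.email "author@example.com"
echo "Alice" > thanks.txt
git add thanks.txt
git commit -q -m "Add list of people to thank"
git remote add publisher "$work/publisher.git"
git push -q publisher main

# The copy editor clones the draft, corrects a name and pushes
git clone -q "$work/publisher.git" "$work/editor"
(
  cd "$work/editor"
  git config user.name "Copy Editor"         # placeholder identity
  git config user.email "editor@example.com"
  echo "Alice Smith" > thanks.txt
  git commit -q -a -m "Correct a name"
  git push -q origin main
)

# Meanwhile we keep working locally
echo "Once upon a time" > chapter1.txt       # hypothetical chapter
git add chapter1.txt
git commit -q -m "Start chapter 1"

git fetch -q publisher                       # learn about the editor's commit
# Commits only we have, then commits only the remote has
git rev-list --left-right --count main...publisher/main

git merge -q -m "Merge corrections from publisher" publisher/main
git push -q publisher main                   # accepted now that we include the remote's work
```

After the final push, the publisher's main branch contains both the corrected name and the new chapter.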
That wraps up this Git tutorial! I hope it helped you grasp the fundamental concepts, so that you won't feel completely lost when you have to perform more advanced operations in Git. If you're interested, do check out the source code of this tutorial (naturally, in a Git repository), and let me know what you think.