Git tries its darndest to be as hard as possible to use. More than anything, it suffers from too-many-options-itis, and the options themselves are correspondingly confusing, which is often a warning sign for a programming language or tool. But git’s core is good, and it has a lot of features that are quite endearing. I’m not an expert, but I’ll share what I know.
Git is a distributed version control system (DVCS). The core concept of a DVCS is that there is no central repository, and that every user has a copy of the entire repository locally. The second part is simply true of every DVCS. The first part is kind of a lie: yes, you can operate with a bunch of independent nodes, but you’ll get nothing done.
In practice, you’ll end up with a central repository, or quite possibly several of them. But you’ll have something central. This is where you deploy from and how you coordinate with teammates, just like you did with CVS and SVN. So don’t let the “distributed” part freak you out.
Powered by the power of the concept of diff
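From the user’s point of view, git is all about diffs: sets of relative changes (differences) to files, including new and deleted files. Internally git stores snapshots of your files and computes the diffs on demand, but diffs are what you actually work with. It is trivial to have git produce output very similar to that of Unix’s diff command.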
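To make that concrete, here is a minimal sketch in a throwaway repository. The temporary directory, the file name notes.txt, and the identity settings are made up for the example; the git commands themselves are standard.

```shell
# Throwaway repository; path, file name, and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Commit a two-line file.
printf 'one\ntwo\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Change one line, then ask git how the file differs from the last commit.
printf 'one\nTWO\n' > notes.txt
diff_output=$(git diff -- notes.txt)
echo "$diff_output"
```

The output is a unified diff, with the old line prefixed by - and the new line prefixed by +, much like what diff -u would print for two copies of the file.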
This is a powerful metaphor, though. Because git compares content rather than timestamps, you’re freed from worrying about file dates as a dirty marker. That is, you can make a change, save it, come back a week later, undo it by hand, save again, and you no longer have a change to commit. You don’t even have to do any sort of “revert to repository version” action.
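A quick way to see this for yourself, again as a sketch in a scratch repository (the file name greeting.txt and the identity settings are placeholders):

```shell
# Scratch repository; names are placeholders for the example.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

printf 'hello\n' > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# Edit the file: git now reports it as modified.
printf 'goodbye\n' > greeting.txt
after_edit=$(git status --porcelain)
echo "after edit: [$after_edit]"

# Undo the edit by hand. The file was rewritten, but its content
# matches the last commit again, so there is nothing left to commit.
printf 'hello\n' > greeting.txt
after_undo=$(git status --porcelain)
echo "after undo: [$after_undo]"
```

The second status comes back empty even though the file was written to twice since the commit.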
These relative changes can be juggled and reorganized in a variety of ways. You can
- roll back to a certain change — git just undoes the diffs applied.
- push or pull changes to and from other repositories
- collect a bunch of commits and merge them into one commit — git calls that squashing.
- move a bunch of commits between branches — merging or sometimes rebasing (basing your commits on something else — get it?)
- stash a bunch of changes to replay later somewhere else, even if they’ve never been committed anywhere.
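Three of those operations (squashing, rolling back, and stashing) can be sketched in a scratch repository as follows. This is one way to do each, not the only one: the squash here uses git reset --soft rather than an interactive rebase, and the rollback uses git revert, which adds a new commit instead of rewriting history. The file name log.txt and the identity settings are made up.

```shell
# Scratch repository; names are placeholders for the example.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "you@example.com"
git config user.name "Example User"

# Three small commits to play with.
for n in 1 2 3; do
  echo "line $n" >> log.txt
  git add log.txt
  git commit -q -m "Add line $n"
done

# Squash: move the branch back two commits while keeping their
# changes staged, then record them as a single commit.
git reset -q --soft HEAD~2
git commit -q -m "Add lines 2 and 3"
commit_count=$(git rev-list --count HEAD)
echo "commits after squash: $commit_count"

# Roll back: add a new commit that applies the inverse diff of HEAD.
git revert --no-edit HEAD > /dev/null
echo "file after revert: $(cat log.txt)"

# Stash: set uncommitted work aside, then replay it later.
echo "work in progress" >> log.txt
git stash push -q
status_while_stashed=$(git status --porcelain)
git stash pop -q
status_after_pop=$(git status --porcelain)
echo "while stashed: [$status_while_stashed], after pop: [$status_after_pop]"
```

Note that the soft reset rewrites local history, so it is best kept to commits you have not shared yet.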
Day by day
Git is a bit like an onion: it has layers, and lots of them. Specifically, I’m talking about when and where files and changes are tracked. You’ll come across these divisions fairly frequently, so it helps to be familiar with them.
- Untracked and unstaged changes – these are new files and edits sitting in your working directory on the local filesystem. Git can see them, but takes no action until you explicitly stage them. Note that you can change branches out from underneath these changes and they will remain as local changes, unless one of them conflicts with a file that differs on the branch you are switching to.
- Staged files – these are changes ready to commit. Think of these as a changeset, an atomic diff to apply to the branch. You can build this up over time, gathering all the changes before the commit. They will all share the same commit message, as well. These files are in git’s index (also called the staging area), a term you may come across from time to time.
- Changes to your local branch – these are committed, finished changes. They are stored locally, and can even be “uncommitted” if need be. But these changes are not shared with anyone. Since no one else knows about these changes, they are still subject to merges and merge conflicts.
- Changes in the shared repository – these are changes made to a branch remotely. This is where your changes go to live and be safe, just like with CVS and SVN. While you’re doing work at all the other layers, this layer can and likely will change. I recommend you sync with this repo as often as practical, and before you do any commits if you can.
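Here is a minimal sketch of one file moving through those layers. A local bare repository stands in for the shared one; its path, the file name todo.txt, and the identity settings are all hypothetical.

```shell
# A bare repository plays the role of the shared, central repo.
work=$(mktemp -d)
shared="$work/shared.git"
git init -q --bare "$shared"

# git may warn that the clone is empty; that is expected here.
git clone -q "$shared" "$work/clone"
cd "$work/clone" || exit 1
git config user.email "you@example.com"
git config user.name "Example User"

# Layer 1: a new file in the working directory, not yet tracked.
echo "draft" > todo.txt
layer_working=$(git status --porcelain)
echo "working directory: [$layer_working]"

# Layer 2: staged in the index, ready to commit.
git add todo.txt
layer_staged=$(git status --porcelain)
echo "staged: [$layer_staged]"

# Layer 3: committed to the local branch, still private.
git commit -q -m "Add todo"
local_commit=$(git rev-parse HEAD)

# Layer 4: pushed to the shared repository.
branch=$(git symbolic-ref --short HEAD)
git push -q origin HEAD
remote_commit=$(git ls-remote origin "refs/heads/$branch" | cut -f1)
echo "local:  $local_commit"
echo "shared: $remote_commit"
```

In the status output, ?? marks an untracked file and A marks a newly staged one. After the push, the shared repository reports the same commit hash as the local branch.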
The Git Community Book is a great resource for more on git and how to use it. In fact, it almost goes into too much detail on everything, just like Git does. But it is a great cookbook — search for anything git and you’ll likely have a hit there in the top 10 results.
In a future article, I’ll describe how to use git in practical, no-nonsense terms.