Automated Version Control

Overview

Teaching: 15 min
Exercises: 0 min

Questions

What is version control and why should I use it?

Objectives

Understand the benefits of an automated version control system.

Understand the basics of how automated version control systems work.

Version control can keep track of the changes submitted by each collaborator, and when those changes were submitted. Even if you aren’t collaborating with other people, automated version control is much better than this situation:

"Piled Higher and Deeper" by Jorge Cham, http://www.phdcomics.com


This is a common problem: multiple nearly-identical versions of the same document. Popular word processors have features to help with revisions:

Microsoft Word’s Track Changes
Google Docs’ version history
LibreOffice’s Recording and Displaying Changes

The version control software Git (which we will be using in today’s workshop) enables you to create an initial version of a project (an initial commit), and then commit changes incrementally as you modify the files in your project. This is not automatic, but something you do intentionally to mark your progress. Git tracks the state of all files in your project (called a repository) at each commit, and you can view any previous commit whenever you want to see a snapshot of your project at that time (this is called checking out a commit).

Changes Are Saved Sequentially

Changes tracked by a Git repository are separate from the project files. Unlike a Microsoft Word document, the file itself does not know its own history. The Git repository history is stored inside a directory called .git in the folder for your repository. It’s best to simply ignore this directory and not make any changes to the files there.

Different Versions Can be Saved

Multiple users can make changes to the same file in your project, but this can cause merge conflicts if the same part of the file has been edited by multiple users. More on this later.

Multiple Versions Can be Merged

Git is a powerful tool for tracking changes and developing new features in coding projects. Multiple collaborators can create branches to work on a single project simultaneously, and then merge the different versions together. This is a standard practice in modern software development. Repositories can be hosted centrally in places like GitHub so that collaborators can easily sync changes by pulling down the latest commits to their local copy of the repository.

Imagine you drafted an excellent paragraph for a paper you are writing, but later you select the paragraph and delete it. How would you retrieve the excellent version of that paragraph? Is it even possible?

Recovering the excellent version is only possible if you created a copy of the file when that paragraph was still part of your document. Version control only tracks what you commit. If you commit small changes to your files, and commit often, you will have an extremely detailed history of your changes. The number of “snapshots” of your work is equal to the number of commits you make.

Key Points

Version control is like an unlimited ‘undo’.

Version control also allows many people to work in parallel.

Setting Up Git

Overview

Teaching: 30 min
Exercises: 0 min

Questions

How do I get set up to use Git?

Objectives

Install Git

Configure Git the first time it is used on a computer.

Understand the meaning of the --global configuration flag.

It’s time to install Git on your PC, so let’s take a detour over to the setup page and return once we’ve finished installing Git.

Now that Git is installed, we can open a terminal to interface with Git.

To open a terminal in macOS, follow these instructions: https://support.apple.com/guide/terminal/open-or-quit-terminal-apd5265185d-f365-44cb-8b09-71a064a42125/mac

To open a terminal in Windows after installing Git for Windows, press the Windows key on your keyboard, type Git Bash, and press Enter.

For Linux, you will need to refer to the documentation for your Linux distribution.

When we use Git on a new computer for the first time, we need to configure a few things:

our name and email address,
what our preferred text editor is,
and that we want to use these settings globally (i.e. for every project).

On a command line, Git commands are written as git verb options, where verb is what we actually want to do and options is additional optional information which may be needed for the verb. Here is how a user named John Smith might configure Git:

$ git config --global user.name "John Smith"
$ git config --global user.email "john.smith@example.com"

Please use your own name and email address instead of John’s. This user name and email will be associated with your Git activity, which means that any changes pushed to GitHub, BitBucket, GitLab or another Git host server after this lesson will include this information.

For this lesson, we will be using GitHub and so the email address used should be the same as the one you will use when setting up your GitHub account. If you are concerned about privacy, please review GitHub’s instructions for keeping your email address private.

Line Endings

As with other keys, when you press Enter (↵, or Return on a Mac) on your keyboard, your computer encodes this input as a character. Different operating systems use different character(s) to represent the end of a line. (You may also hear these referred to as newlines or line breaks.) Because Git uses these characters to compare files, it may cause unexpected issues when editing a file on different machines. Though it is beyond the scope of this lesson, you can read more about this issue in the Pro Git book.

You can change the way Git recognizes and encodes line endings using the core.autocrlf setting of git config. The following settings are recommended:

On macOS and Linux:
$ git config --global core.autocrlf input
And on Windows:
$ git config --global core.autocrlf false

If you have a favorite text editor, you can change the configuration to use it, following the table below. For today’s workshop, we will use the nano text editor. Copy the configuration command for nano from the table below, and run it in your terminal.

Editor	Configuration command
Atom	`$ git config --global core.editor "atom --wait"`
nano	`$ git config --global core.editor "nano -w"`
BBEdit (Mac, with command line tools)	`$ git config --global core.editor "bbedit -w"`
Sublime Text (Mac)	`$ git config --global core.editor "/Applications/Sublime\ Text.app/Contents/SharedSupport/bin/subl -n -w"`
Sublime Text (Win, 32-bit install)	`$ git config --global core.editor "'c:/program files (x86)/sublime text 3/sublime_text.exe' -w"`
Sublime Text (Win, 64-bit install)	`$ git config --global core.editor "'c:/program files/sublime text 3/sublime_text.exe' -w"`
Notepad (Win)	`$ git config --global core.editor "c:/Windows/System32/notepad.exe"`
Notepad++ (Win, 32-bit install)	`$ git config --global core.editor "'c:/program files (x86)/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"`
Notepad++ (Win, 64-bit install)	`$ git config --global core.editor "'c:/program files/Notepad++/notepad++.exe' -multiInst -notabbar -nosession -noPlugin"`
Kate (Linux)	`$ git config --global core.editor "kate"`
Gedit (Linux)	`$ git config --global core.editor "gedit --wait --new-window"`
Scratch (Linux)	`$ git config --global core.editor "scratch-text-editor"`
Emacs	`$ git config --global core.editor "emacs"`
Vim	`$ git config --global core.editor "vim"`
VS Code	`$ git config --global core.editor "code --wait"`

It is possible to reconfigure the text editor for Git whenever you want to change it.

Exiting Vim

Vim can be confusing and intimidating for new users. To exit without saving changes, press Esc, then type :q! and press Enter (↵, or Return on a Mac). To save your changes and quit, press Esc, then type :wq and press Enter (↵, or Return on a Mac). You do not need to be a Shell power user or Vim expert to complete this workshop; all Vim commands will be provided for you.

Git (2.28+) allows configuration of the name of the branch created when you initialize any new repository. We are going to change the default branch name to main.

$ git config --global init.defaultBranch main

Default Git branch naming

Source file changes are associated with a “branch.” For new learners in this lesson, it’s enough to know that branches exist, and this lesson uses one branch.
By default, Git will create a branch called master when you create a new repository with git init (as explained in the next episode). This term evokes the racist practice of human slavery and the software development community has moved to adopt more inclusive language.

In 2020, most Git code hosting services transitioned to using main as the default branch. As an example, any new repository that is opened in GitHub or GitLab defaults to main. However, Git has not yet made the same change. As a result, local repositories must be manually configured to have the same main branch name as most cloud services.

For versions of Git prior to 2.28, the change can be made on an individual repository level. The command for this is in the next episode. Note that if this value is unset in your local Git configuration, the init.defaultBranch value defaults to master.

The five commands we just ran above only need to be run once: the flag --global tells Git to use the settings for every project, in your user account, on this computer.

You can check your settings at any time:

$ git config --list

You can change your configuration as many times as you want: use the same commands to choose another editor or update your email address.
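
To make the meaning of --global concrete, the sketch below compares a global value with a value set for one repository only. It is an illustration under assumptions, not part of the workshop steps: the helper sgit, the directory demo-repo, and the throwaway HOME are all made up for this example so that your real configuration is never touched.

```shell
# Assumption: a throwaway HOME keeps this sketch away from your real ~/.gitconfig.
sandbox_home="$(mktemp -d)"
touch "$sandbox_home/.gitconfig"

# Hypothetical helper: run git with HOME pointing at the sandbox.
sgit() { HOME="$sandbox_home" git "$@"; }

# A --global setting is stored once per user account.
sgit config --global user.name "John Smith"

# Create a scratch repository and override the name for this project only.
mkdir "$sandbox_home/demo-repo"
cd "$sandbox_home/demo-repo"
sgit init --quiet
sgit config user.name "J. Smith (planets project)"

# Inside the repository, the repository-level value takes precedence.
sgit config user.name
# The global value is unchanged and still applies to every other project.
sgit config --global user.name
```

Leaving out --global, as in the second config command, writes the setting into the current repository's own .git directory instead of your user account.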

Key Points

Use git config with the --global option to configure a user name, email address, editor, and other preferences once per machine.

Creating a Repository

Overview

Teaching: 15 min
Exercises: 0 min

Questions

Where does Git store information?

Objectives

Create a local Git repository.

Describe the purpose of the .git directory.

Once Git is configured, we can start using it.

First, let’s create a new directory in the home directory for our work (mkdir command), and then change the current working directory to the newly created one (cd command):

$ mkdir ~/planets
$ cd ~/planets

Then we tell Git to make planets a repository – a place where Git can store versions of our files:

$ git init

If we use ls to show the directory’s contents, it appears that nothing has changed:

$ ls

But if we add the -a flag to show everything, we can see that Git has created a hidden directory within planets called .git:

$ ls -a

.	..	.git

If we ever delete the .git subdirectory, we will lose the project’s history. It’s best to simply leave that directory alone.

In the setup episode, we set the default branch name to main; see that episode for more information on this change. We can check that everything is set up correctly by asking Git to tell us the status of our project:

$ git status

On branch main

No commits yet

nothing to commit (create/copy files and use "git add" to track)

If you are using a different version of git, the exact wording of the output might be slightly different.
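
If git status reports a branch other than main (for example, because your Git is older than 2.28 and ignores init.defaultBranch), the branch of a brand-new repository can be renamed before the first commit. The following is a minimal sketch of one way to do that with git symbolic-ref; it uses a temporary scratch directory rather than ~/planets, and it is an assumption about how you might proceed, not a required step.

```shell
# Hypothetical scratch repository; the lesson itself uses ~/planets.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet

# Point HEAD at a branch called main. With no commits yet, this simply
# changes the name of the branch that the first commit will create.
git symbolic-ref HEAD refs/heads/main

# Show the branch name Git will now use.
git symbolic-ref --short HEAD
```

Running the same two symbolic-ref commands in a repository that already uses main is harmless, so the sketch is safe to try either way.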

Key Points

git init initializes a repository.

Git stores all of its repository data in the .git directory.

Tracking Changes

Overview

Teaching: 25 min
Exercises: 0 min

Questions

How do I record changes in Git?

How do I check the status of my version control repository?

How do I record notes about what changes I made and why?

Objectives

Go through the modify-add-commit cycle for one or more files.

Explain where information is stored at each stage of that cycle.

Distinguish between descriptive and non-descriptive commit messages.

First let’s make sure we’re still in the right directory. You should be in the planets directory.

$ cd ~/planets

Let’s create a file called mars.txt that contains some notes about the Red Planet’s suitability as a base. We’ll use nano to edit the file:

$ nano mars.txt

Type the text below into the mars.txt file:

Cold and dry, but everything is my favorite color

Notice the commands available in nano below where your typed input is displayed. The “^” symbol stands for the CTRL key. Use CTRL+X to exit; when nano asks whether to save the changes, press Y. Nano may then ask you to confirm the file name. Confirm it as mars.txt by pressing Enter.

Let’s first verify that the file was properly created by running the list command (ls):

$ ls

mars.txt

mars.txt contains a single line, which we can see by running:

$ cat mars.txt

Cold and dry, but everything is my favorite color

If we check the status of our project again, Git tells us that it’s noticed the new file:

$ git status

On branch main

No commits yet

Untracked files:
   (use "git add <file>..." to include in what will be committed)

	mars.txt

nothing added to commit but untracked files present (use "git add" to track)

The “untracked files” message means that there’s a file in the directory that Git isn’t keeping track of. We can tell Git to track a file using git add:

$ git add mars.txt

and then check that the right thing happened:

$ git status

On branch main

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   mars.txt

Git now knows that it’s supposed to keep track of mars.txt, but it hasn’t recorded these changes as a commit yet. To get it to do that, we need to run one more command:

$ git commit -m "Start notes on Mars as a base"

[main (root-commit) f22b25e] Start notes on Mars as a base
 1 file changed, 1 insertion(+)
 create mode 100644 mars.txt

When we run git commit, Git takes everything we have told it to save by using git add and stores a copy permanently inside the special .git directory. This permanent copy is called a commit (or revision) and its short identifier is f22b25e. Your commit may have another identifier.

We use the -m flag (for “message”) to record a short, descriptive, and specific comment that will help us remember later on what we did and why. If we just run git commit without the -m option, Git will launch nano (or whatever other editor we configured as core.editor) so that we can write a longer message.

Good commit messages start with a brief (<50 characters) statement about the changes made in the commit. Generally, the message should complete the sentence “If applied, this commit will”. If you want to go into more detail, add a blank line between the summary line and your additional notes. Use this additional space to explain why you made changes and/or what their impact will be.
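
A summary line plus additional notes can also be written without opening an editor, because each extra -m option becomes its own paragraph in the message. The sketch below assumes a temporary scratch repository and a made-up identity (Example Learner) so that it runs on its own; the wording of the second paragraph is only an example.

```shell
# Hypothetical scratch repository with a local identity, so the sketch is self-contained.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet
git config user.name "Example Learner"
git config user.email "learner@example.com"

echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt

# The first -m is the summary line; the second becomes a separate paragraph
# that explains why the change was made.
git commit --quiet -m "Start notes on Mars as a base" \
    -m "Record first impressions so we can compare candidate planets later."

# Print the summary line and the body separately.
git log -1 --format=%s
git log -1 --format=%b
```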

If we run git status now:

$ git status

On branch main
nothing to commit, working directory clean

it tells us everything is up to date. If we want to know what we’ve done recently, we can ask Git to show us the project’s history using git log:

$ git log

commit f22b25e3233b4645dabd0d81e651fe074bd8e73b
Author: FIRST_NAME LAST_NAME <your_email@example.com>
Date:   Thu Aug 22 09:51:46 2013 -0400

    Start notes on Mars as a base

git log lists all commits made to a repository in reverse chronological order. The listing for each commit includes the commit’s full identifier (which starts with the same characters as the short identifier printed by the git commit command earlier), the commit’s author, when it was created, and the log message Git was given when the commit was created.

Where Are My Changes?

If we run ls at this point, we will still see just one file called mars.txt. That’s because Git saves information about files’ history in the special .git directory mentioned earlier so that our filesystem doesn’t become cluttered (and so that we can’t accidentally edit or delete an old version).

We will add more information to the file. (Again, we’ll edit with nano and then cat the file to show its contents; you may use a different editor, and don’t need to cat.)

$ nano mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans

When we run git status now, it tells us that a file it already knows about has been modified:

$ git status

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

The last line is the key phrase: “no changes added to commit”. We have changed this file, but we haven’t told Git we will want to save those changes (which we do with git add) nor have we saved them (which we do with git commit). So let’s do that now. It is good practice to always review our changes before saving them. We do this using git diff. This shows us the differences between the current state of the file and the most recently saved version:

$ git diff

diff --git a/mars.txt b/mars.txt
index df0654a..315bf3a 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,2 @@
 Cold and dry, but everything is my favorite color
+The two moons would make for interesting tides, if the planet had oceans

The output is cryptic because it is actually a series of commands for tools like editors and patch telling them how to reconstruct one file given the other. If we break it down into pieces:

The first line tells us that Git is producing output similar to the Unix diff command comparing the old and new versions of the file.
The second line tells exactly which versions of the file Git is comparing; df0654a and 315bf3a are unique computer-generated labels for those versions.
The third and fourth lines once again show the name of the file being changed.
The remaining lines are the most interesting: they show us the actual differences and the lines on which they occur. In particular, the + marker in the first column shows where we added a line.

After reviewing our change, it’s time to commit it:

$ git commit -m "Add notes about Mars' moons"

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

	modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

Whoops: Git won’t commit because we didn’t use git add first. Let’s fix that:

$ git add mars.txt
$ git commit -m "Add notes about Mars' moons"

[main 34961b1] Add notes about Mars' moons
 1 file changed, 1 insertion(+)

Git insists that we add files to the set we want to commit before actually committing anything. This allows us to commit our changes in stages and capture changes in logical portions rather than only large batches. For example, suppose we’re adding a few citations to relevant research to our thesis. We might want to commit those additions, and the corresponding bibliography entries, but not commit some of our work drafting the conclusion (which we haven’t finished yet).

To allow for this, Git has a special staging area where it keeps track of things that have been added to the current changeset but not yet committed.

Staging Area

If you think of Git as taking snapshots of changes over the life of a project, git add specifies what will go in a snapshot (putting things in the staging area), and git commit then actually takes the snapshot, and makes a permanent record of it (as a commit). If you don’t have anything staged when you type git commit, Git will prompt you to use git commit -a or git commit --all, which is kind of like gathering everyone to take a group photo! However, it’s almost always better to explicitly add things to the staging area, because you might commit changes you forgot you made. (Going back to the group photo simile, you might get an extra with incomplete makeup walking on the stage for the picture because you used -a!) Try to stage things manually, or you might find yourself searching for “git undo commit” more than you would like!
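
The thesis scenario can be sketched in a few commands: two files are changed, but only one is staged for the next commit. The file names citations.txt and conclusion.txt, their contents, and the identity are hypothetical, and the repository lives in a temporary directory so the sketch does not disturb ~/planets.

```shell
# Hypothetical scratch repository with a local identity.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet
git config user.name "Example Learner"
git config user.email "learner@example.com"

# Two made-up thesis files, committed once so that Git tracks both.
echo "Smith 2010" > citations.txt
echo "Draft conclusion" > conclusion.txt
git add citations.txt conclusion.txt
git commit --quiet -m "Add citations and draft conclusion"

# Change both files, but stage only the finished one.
echo "Jones 2012" >> citations.txt
echo "Unfinished thoughts" >> conclusion.txt
git add citations.txt

# Staged for the next commit:
git diff --staged --name-only
# Changed, but left out of the next commit:
git diff --name-only
```

A git commit at this point would record only the change to citations.txt; the edit to conclusion.txt stays in the working directory until it is added.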

The Git Staging Area

Let’s watch as our changes to a file move from our editor to the staging area and into long-term storage. First, we’ll add another line to the file:

$ nano mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair

$ git diff

diff --git a/mars.txt b/mars.txt
index 315bf3a..b36abfd 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
 Cold and dry, but everything is my favorite color
 The two moons would make for interesting tides, if the planet had oceans
+The lack of humidity is good for my hair

So far, so good: we’ve added one line to the end of the file (shown with a + in the first column). Now let’s put that change in the staging area and see what git diff reports:

$ git add mars.txt
$ git diff

There is no output: as far as Git can tell, there’s no difference between what it’s been asked to save permanently and what’s currently in the directory. However, if we do this:

$ git diff --staged

diff --git a/mars.txt b/mars.txt
index 315bf3a..b36abfd 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,2 +1,3 @@
 Cold and dry, but everything is my favorite color
 The two moons would make for interesting tides, if the planet had oceans
+The lack of humidity is good for my hair

it shows us the difference between the last committed change and what’s in the staging area. Let’s save our changes:

$ git commit -m "Discuss concerns about Mars' climate for my hair"

[main 005937f] Discuss concerns about Mars' climate for my hair
 1 file changed, 1 insertion(+)

check our status:

$ git status

On branch main
nothing to commit, working directory clean

and look at the history of what we’ve done so far:

$ git log

commit 005937fbe2a98fb83f0ade869025dc2636b4dad5 (HEAD -> main)
Author: FIRST_NAME LAST_NAME <your_email@example.com>
Date:   Thu Aug 22 10:14:07 2013 -0400

    Discuss concerns about Mars' climate for my hair

commit 34961b159c27df3b475cfe4415d94a6d1fcd064d
Author: FIRST_NAME LAST_NAME <your_email@example.com>
Date:   Thu Aug 22 10:07:21 2013 -0400

    Add notes about Mars' moons

commit f22b25e3233b4645dabd0d81e651fe074bd8e73b
Author: FIRST_NAME LAST_NAME <your_email@example.com>
Date:   Thu Aug 22 09:51:46 2013 -0400

    Start notes on Mars as a base

Word-based diffing

Sometimes, for example with text documents, a line-wise diff is too coarse. That is where the --color-words option of git diff is very useful, as it highlights the changed words using colors.

Paging the Log

When the output of git log is too long to fit on your screen, Git uses a program to split it into pages of the size of your screen. When this “pager” is called, you will notice that the last line on your screen is a :, instead of your usual prompt.

To get out of the pager, press Q.

To move to the next page, press Spacebar.

To search for some_word in all pages, press / and type some_word. Navigate through matches pressing N.

Directories

There are two important facts you should know about directories in Git.
Git does not track directories, only files within them. If you try to add a directory that is completely empty, it will not be committed. You can place a file called .gitkeep in an empty directory to allow it to be committed. You could name it anything you like, but naming it .gitkeep will let future users know the purpose of the file without having to read it.
If you create a directory in your Git repository and populate it with files, you can add all files in the directory at once by:
git add <directory-with-files>
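
Both facts can be checked in a scratch repository. This is a small sketch under assumptions: the directory name data is made up, and the repository is created in a temporary location.

```shell
# Hypothetical scratch repository.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet

# An empty directory is invisible to Git: status reports nothing.
mkdir data
git status --porcelain

# Add a placeholder file, then stage the whole directory at once.
touch data/.gitkeep
git add data

# The placeholder file (not the directory itself) is what Git tracks.
git ls-files
```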

To recap, when we want to add changes to our repository, we first need to add the changed files to the staging area (git add) and then commit the staged changes to the repository (git commit):

The Git Commit Workflow

Key Points

git status shows the status of a repository.

Files can be stored in a project’s working directory (which users see), the staging area (where the next commit is being built up) and the local repository (where commits are permanently recorded).

git add puts files in the staging area.

git commit saves the staged content as a new commit in the local repository.

Write a commit message that accurately describes your changes.

Exploring History

Overview

Teaching: 25 min
Exercises: 0 min

Questions

How can I identify old versions of files?

How do I review my changes?

How can I recover old versions of files?

Objectives

Explain what the HEAD of a repository is and how to use it.

Identify and use Git commit numbers.

Compare various versions of tracked files.

Restore old versions of files.

As we saw in the previous episode, we can refer to commits by their identifiers. You can refer to the most recent commit of the working directory by using the identifier HEAD.

We’ve been adding one line at a time to mars.txt, so it’s easy to track our progress by comparing versions; let’s do that using HEAD. Before we start, let’s make a change to mars.txt, adding yet another line.

$ nano mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
An ill-considered change

Now, let’s see what we get.

$ git diff HEAD mars.txt

diff --git a/mars.txt b/mars.txt
index b36abfd..0848c8d 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1,3 +1,4 @@
 Cold and dry, but everything is my favorite color
 The two moons would make for interesting tides, if the planet had oceans
 The lack of humidity is good for my hair
+An ill-considered change

which is the same as what you would get if you leave out HEAD (try it). The real goodness in all this is when you can refer to previous commits. We do that by adding ~1 (where “~” is “tilde”, pronounced [til-duh]) to refer to the commit one before HEAD.

$ git diff HEAD~1 mars.txt

If we want to see the differences between older commits we can use git diff again, but with the notation HEAD~1, HEAD~2, and so on, to refer to them:

$ git diff HEAD~2 mars.txt

diff --git a/mars.txt b/mars.txt
index df0654a..b36abfd 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,4 @@
 Cold and dry, but everything is my favorite color
+The two moons would make for interesting tides, if the planet had oceans
+The lack of humidity is good for my hair
+An ill-considered change

We could also use git show which shows us what changes we made at an older commit as well as the commit message, rather than the differences between a commit and our working directory that we see by using git diff.

$ git show HEAD~2 mars.txt

commit f22b25e3233b4645dabd0d81e651fe074bd8e73b
Author: FIRST_NAME LAST_NAME <your_email@example.com>
Date:   Thu Aug 22 09:51:46 2013 -0400

    Start notes on Mars as a base

diff --git a/mars.txt b/mars.txt
new file mode 100644
index 0000000..df0654a
--- /dev/null
+++ b/mars.txt
@@ -0,0 +1 @@
+Cold and dry, but everything is my favorite color

In this way, we can build up a chain of commits. The most recent end of the chain is referred to as HEAD; we can refer to previous commits using the ~ notation, so HEAD~1 means “the previous commit”, while HEAD~123 goes back 123 commits from where we are now.
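
The ~ notation can be tried out on a small chain of commits. The sketch below assumes a temporary scratch repository, a made-up identity, and a hypothetical file notes.txt; it prints the commit message found at HEAD, HEAD~1, and HEAD~2.

```shell
# Hypothetical scratch repository with a local identity.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet
git config user.name "Example Learner"
git config user.email "learner@example.com"

# Build a chain of three commits, each adding one line to notes.txt.
for note in first second third; do
    echo "$note line" >> notes.txt
    git add notes.txt
    git commit --quiet -m "Add $note line"
done

# Walk back along the chain from the most recent commit.
git log -1 --format=%s HEAD      # Add third line
git log -1 --format=%s HEAD~1    # Add second line
git log -1 --format=%s HEAD~2    # Add first line
```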

We can also refer to commits using those long strings of digits and letters that git log displays. These are unique IDs for the changes, and “unique” really does mean unique: every change to any set of files on any computer has a unique 40-character identifier. Our first commit was given the ID f22b25e3233b4645dabd0d81e651fe074bd8e73b, so let’s try this:

$ git diff f22b25e3233b4645dabd0d81e651fe074bd8e73b mars.txt

diff --git a/mars.txt b/mars.txt
index df0654a..93a3e13 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,4 @@
 Cold and dry, but everything is my favorite color
+The two moons would make for interesting tides, if the planet had oceans
+The lack of humidity is good for my hair
+An ill-considered change

That’s the right answer, but typing out random 40-character strings is annoying, so Git lets us use just the first few characters (typically seven for normal size projects):

$ git diff f22b25e mars.txt

diff --git a/mars.txt b/mars.txt
index df0654a..93a3e13 100644
--- a/mars.txt
+++ b/mars.txt
@@ -1 +1,4 @@
 Cold and dry, but everything is my favorite color
+The two moons would make for interesting tides, if the planet had oceans
+The lack of humidity is good for my hair
+An ill-considered change

All right! So we can save changes to files and see what we’ve changed. Now, how can we restore older versions of things? Let’s suppose we change our mind about the last update to mars.txt (the “ill-considered change”).

git status now tells us that the file has been changed, but those changes haven’t been staged:

$ git status

On branch main
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

    modified:   mars.txt

no changes added to commit (use "git add" and/or "git commit -a")

We can put things back the way they were by using git checkout:

$ git checkout HEAD mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair

As you might guess from its name, git checkout checks out (i.e., restores) an old version of a file. In this case, we’re telling Git that we want to recover the version of the file recorded in HEAD, which is the last saved commit. If we want to go back even further, we can use a commit identifier instead:

$ git checkout f22b25e mars.txt

$ cat mars.txt

Cold and dry, but everything is my favorite color

$ git status

On branch main
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

    modified:   mars.txt

Notice that the changes are currently in the staging area. Again, we can put things back the way they were by using git checkout:

$ git checkout HEAD mars.txt
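
The restore step can be rehearsed safely outside ~/planets. The following sketch assumes a temporary scratch repository and a made-up identity; it makes an unwanted edit and then discards it by checking out the version of the file recorded in HEAD. The -- only separates the commit from the file names and is optional here.

```shell
# Hypothetical scratch repository with a local identity.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet
git config user.name "Example Learner"
git config user.email "learner@example.com"

echo "Cold and dry, but everything is my favorite color" > mars.txt
git add mars.txt
git commit --quiet -m "Start notes on Mars as a base"

# Make an edit we will regret.
echo "An ill-considered change" >> mars.txt

# Discard the uncommitted edit by restoring the version recorded in HEAD.
git checkout HEAD -- mars.txt
cat mars.txt
```

Note that the discarded edit was never committed, so it cannot be recovered afterwards.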

Don’t Lose Your HEAD

Above we used
$ git checkout f22b25e mars.txt
to revert mars.txt to its state after the commit f22b25e. But be careful! The command checkout has other important functionalities, and Git will misunderstand your intentions if you are not accurate with the typing. For example, if you forget mars.txt in the previous command, you get this:
$ git checkout f22b25e
Note: checking out 'f22b25e'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

 git checkout -b <new-branch-name>

HEAD is now at f22b25e Start notes on Mars as a base
The “detached HEAD” is like “look, but don’t touch” here, so you shouldn’t make any changes in this state. After investigating your repo’s past state, reattach your HEAD with git checkout main.
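
Detaching and reattaching HEAD can be observed in a scratch repository. This sketch assumes a temporary directory, a made-up identity, and a hypothetical file notes.txt; it uses git symbolic-ref only as one way to ask whether HEAD currently points at a branch.

```shell
# Hypothetical scratch repository whose branch is named main.
scratch="$(mktemp -d)"
cd "$scratch"
git init --quiet
git symbolic-ref HEAD refs/heads/main
git config user.name "Example Learner"
git config user.email "learner@example.com"

echo "first" > notes.txt
git add notes.txt
git commit --quiet -m "First commit"
echo "second" >> notes.txt
git add notes.txt
git commit --quiet -m "Second commit"

# Checking out a commit without naming a file detaches HEAD.
git checkout --quiet HEAD~1
git symbolic-ref --quiet HEAD || echo "HEAD is detached"

# Reattach HEAD to the branch; the newest version of the file comes back.
git checkout --quiet main
git symbolic-ref --short HEAD
```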

It’s important to remember that we must use the commit number that identifies the state of the repository before the change we’re trying to undo. A common mistake is to use the number of the commit in which we made the change we’re trying to discard. In the example below, we want to retrieve the state from before the most recent commit (HEAD~1), which is commit f22b25e:

Git Checkout

So, to put it all together, here’s how Git works in cartoon form:

Simplifying the Common Case

If you read the output of git status carefully, you’ll see that it includes this hint:
(use "git checkout -- <file>..." to discard changes in working directory)
As it says, git checkout without a version identifier restores files to the state saved in HEAD. The double dash -- is needed to separate the names of the files being recovered from the command itself: without it, Git would try to use the name of the file as the commit identifier.

The fact that files can be reverted one by one tends to change the way people organize their work. If everything is in one large document, it’s hard (but not impossible) to undo changes to the introduction without also undoing changes made later to the conclusion. If the introduction and conclusion are stored in separate files, on the other hand, moving backward and forward in time becomes much easier.

Key Points

git diff displays differences between commits.

git checkout recovers old versions of files.

Ignoring Things

Overview

Teaching: 25 min
Exercises: 0 min

Questions

How can I tell Git to ignore files I don’t want to track?

Objectives

Configure Git to ignore specific files.

Explain why ignoring files can be useful.

What if we have files that we do not want Git to track for us, like backup files created by our editor or intermediate files created during data analysis? Let’s create a few dummy files:

$ mkdir results
$ touch a.dat b.dat c.dat results/a.out results/b.out

and see what Git says:

$ git status

On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	a.dat
	b.dat
	c.dat
	results/

nothing added to commit but untracked files present (use "git add" to track)

Putting these files under version control would be a waste of disk space. What’s worse, having them all listed could distract us from changes that actually matter, so let’s tell Git to ignore them.

We do this by creating a file in the root directory of our project called .gitignore:

$ nano .gitignore
$ cat .gitignore

*.dat
results/

These patterns tell Git to ignore any file whose name ends in .dat and everything in the results directory. (If any of these files were already being tracked, Git would continue to track them.)
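
If it is ever unclear why a file is being ignored, git check-ignore can report the pattern responsible. The sketch below recreates the dummy files in a scratch directory; notes.txt is a hypothetical extra file added here only to show a path that is not ignored.

```shell
#!/usr/bin/env bash
# Sketch: ask Git which .gitignore pattern (if any) matches a path.
set -eu

demo=$(mktemp -d)                       # scratch directory (hypothetical)
cd "$demo"
git init -q
mkdir results
touch a.dat b.dat c.dat results/a.out results/b.out notes.txt
printf '%s\n' '*.dat' 'results/' > .gitignore

# -v prints the ignore file, the line number, and the pattern that matched.
git check-ignore -v a.dat results/a.out

# A path that is not ignored produces no output and a non-zero exit status.
if git check-ignore -q notes.txt; then
    notes_ignored=yes
else
    notes_ignored=no
fi
echo "notes.txt ignored: $notes_ignored"
```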

Once we have created this file, the output of git status is much cleaner:

$ git status

On branch main
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	.gitignore

nothing added to commit but untracked files present (use "git add" to track)

The only thing Git notices now is the newly-created .gitignore file. You might think we wouldn’t want to track it, but everyone we’re sharing our repository with will probably want to ignore the same things that we’re ignoring. Let’s add and commit .gitignore:

$ git add .gitignore
$ git commit -m "Ignore data files and the results folder."
$ git status

On branch main
nothing to commit, working directory clean

As a bonus, using .gitignore helps us avoid accidentally adding files to the repository that we don’t want to track:

$ git add a.dat

The following paths are ignored by one of your .gitignore files:
a.dat
Use -f if you really want to add them.

If we really want to override our ignore settings, we can use git add -f to force Git to add something. For example, git add -f a.dat. We can also always see the status of ignored files if we want:

$ git status --ignored

On branch main
Ignored files:
 (use "git add -f <file>..." to include in what will be committed)

        a.dat
        b.dat
        c.dat
        results/

nothing to commit, working directory clean

Key Points

The .gitignore file tells Git what files to ignore.

Remotes in GitHub

Overview

Teaching: 40 min
Exercises: 0 min

Questions

How do I share my changes with others on the web?

Objectives

Explain what remote repositories are and why they are useful.

Push to or pull from a remote repository.

Version control really comes into its own when we begin to collaborate with other people. We already have most of the machinery we need to do this; the only thing missing is to copy changes from one repository to another.

Systems like Git allow us to move work between any two repositories. In practice, though, it’s easiest to use one copy as a central hub, and to keep it on the web rather than on someone’s laptop. Most programmers use hosting services like GitHub, Bitbucket or GitLab to hold those main copies; we’ll explore the pros and cons of this in a later episode.

Let’s start by sharing the changes we’ve made to our current project with the world. To this end we are going to create a remote repository that will be linked to our local repository.

1. Create a remote repository

Creating a Repository on GitHub (Step 1)

Name your repository “planets” and then click “Create Repository”.

Note: Since this repository will be connected to a local repository, it needs to be empty. Leave “Initialize this repository with a README” unchecked, and keep “None” as options for both “Add .gitignore” and “Add a license.” See the “GitHub License and README files” exercise below for a full explanation of why the repository needs to be empty.

Creating a Repository on GitHub (Step 2)

As soon as the repository is created, GitHub displays a page with a URL and some information on how to configure your local repository:

Creating a Repository on GitHub (Step 3)

This effectively does the following on GitHub’s servers:

$ mkdir planets
$ cd planets
$ git init

If you remember back to the earlier episode where we added and committed our earlier work on mars.txt, we had a diagram of the local repository which looked like this:

The Local Repository with Git Staging Area

Now that we have two repositories, we need a diagram like this:

Freshly-Made GitHub Repository

Note that our local repository still contains our earlier work on mars.txt, but the remote repository on GitHub appears empty as it doesn’t contain any files yet.

2. Connect local to remote repository

Now we connect the two repositories. We do this by making the GitHub repository a remote for the local repository. The home page of the repository on GitHub includes the URL string we need to identify it:

Where to Find Repository URL on GitHub

Click on the ‘SSH’ link to change the protocol from HTTPS to SSH.

HTTPS vs. SSH

We use SSH here because, while it requires some additional configuration, it is a security protocol widely used by many applications. The steps below describe the minimum SSH setup needed for GitHub. A supplemental episode to this lesson covers advanced setup, the concepts behind SSH and key pairs, and other Git-related SSH material.

Changing the Repository URL on GitHub

Copy that URL from the browser, go into the local planets repository, and run this command:

$ git remote add origin git@github.com:YOUR_USER/planets.git

Make sure to use the URL for your repository rather than the example provided: the only difference should be your username instead of YOUR_USER.

origin is a local name used to refer to the remote repository. It could be called anything, but origin is a convention that is often used by default in Git and GitHub, so it’s helpful to stick with this unless there’s a reason not to.

We can check that the command has worked by running git remote -v:

$ git remote -v

origin   git@github.com:YOUR_USER/planets.git (fetch)
origin   git@github.com:YOUR_USER/planets.git (push)
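
To see that a remote is just a named URL stored in the local repository's configuration, the sketch below uses a bare repository on disk as a stand-in for the empty GitHub repository. The paths are hypothetical and no network access is involved.

```shell
#!/usr/bin/env bash
# Sketch: a remote is a local name (here: origin) bound to a URL.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)
git init -q --bare "$work/hub.git"      # stand-in for the empty GitHub repo
git init -q "$work/planets"             # stand-in for the local repository
cd "$work/planets"

git remote add origin "$work/hub.git"
git remote -v                           # shows the fetch and push URLs

# The same information is stored as an ordinary configuration entry.
origin_url=$(git config --get remote.origin.url)
echo "$origin_url"
```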

We’ll discuss remotes in more detail in the next episode, while talking about how they might be used for collaboration.

3. SSH Background and Setup

Before you can connect to a remote repository, you need to set up a way for your machine to authenticate with GitHub.

We are going to set up the method that is commonly used by many different services to authenticate access on the command line. This method is called Secure Shell Protocol (SSH). SSH is a cryptographic network protocol that allows secure communication between computers using an otherwise insecure network.

SSH uses what is called a key pair: two keys that work together to validate access. One key, the public key, can be shared openly; the other, the private key, is kept secret.

You can think of the public key as a padlock, and only you have the key (the private key) to open it. You use the public key where you want a secure method of communication, such as your GitHub account.

What we will do now is the minimum required to set up the SSH keys and add the public key to a GitHub account.

Keeping your keys secure

Don’t forget about your SSH keys once they are set up, since they keep your account secure. It’s good practice to audit them every so often, especially if you use multiple computers to access your account.

We will run the list command to check what key pairs already exist on your computer.

$ ls -al ~/.ssh

Your output is going to look a little different depending on whether or not SSH has ever been set up on the computer you are using.

Since your user account on the lab server is brand-new, your output will be something like this:

ls: cannot access '/c/Users/YOUR USER/.ssh': No such file or directory

If SSH has been set up on the computer you’re using, the public and private key pairs will be listed. The file names are either id_ed25519/id_ed25519.pub or id_rsa/id_rsa.pub depending on how the key pairs were set up.
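
The same check can be scripted. This is only a sketch: check_keys is a hypothetical helper name, and it looks only for the two default file names mentioned above.

```shell
#!/usr/bin/env bash
# Sketch: report which of the default key pairs exist in a directory.
# check_keys is a made-up helper, not part of SSH or Git.
check_keys() {
    dir=$1
    for name in id_ed25519 id_rsa; do
        # A usable pair needs both the private key and the matching .pub file.
        if [ -f "$dir/$name" ] && [ -f "$dir/$name.pub" ]; then
            echo "found key pair: $name"
        fi
    done
}

# Prints nothing if no default key pair has been created yet.
check_keys "$HOME/.ssh"
```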

3.1 Create an SSH key pair

To create an SSH key pair, use this command, where the -t option specifies which type of algorithm to use and -C attaches a comment to the key (here, an email address):

$ ssh-keygen -t ed25519 -C "YOUR_EMAIL_ADDRESS"

If you are using a legacy system that doesn’t support the Ed25519 algorithm, use:

$ ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

Generating public/private ed25519 key pair.
Enter file in which to save the key (/c/Users/YOUR USER/.ssh/id_ed25519):

We want to use the default file, so just press Enter.

Created directory '/c/Users/YOUR USER/.ssh'.
Enter passphrase (empty for no passphrase):

Now, it is prompting you for a passphrase. Since you are using a shared lab environment, you should use a passphrase. Be sure to use something memorable or save your passphrase somewhere, as there is no “reset my password” option.

Enter same passphrase again:

After entering the same passphrase a second time, we receive the confirmation:

Your identification has been saved in /c/Users/YOUR USER/.ssh/id_ed25519
Your public key has been saved in /c/Users/YOUR USER/.ssh/id_ed25519.pub
The key fingerprint is:
SHA256:SMSPIStNyA00KPxuYu94KpZgRAYjgt9g4BA4kFy3g1o YOUR_EMAIL
The key's randomart image is:
+--[ED25519 256]--+
|^B== o.          |
|%*=.*.+          |
|+=.E =.+         |
| .=.+.o..        |
|....  . S        |
|.+ o             |
|+ =              |
|.o.o             |
|oo+.             |
+----[SHA256]-----+

The “identification” is actually the private key. You should never share it. The public key is appropriately named. The “key fingerprint” is a short hash of the public key, which is easier to compare than the key itself.

Now that we have generated the SSH keys, we will find the SSH files when we check.

$ ls -al ~/.ssh

drwxr-xr-x 1 YOUR USER 197121   0 Jul 16 14:48 ./
drwxr-xr-x 1 YOUR USER 197121   0 Jul 16 14:48 ../
-rw-r--r-- 1 YOUR USER 197121 419 Jul 16 14:48 id_ed25519
-rw-r--r-- 1 YOUR USER 197121 106 Jul 16 14:48 id_ed25519.pub

3.2 Copy the public key to GitHub

Now we have an SSH key pair and we can run this command to check if GitHub can read our authentication.

$ ssh -T git@github.com

The authenticity of host 'github.com (192.30.255.112)' can't be established.
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? y
Please type 'yes', 'no' or the fingerprint: yes
Warning: Permanently added 'github.com' (RSA) to the list of known hosts.
git@github.com: Permission denied (publickey).

Right, we forgot that we need to give GitHub our public key!

First, we need to copy the public key. Be sure to include the .pub at the end, otherwise you’re looking at the private key.

$ cat ~/.ssh/id_ed25519.pub

ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIDmRA3d51X0uu9wXek559gfn6UFNF69yZjChyBIU2qKI YOUR_EMAIL

Now, go to GitHub.com and click on your profile icon in the top right corner to open the drop-down menu. Click “Settings”, then click “SSH and GPG keys” in the “Account settings” menu on the left side of the settings page. Click the “New SSH key” button on the right side. Now you can add a title, paste your public key into the field, and click “Add SSH key” to complete the setup.

Now that we’ve set that up, let’s check our authentication again from the command line.

$ ssh -T git@github.com

Hi YOUR_USER! You've successfully authenticated, but GitHub does not provide shell access.

Good! This output confirms that the SSH key works as intended. We are now ready to push our work to the remote repository.

4. Push local changes to a remote

Now that authentication is set up, we can return to the remote. This command will push the changes from our local repository to the repository on GitHub:

$ git push origin main

Since you set up a passphrase, it will prompt for it. If you completed advanced settings for your authentication, it will not prompt for a passphrase.

Enumerating objects: 16, done.
Counting objects: 100% (16/16), done.
Delta compression using up to 8 threads.
Compressing objects: 100% (11/11), done.
Writing objects: 100% (16/16), 1.45 KiB | 372.00 KiB/s, done.
Total 16 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), done.
To https://github.com/YOUR_USER/planets.git
 * [new branch]      main -> main

Proxy

If the network you are connected to uses a proxy, there is a chance that your last command failed with “Could not resolve hostname” as the error message. To solve this issue, you need to tell Git about the proxy:
$ git config --global http.proxy http://user:password@proxy.url
$ git config --global https.proxy https://user:password@proxy.url
When you connect to another network that doesn’t use a proxy, you will need to tell Git to disable the proxy using:
$ git config --global --unset http.proxy
$ git config --global --unset https.proxy

Password Managers

If your operating system has a password manager configured, git push will try to use it when it needs your username and password. For example, this is the default behavior for Git Bash on Windows. If you want to type your username and password at the terminal instead of using a password manager, type:
$ unset SSH_ASKPASS
in the terminal, before you run git push. Despite the name, Git uses SSH_ASKPASS for all credential entry, so you may want to unset SSH_ASKPASS whether you are using Git via SSH or https.

You may also want to add unset SSH_ASKPASS at the end of your ~/.bashrc to make Git default to using the terminal for usernames and passwords.

Our local and remote repositories are now in this state:

GitHub Repository After First Push

The ‘-u’ Flag

You may see a -u option used with git push in some documentation. This option is synonymous with the --set-upstream-to option for the git branch command, and is used to associate the current branch with a remote branch so that the git pull command can be used without any arguments. To do this, simply use git push -u origin main once the remote has been set up.
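
A small sketch of what -u changes, again using a bare repository on disk as a stand-in for GitHub. The paths, identity, and file contents are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: "git push -u" records which remote branch the local branch tracks,
# so that later pulls need no arguments.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)
git init -q --bare "$work/hub.git"      # stand-in for the GitHub repository
git init -q "$work/planets"
cd "$work/planets"
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"        # placeholder identity, this repo only
git config user.email "demo@example.com"

echo "Cold and dry" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars"
git remote add origin "$work/hub.git"

git push -q -u origin main              # push and set the upstream branch

# The local main branch now tracks origin/main ...
upstream=$(git rev-parse --abbrev-ref 'main@{upstream}')
echo "main tracks: $upstream"

# ... so git pull works without naming the remote or the branch.
git pull -q
```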

We can pull changes from the remote repository to the local one as well:

$ git pull origin main

From https://github.com/YOUR_USER/planets
 * branch            main     -> FETCH_HEAD
Already up-to-date.

Pulling has no effect in this case because the two repositories are already synchronized. If someone else had pushed some changes to the repository on GitHub, though, this command would download them to our local repository.

Key Points

A local Git repository can be connected to one or more remote repositories.

Use the SSH protocol to connect to remote repositories.

git push copies changes from a local repository to a remote repository.

git pull copies changes from a remote repository to a local repository.

Collaborating

Overview

Teaching: 25 min
Exercises: 0 min

Questions

How can I use version control to collaborate with other people?

Objectives

Clone a remote repository.

Collaborate by pushing to a common repository.

Describe the basic collaborative workflow.

For the next step, get into pairs. One person will be the “Owner” and the other will be the “Collaborator”. The goal is for the Collaborator to add changes to the Owner’s repository. We will switch roles at the end, so both people will play Owner and Collaborator.

Practicing By Yourself

If you’re working through this lesson on your own, you can carry on by opening a second terminal window. This window will represent your partner, working on another computer. You won’t need to give anyone access on GitHub, because both ‘partners’ are you.

The Owner needs to give the Collaborator access. On GitHub, click the “Settings” button on the right, select “Collaborators”, click “Add people”, and then enter your partner’s username.

Adding Collaborators on GitHub

To accept access to the Owner’s repo, the Collaborator needs to go to https://github.com/notifications or check for email notification. Once there she can accept access to the Owner’s repo.

Next, the Collaborator needs to download a copy of the Owner’s repository to her machine. This is called “cloning a repo”.

The Collaborator doesn’t want to overwrite her own version of planets.git, so she needs to clone the Owner’s repository to a different location than her own repository with the same name.

To clone the Owner’s repo into her Desktop folder, the Collaborator enters:

$ git clone git@github.com:other_user/planets.git ~/Desktop/other_user-planets

Replace ‘other_user’ with the Owner’s username.

If you choose to clone without the clone path (~/Desktop/other_user-planets) specified at the end, Git creates the clone in a folder called planets under your current directory, so you may end up cloning inside your own planets folder! In that case, make sure to navigate to the Desktop folder first.

After Creating Clone of Repository

The Collaborator can now make a change in her clone of the Owner’s repository, exactly the same way as we’ve been doing before:

$ cd ~/Desktop/other_user-planets
$ nano pluto.txt
$ cat pluto.txt

It is so a planet!

$ git add pluto.txt
$ git commit -m "Add notes about Pluto"

 1 file changed, 1 insertion(+)
 create mode 100644 pluto.txt

Then push the change to the Owner’s repository on GitHub:

$ git push origin main

Enumerating objects: 4, done.
Counting objects: 4, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (2/2), done.
Writing objects: 100% (3/3), 306 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/other_user/planets.git
   9272da5..29aba7c  main -> main

Note that we didn’t have to create a remote called origin: Git uses this name by default when we clone a repository. (This is why origin was a sensible choice earlier when we were setting up remotes by hand.)
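
This can be checked without GitHub. In the sketch below, a bare repository on disk plays the role of the Owner's GitHub repository; all paths, identities, and file contents are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: git clone sets up a remote called origin automatically.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)

# A bare repository stands in for the Owner's repository on GitHub.
git init -q --bare "$work/hub.git"
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/main

# The Owner publishes one commit to it.
git init -q "$work/owner"
cd "$work/owner"
git symbolic-ref HEAD refs/heads/main
git config user.name "Owner"            # placeholder identity
git config user.email "owner@example.com"
echo "Cold and dry" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars"
git push -q "$work/hub.git" main

# The Collaborator clones it; no "git remote add" is needed afterwards.
git clone -q "$work/hub.git" "$work/collaborator"
cd "$work/collaborator"
remotes=$(git remote)
echo "$remotes"
```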

Take a look at the Owner’s repository on GitHub again, and you should be able to see the new commit made by the Collaborator. You may need to refresh your browser to see the new commit.

Some more about remotes

In this episode and the previous one, our local repository has had a single “remote”, called origin. A remote is a copy of the repository that is hosted somewhere else, that we can push to and pull from, and there’s no reason that you have to work with only one. For example, on some large projects you might have your own copy in your own GitHub account (you’d probably call this origin) and also the main “upstream” project repository (let’s call this upstream for the sake of examples). You would pull from upstream from time to time to get the latest updates that other people have committed.

Remember that the name you give to a remote only exists locally. It’s an alias that you choose - whether origin, or upstream, or fred - and not something intrinsic to the remote repository.

The git remote family of commands is used to set up and alter the remotes associated with a repository. Here are some of the most useful ones:

git remote -v lists all the remotes that are configured (we already used this in the last episode)

git remote add [name] [url] is used to add a new remote

git remote remove [name] removes a remote. Note that it doesn’t affect the remote repository at all - it just removes the link to it from the local repo.

git remote set-url [name] [newurl] changes the URL that is associated with the remote. This is useful if it has moved, e.g. to a different GitHub account, or from GitHub to a different hosting service. Or, if we made a typo when adding it!

git remote rename [oldname] [newname] changes the local alias by which a remote is known - its name. For example, one could use this to change upstream to fred.
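
The commands above can be tried safely in a scratch repository, since none of them contacts the remote. In this sketch the URLs under EXAMPLE_ORG are made-up placeholders, not real repositories.

```shell
#!/usr/bin/env bash
# Sketch: add, rename, re-point, and remove a remote. These commands only
# edit local configuration; the remote repository is never contacted.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)
git init -q "$work/planets"
cd "$work/planets"

# Placeholder URLs, used only as text.
git remote add upstream git@github.com:EXAMPLE_ORG/planets.git
git remote rename upstream fred
git remote set-url fred git@github.com:EXAMPLE_ORG/planets-moved.git

git remote -v                           # fred now points at the new URL
fred_url=$(git config --get remote.fred.url)

git remote remove fred                  # drops the local link only
remaining=$(git remote)
echo "remotes left: ${remaining:-none}"
```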

To download the Collaborator’s changes from GitHub, the Owner now enters:

$ git pull origin main

remote: Enumerating objects: 4, done.
remote: Counting objects: 100% (4/4), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 3 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/other_user/planets
 * branch            main     -> FETCH_HEAD
   9272da5..29aba7c  main     -> origin/main
Updating 9272da5..29aba7c
Fast-forward
 pluto.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 pluto.txt

Now the three repositories (Owner’s local, Collaborator’s local, and Owner’s on GitHub) are back in sync.

Key Points

git clone copies a remote repository to create a local repository with a remote called origin automatically set up.

Conflicts

Overview

Teaching: 15 min
Exercises: 0 min

Questions

What do I do when my changes conflict with someone else’s?

Objectives

Explain what conflicts are and when they can occur.

Resolve conflicts resulting from a merge.

As soon as people can work in parallel, they’ll likely step on each other’s toes. This will even happen with a single person: if we are working on a piece of software on both our laptop and a server in the lab, we could make different changes to each copy. Version control helps us manage these conflicts by giving us tools to resolve overlapping changes.

To see how we can resolve conflicts, we must first create one. The file mars.txt currently looks like this in both partners’ copies of our planets repository:

$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair

Let’s add a line to the collaborator’s copy only:

$ nano mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
This line added to collaborator's copy

and then push the change to GitHub:

$ git add mars.txt
$ git commit -m "Add a line in collaborator's copy"

[main 5ae9631] Add a line in collaborator's copy
 1 file changed, 1 insertion(+)

$ git push origin main

Enumerating objects: 5, done.
Counting objects: 100% (5/5), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 331 bytes | 331.00 KiB/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To https://github.com/OWNER/planets.git
   29aba7c..dabb4c8  main -> main

Now let’s have the owner make a different change to their copy without updating from GitHub:

$ nano mars.txt
$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
This is a line added to the owner's copy

We can commit the change locally:

$ git add mars.txt
$ git commit -m "Add a line in the owner's copy"

[main 07ebc69] Add a line in the owner's copy
 1 file changed, 1 insertion(+)

but Git won’t let us push it to GitHub:

$ git push origin main

To https://github.com/OWNER/planets.git
 ! [rejected]        main -> main (fetch first)
error: failed to push some refs to 'https://github.com/OWNER/planets.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.

The Conflicting Changes

Git rejects the push because it detects that the remote repository has new updates that have not been incorporated into the local branch. What we have to do is pull the changes from GitHub, merge them into the copy we’re currently working in, and then push that. Let’s start by pulling:

$ git pull origin main --no-rebase

remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 2), reused 3 (delta 2), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/OWNER/planets
 * branch            main     -> FETCH_HEAD
    29aba7c..dabb4c8  main     -> origin/main
Auto-merging mars.txt
CONFLICT (content): Merge conflict in mars.txt
Automatic merge failed; fix conflicts and then commit the result.

The git pull command updates the local repository to include those changes already included in the remote repository. After the changes from the remote branch have been fetched, Git detects that changes made to the local copy overlap with those made to the remote repository, and therefore refuses to merge the two versions automatically, so that we don’t trample on our previous work. The conflict is marked in the affected file:

$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
<<<<<<< HEAD
This is a line added to the owner's copy
=======
This line added to collaborator's copy
>>>>>>> dabb4c8c450e8475aee9b14b4383acc99f42af1d

Our change is preceded by <<<<<<< HEAD. Git has then inserted ======= as a separator between the conflicting changes and marked the end of the content downloaded from GitHub with >>>>>>>. (The string of letters and digits after that marker identifies the commit we’ve just downloaded.)
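
For anyone working alone, the whole situation can be reproduced locally. The sketch below is one way to do it under stated assumptions: a bare repository on disk stands in for GitHub, and the paths and identities are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: two copies change the same place in mars.txt, and the second
# pull stops with conflict markers in the file.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)
git init -q --bare "$work/hub.git"      # stand-in for GitHub
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/main

# Owner creates the file and publishes it.
git init -q "$work/owner"
cd "$work/owner"
git symbolic-ref HEAD refs/heads/main
git config user.name "Owner"            # placeholder identity
git config user.email "owner@example.com"
echo "The lack of humidity is good for my hair" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars"
git remote add origin "$work/hub.git"
git push -q origin main

# Collaborator clones, adds a line, and pushes first.
git clone -q "$work/hub.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Collaborator"     # placeholder identity
git config user.email "collaborator@example.com"
echo "This line added to collaborator's copy" >> mars.txt
git commit -q -am "Add a line in collaborator's copy"
git push -q origin main

# Owner adds a different line in the same place without pulling first.
cd "$work/owner"
echo "This is a line added to the owner's copy" >> mars.txt
git commit -q -am "Add a line in the owner's copy"

# The pull stops with a conflict, so a non-zero exit status is expected.
git pull -q --no-rebase origin main || echo "pull stopped: conflict to resolve"

conflicted=$(git diff --name-only --diff-filter=U)   # files still in conflict
echo "conflicted files: $conflicted"
cat mars.txt                            # shows the <<<<<<<, =======, >>>>>>> markers
```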

It is now up to us to edit this file to remove these markers and reconcile the changes. We can do anything we want: keep the change made in the local repository, keep the change made in the remote repository, write something new to replace both, or get rid of the change entirely. Let’s replace both so that the file looks like this:

$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
We removed the conflict on this line

To finish merging, we add mars.txt to the changes being made by the merge and then commit:

$ git add mars.txt
$ git status

On branch main
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:

	modified:   mars.txt

$ git commit -m "Merge changes from GitHub"

[main 2abf2b1] Merge changes from GitHub

Now we can push our changes to GitHub:

$ git push origin main

Enumerating objects: 10, done.
Counting objects: 100% (10/10), done.
Delta compression using up to 8 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 645 bytes | 645.00 KiB/s, done.
Total 6 (delta 4), reused 0 (delta 0)
remote: Resolving deltas: 100% (4/4), completed with 2 local objects.
To https://github.com/OWNER/planets.git
   dabb4c8..2abf2b1  main -> main

Git keeps track of what we’ve merged with what, so we don’t have to fix things by hand again when the collaborator who made the first change pulls again:

$ git pull origin main

remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 6 (delta 4), reused 6 (delta 4), pack-reused 0
Unpacking objects: 100% (6/6), done.
From https://github.com/OWNER/planets
 * branch            main     -> FETCH_HEAD
    dabb4c8..2abf2b1  main     -> origin/main
Updating dabb4c8..2abf2b1
Fast-forward
 mars.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

We get the merged file:

$ cat mars.txt

Cold and dry, but everything is my favorite color
The two moons would make for interesting tides, if the planet had oceans
The lack of humidity is good for my hair
We removed the conflict on this line

We don’t need to merge again because Git knows someone has already done that.

Git’s ability to resolve conflicts is very useful, but conflict resolution costs time and effort, and can introduce errors if conflicts are not resolved correctly. If you find yourself resolving a lot of conflicts in a project, consider these technical approaches to reducing them:

Pull from upstream more frequently, especially before starting new work
Use topic branches to segregate work, merging to main when complete
Make smaller, more atomic commits
Where logically appropriate, break large files into smaller ones so that it is less likely that two authors will alter the same file simultaneously
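
As one illustration of the first suggestion, the sketch below counts how many commits are waiting on the remote before new work begins. It is a minimal sketch with a bare repository on disk standing in for GitHub; the paths and identities are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: check how far the local branch is behind the remote before
# starting new work.
set -eu

work=$(mktemp -d)                       # scratch directory (hypothetical)
git init -q --bare "$work/hub.git"      # stand-in for GitHub
git --git-dir="$work/hub.git" symbolic-ref HEAD refs/heads/main

git init -q "$work/owner"
cd "$work/owner"
git symbolic-ref HEAD refs/heads/main
git config user.name "Owner"            # placeholder identity
git config user.email "owner@example.com"
echo "Cold and dry" > mars.txt
git add mars.txt
git commit -q -m "Start notes on Mars"
git remote add origin "$work/hub.git"
git push -q origin main

# Meanwhile, a collaborator pushes one new commit.
git clone -q "$work/hub.git" "$work/collaborator"
cd "$work/collaborator"
git config user.name "Collaborator"     # placeholder identity
git config user.email "collaborator@example.com"
echo "It is so a planet!" > pluto.txt
git add pluto.txt
git commit -q -m "Add notes about Pluto"
git push -q origin main

# Back in the owner's copy: fetch downloads the news without merging it.
cd "$work/owner"
git fetch -q origin
behind=$(git rev-list --count HEAD..origin/main)
echo "commits to pull before starting new work: $behind"
```

If the count is not zero, pulling before editing keeps the eventual merge small.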

Conflicts can also be minimized with project management strategies:

Clarify who is responsible for what areas with your collaborators
Discuss what order tasks should be carried out in with your collaborators so that tasks expected to change the same lines won’t be worked on simultaneously
If the conflicts are stylistic churn (e.g. tabs vs. spaces), establish a governing project convention and, if necessary, use code style tools (e.g. htmltidy, perltidy, rubocop, etc.) to enforce it

Key Points

Conflicts occur when two or more people change the same lines of the same file.

The version control system does not allow people to overwrite each other’s changes blindly, but highlights conflicts so that they can be resolved.
