Some projects prefer to handle contributions with patches and email. It's a fairly unusual workflow in the days of GitHub, but it's used by big projects such as the Linux kernel. Recently I've been doing some packaging for the GNU Guix project, and as they use this workflow I've had to become familiar with it. Figuring out all the individual steps was quite complicated and took a fair bit of reading. Hopefully this blog post will shorten the learning curve for someone else!
We might wonder what the advantage of this workflow is, when we could just use pull requests. Drew DeVault's post "The advantages of an email-driven git workflow" covers this far better than I can. One way to look at it is that the workflow resembles an interactive rebase: each patch can be applied to any branch, and we're using email to move the patches around rather than publishing a branch to a location. We'll start by looking at the main tools (git format-patch, git am and git send-email), then I'll show a specific example workflow.
We use git format-patch to create a patch (or patches) from our commits. The command takes each (non-merge) commit and forms it into a separate patch ready for email (in a UNIX mbox format). The basic command is:
git format-patch <branch> <options>
git format-patch master
The second command will extract all the commits that are in the current branch and that are not in the master branch. For each commit a separate patch file is created: they have the format 0001-<first-line-of-commit-message>.patch.
We can also name a single commit to use as the "since" point: patches are then created for every commit after it, up to the tip of the current branch. Alternatively we can give a revision range (see gitrevisions).
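To see the one-patch-per-commit behaviour without touching a real project, here is a small sketch that builds a throwaway repository. The repository, the file, the identity and the commit messages are all invented for the demo; only the git commands themselves come from the text above.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from this post
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

# Two commits on a feature branch
git checkout -q -b feature
echo one >> file.txt
git commit -q -am "Add first change"
echo two >> file.txt
git commit -q -am "Add second change"

# One patch file per commit that is on feature but not on master
git format-patch master
```

This should list 0001-Add-first-change.patch and 0002-Add-second-change.patch, one file for each commit on the feature branch.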
Using a feature branch is easiest, and it makes it simple to then squash any intermediate commits and extract a single patch:
git format-patch -1 --base=master
Here we're using the last commit on the current branch: to go back three commits we use -3 and so on. The --base=master option records in the patch the commit on master that the branch was created from. This is useful to the maintainer, as they then know the point in the history where your patch applied cleanly. On a feature branch it's easy to check which commits are not yet on master with:
git log --oneline --graph master..HEAD
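To check what --base actually records, the sketch below creates a throwaway repository (all names are invented for the demo) and pulls the base-commit line out of the generated patch, then compares it with the tip of master.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo change >> file.txt
git commit -q -am "Add a change"

# Commits on the feature branch that master does not have
git log --oneline --graph master..HEAD

# --base adds a "base-commit: <sha>" line near the end of the patch
recorded=$(git format-patch -1 --base=master --stdout | sed -n 's/^base-commit: //p')
expected=$(git rev-parse master)
echo "recorded in patch: $recorded"
echo "tip of master:     $expected"
```

The two hashes printed at the end should be identical, which is the information the maintainer gets from the patch.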
Sometimes we want to place the patch somewhere else; this is done with --output-directory:
# create the patches and put the output in the directory above
git format-patch master --output-directory ../
# create a single patch from the last commit
# this assumes that you did a git rebase -i HEAD~3 or whatever
git format-patch -1 --base=master --numbered --output-directory ../
The most likely way of working is in a feature branch and doing commits as you go along. Then at the end use a rebase to squash commits into a single clean commit. Finally, use format-patch to create the patch file.
Often we'll have just one patch, but sometimes a change is sufficiently complicated that it requires multiple patches. We want each commit to be a specific change: this might involve squashing intermediate commits using an interactive rebase. When complete there should be a logical stack of commits, where a maintainer can look at each patch and see a change that is complete in its own right.
At this point we have a set of commits, so we can create a set of numbered patches with:
git format-patch master --base=master --numbered --output-directory ../
The --numbered option ensures that each patch will have [PATCH N/M] in the header: I generally like this option and use it even if it's just one patch.
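A quick way to see the effect of --numbered on a single patch is to compare the Subject line with and without it. This is a sketch in a throwaway repository, with an invented identity and commit message.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo change >> file.txt
git commit -q -am "Add a change"

# Subject line of a single patch, without and with --numbered
plain=$(git format-patch -1 --stdout | grep '^Subject: ')
numbered=$(git format-patch -1 --numbered --stdout | grep '^Subject: ')
echo "$plain"      # Subject: [PATCH] Add a change
echo "$numbered"   # Subject: [PATCH 1/1] Add a change
```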
It's possible to pass a --thread option, which creates a thread: the first patch is at the top and the following patches are replies to it. It's not needed from what I can tell, as git send-email does this by default anyway.
If there's a series of patches and some discussion around them, then at some point we need to send a revised patch set:
git format-patch master --base=master --numbered --thread --reroll-count=2 --output-directory ../
This will change the patch set to something like "Subject: [PATCH v2 1/4] blah", and the patch file itself will be v2-0001-blah.patch. For much more discussion of this see Maintaining feature branches and submitting patches with Git by Peter Eisentraut.
This can also be done in git send-email but it's best done directly in format-patch as it's clearer what is happening.
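The sketch below shows what a rerolled series looks like on disk. It uses a throwaway repository with two invented commits, so the names will differ from a real project, but the v2 markers appear in the same places.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo one >> file.txt
git commit -q -am "Add first change"
echo two >> file.txt
git commit -q -am "Add second change"

# Second revision of the series: file names and subjects both gain a v2 marker
git format-patch master --base=master --numbered --reroll-count=2
subjects=$(grep -h '^Subject: ' v2-*.patch)
echo "$subjects"
```

The files should be named v2-0001-Add-first-change.patch and v2-0002-Add-second-change.patch, with subjects starting [PATCH v2 1/2] and [PATCH v2 2/2].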
Often we want to send a note with our patch to describe it or to ask a question. There are a couple of options for this. If it's something complex then we can send a cover letter:
git format-patch -1 --base=master --cover-letter --numbered --output-directory ../
Due to the --cover-letter option this will generate two files (0000-cover-letter.patch and 0001-<git title>.patch). The cover letter patch file has some information about the patch (but not the patch itself) and room to add comments etc. This is useful if there's going to be a general discussion about the patch set.
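As a sketch of what the cover letter contains, the following builds a throwaway repository (invented names throughout) and generates a cover letter for a single patch. git leaves placeholder markers in the file for us to replace before sending.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo change >> file.txt
git commit -q -am "Add a change"

# Generates 0000-cover-letter.patch alongside the real patch
git format-patch -1 --base=master --cover-letter --numbered

# The cover letter has placeholders to edit before sending
grep -n 'HERE' 0000-cover-letter.patch
```

The grep should point at the "*** SUBJECT HERE ***" and "*** BLURB HERE ***" placeholders, which are the two places to fill in.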
The other option is to put the comments within the patch file itself, using a scissors line. A scissors line is plain text in the patch with -- >8 across the line. For example, we create a patch like this:
git format-patch -1 --base=master --output-directory ../
Then edit the patch file and add things like this:
<headers of the email are here>
Subject: [PATCH] <whatever the subject is>

* Items here will not be included in the patch.
* The patch has to be processed with git-am --scissors

-- >8 -- >8 -- >8

* some/path/file: <first line of the commit>
<the rest continues>
The important part is that anything before the scissors line (-- >8) will be ignored by git am as long as --scissors is passed. To apply the patch do:
git am --scissors 0001-some-patch-file.patch
The patch will be applied, and git log will show that any comments before the scissors line were ignored. I really like this technique when there's just a single patch, though it's a bit annoying that the other end of the exchange has to know to process the patch with the --scissors option.
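Here is a sketch of the whole scissors round trip in a throwaway repository. The note text, commit message and file names are invented, and the awk step is just one scripted way of doing what we would normally do by hand in an editor: it inserts the note and a scissors line after the first blank line of the patch, which is where the email headers end.

```shell
set -eu
# Throwaway repository: every name below is made up for this demo
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo change >> file.txt
git commit -q -am "Add a change" -m "Longer explanation of the change."
git format-patch -1 > /dev/null

# Insert a note and a scissors line after the first blank line,
# which marks the end of the email headers
awk '
  { print }
  !inserted && /^$/ {
    print "Reviewer note: this text should not end up in the commit."
    print ""
    print "-- >8 -- >8 -- >8 --"
    inserted = 1
  }
' 0001-Add-a-change.patch > annotated.patch

# Apply on a fresh branch cut from master and inspect the commit message
git checkout -q -b test-apply master
git am --scissors annotated.patch
message=$(git log -1 --format=%B)
echo "$message"
```

The commit message printed at the end should contain the subject and the longer explanation, but not the reviewer note.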
The git am command applies a patch, or a series of patches, from a mailbox [1]. It's the counterpart to git format-patch, which creates patches in that mailbox format. Before sending a patch to a maintainer it makes sense to test that it will apply cleanly: commonly we create a test branch and then test apply our patch:
git am <mailbox>
git am ../0001-gnu-freeciv-Update-to-3.0.0.patch
This command has a lot of functionality for dealing with patches that don't apply cleanly. Our aim is to send a patch that applies easily so we'll skip learning more here - if the patch isn't applying cleanly I redo it!
The git send-email [2] command takes patches and emails them: the patches are generated by git format-patch. This is the last part of the stack, and the complexity of setting this up will depend on your email system.
The command is git send-email. It's an extension to git, so it has to be installed separately:
apt install git-email
Configure the email sending capability to use the correct name and email address for the user:
git config --global user.name "My Name"
git config --global user.email "myemail@example.com"
At this point we have to configure the MTA options: which is one of those sentences where if you don't know what an MTA is then there's potentially going to be a lot of trouble and Googling happening!
The git send-email command is compatible with either a local MTA or a remote one (such as Gmail). The best place to get configuration information is the git-send-email.io site [3]. For many people using a smarthost like Gmail is going to be the easiest option.
In my case I'm using msmtp, a simple, light-weight MTA that sends email out when I'm online. The relevant section of my git configuration is:
[sendemail]
    smtpserver = /home/user1/bin/msmtpq
    smtpserveroption = --account
    smtpserveroption = runbox
The basics of using the git send-email command are:
git send-email --to=test@test.com --dry-run ../0001-gnu-blah.patch
The --to= option lets us specify who to send the patch to. We can also use --cc=, and both options can be specified multiple times.
The --annotate option opens up an editor so we can review the patch before it's sent. There's also a --compose option that lets you write an introductory email - but the git format-patch option of --cover-letter is better because the numbering in the Subject line will be correct.
The --dry-run option runs the email command to see if everything is working, but doesn't send the email.
Rather than having to remember long commands it's possible to configure git send-email for a specific project. This takes advantage of git's ability to store configuration per repository, in the project's .git/config file.
For each project we can configure the specific list to send to:
cd <some/path/some/repo>
git config sendemail.to blahblah@blah.com
# for the guix patches email I set
git config --local sendemail.to guix-patches@gnu.org
# each email is sent as a reply to the previous email
git config sendemail.chainReplyTo true
# automatically annotate
git config --local sendemail.annotate yes
This means that the command can be shortened to:
git send-email --dry-run ../0001-<patch name>.patch
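To confirm what has been stored for a project, we can ask git to list the sendemail settings. The sketch below does this in a throwaway repository so that no real configuration is touched; the list address is the Guix one from above.

```shell
set -eu
# Throwaway repository so no real configuration is changed
repo=$(mktemp -d)
cd "$repo"
git init -q

git config --local sendemail.to guix-patches@gnu.org
git config --local sendemail.annotate yes

# List everything stored under [sendemail] for this repository only
git config --local --get-regexp '^sendemail\.'

# Or read back a single value
to=$(git config --local --get sendemail.to)
echo "patches will go to: $to"
```

Because the settings were written with --local, they live in this repository's .git/config and don't affect any other project.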
Having reviewed all the components in the workflow we can now run through all the steps in order.
To use this workflow we create a feature branch and develop our change there. When the feature is ready we squash changes into a single commit so that we can provide a single patch file. We then create the patch, and finally send it to the maintainers list.
git checkout master
git checkout -b new-feature-branch
# stage files as we change them
git stage some/path/some-file
# create our commit using the format the project specifies
git commit
# squash the intermediate commits into a single commit
git rebase --interactive master
git format-patch -1 --base=master --numbered --output-directory ../
# change back to master from our feature branch
git checkout master
# create a new test branch
git checkout -b test-patch-branch
# apply the patch to the new branch
git am ../0001-patch-whatever.patch
# check that it's what we expect
git log
git log master..HEAD --stat
# compare it to the HEAD of the master
# the master... is a special shortcut git provides
git diff master... path/to/file/
# Do a test build
make
git send-email --annotate --thread --dry-run ../0001-<patch name>.patch
If we configured a local sendemail.to for this project then it knows where to send the email. Otherwise we can provide a --to=<address>.
git branch -D <branches used>
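The steps above can be rehearsed end to end in a throwaway repository before trying them on a real project. This is only a sketch under a few assumptions: all names are invented, the interactive rebase is replaced by a scripted `git reset --soft` followed by a fresh commit (which squashes in the same way but needs no editor), and the send-email step is left out because it depends on your mail setup.

```shell
set -eu
# Throwaway workspace: every name below is made up for this demo
work=$(mktemp -d)
repo="$work/repo"
mkdir "$repo"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Demo User"
git config user.email "demo@example.com"
echo base > file.txt
git add file.txt
git commit -q -m "Initial commit"

# 1. Feature branch with two work-in-progress commits
git checkout -q -b new-feature-branch
echo one >> file.txt
git commit -q -am "wip 1"
echo two >> file.txt
git commit -q -am "wip 2"

# 2. Squash into one commit (scripted stand-in for an interactive rebase)
git reset -q --soft master
git commit -q -m "Add new feature"

# 3. Create the patch in the directory above the repository
git format-patch -1 --base=master --numbered --output-directory ../

# 4. Test-apply the patch on a scratch branch cut from master
git checkout -q -b test-patch-branch master
git am ../0001-Add-new-feature.patch
git log master..HEAD --stat

# The scratch branch should now match the feature branch exactly
git diff --quiet new-feature-branch test-patch-branch
same=yes
echo "patch reproduces the feature branch: $same"

# 5. Clean up the branches we used
git checkout -q master
git branch -q -D new-feature-branch test-patch-branch
```

If the patch had not applied cleanly, git am or the final git diff would stop the script, which is the signal to go back and redo the patch.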
[1] See the docs for the git am command.
[2] Git docs: git-send-email.
[3] The git-send-email.io site has lots of details on how to set up git send-email, whether using a local MTA or a remote smarthost like Gmail.
The nice part of this patch workflow is that it's very suited to a disconnected work style where discussion is handled through email: if you're not online with GitHub all the time this is nice. That said, patch files are a bit of a pain as the changes become complex, and email isn't the centre of everyone's universe. Maybe it's me, but I found figuring out the individual elements quite complicated. Having now got them organised into a recipe, it feels straightforward enough. Like everything, practice makes perfect!
Did you find this tutorial useful? Are there any missing steps or items I should have added? Maybe you love or hate this workflow? In either case, drop me an email or write a comment!