Piero V.

Some tricks with range-diff

The workflows we use for Tor Browser are kinda peculiar.

When you manage a fork, you need a strategy to stay updated with upstream. We create a new branch for every new Firefox release and cherry-pick our previous commits on top of it.

The advantage over merging is that our patches always stay in a single, compact block of commits. So, it is trivial to tell our commits from upstream's, and users can easily verify that we do not modify Firefox outside those commits (we keep the commit hashes of gecko-dev, the official GitHub mirror of Mozilla's Mercurial repositories).

There are also disadvantages. This workflow is easier with fewer commits, so each of our commits represents a “feature” rather than a “change”. As a result, we constantly update existing commits, and the history of a patch becomes hard to follow.

Most importantly, this workflow requires frequent rebases, with the risk of losing some changes. To mitigate this risk, every new branch goes through a review process before we switch to it.

The main tools we use are git range-diff to check single commits and “diff-of-diffs” to check the patch set as a whole.

The output of range-diff is hard to read, especially if you prefer vertical to horizontal diffs. Also, I still have not found a GUI that makes range-diff easier to read. So, recently, I came up with some nice tricks.

First, I found a tool called aha that can convert the output of commands with ANSI colors to HTML. You can pipe the output of range-diff into it to get an HTML file. range-diff will detect that its output is not going to a terminal, so you will have to force colors with the --color option.

git range-diff --color range1 range2 | aha > range-diff.html

Speaking of options, git range-diff also accepts the arguments of git diff. In particular, I find --color-moved and --ignore-space-change very useful.

Then, I wrote a short script to parse the HTML-ified range-diff and enclose each commit in a <details> element to expand/collapse it. This was a complete game changer for me when dealing with 100-200 commits.

git range-diff --color --color-moved range1 range2 | aha | split-range-diff-aha.py > range-diff.html
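To give an idea of what such a script involves, here is a minimal sketch of the wrapping step. It is not the actual split-range-diff-aha.py: the function names are hypothetical, and it assumes that aha puts the whole output in a single <pre> element and that each range-diff commit header ends up on its own line.

```python
import html
import re

# Matches a range-diff commit header once HTML tags are stripped, e.g.
# "1:  a1b2c3d ! 1:  d4e5f6a Subject" or "-:  ------- > 2:  0123abc Subject".
HEADER = re.compile(
    r"^\s*(?:\d+|-):\s+(?:[0-9a-f]{4,40}|-+)\s+([=!<>])\s+"
    r"(?:\d+|-):\s+(?:[0-9a-f]{4,40}|-+)"
)
TAG = re.compile(r"<[^>]*>")


def plain(line):
    """Return the visible text of one HTML line (tags removed, entities decoded)."""
    return html.unescape(TAG.sub("", line))


def wrap_commits(pre_body):
    """Wrap each commit found in the contents of the <pre> element in <details>."""
    out = []
    open_block = False
    for line in pre_body.splitlines():
        if HEADER.match(plain(line)):
            if open_block:
                out.append("</pre></details>")
            # The (still colored) header line becomes the clickable summary.
            out.append(f"<details><summary>{line}</summary><pre>")
            open_block = True
        else:
            out.append(line)
    if open_block:
        out.append("</pre></details>")
    return "\n".join(out)


def transform(document):
    """Replace the single <pre>...</pre> of the page (an assumption) with the wrapped commits."""
    head, _, rest = document.partition("<pre>")
    body, _, tail = rest.rpartition("</pre>")
    return head + wrap_commits(body) + tail
```

Used as a filter, a script like this would read the page from standard input, call transform() on it, and write the result to standard output.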

Also, the script assigns a CSS class to each commit, so that colors tell at first glance whether it needs attention. Commits detected as unchanged do not have any output to display. Among the rest, I give lower priority to commits where upstream and downstream changes are disjoint.
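The classification could be sketched as follows. This is only one possible heuristic, not necessarily what the real script does: the class names are made up, and "disjoint" is interpreted here as "only context or metadata lines differ, while the lines added or removed by our patch are the same". It also assumes the usual range-diff layout, where body lines are indented by four spaces and followed by two markers.

```python
def classify(status, body):
    """Return a hypothetical CSS class for one range-diff commit.

    status is the character between the two hashes in the header line
    ("=", "!", "<" or ">"); body is the list of plain-text lines that
    follow the header, with HTML tags already stripped.
    """
    if status == "=":
        return "unchanged"
    if status == "<":
        return "removed"
    if status == ">":
        return "added"
    touched_patch = False
    for line in body:
        # After the four-space indent come the range-diff marker (outer)
        # and the marker the line had in the original patch (inner).
        outer, inner = line[4:5], line[5:6]
        if outer in ("+", "-") and inner in ("+", "-"):
            touched_patch = True
            break
    # Changes limited to context or metadata get the lower priority.
    return "needs-attention" if touched_patch else "low-priority"
```

A stylesheet can then map each class to a color, for example by styling the <summary> of the corresponding element.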

At the time of writing, we are migrating from Firefox 115 ESR to 128 ESR, and we have a lot of changes in our patch set. I was in charge of the rebase work, and I had to resolve several conflicts. Therefore, I decided to add comments to the HTML generated from the range-diff to document my decisions, help the reviewers understand the changes, and give more information on their causes.

Our other tool is diff-of-diffs, which we use especially for minor updates within the same ESR version. The credit for this idea goes to Georg Koppen. We generate two .diff files from the same ranges passed to range-diff and compare them with a file comparison tool like Meld. The best-case scenario is when the two files are identical. In practice, you will often find differences in git hashes, which are fine.

You might have a problem when lines starting with + or - differ or are missing: it means your patch has changed.
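The same check can be scripted as a first pass before opening Meld. The sketch below is an illustration under assumptions: the function names are hypothetical, and the two inputs are assumed to be the text of the two .diff files in unified format (for instance, as produced by git diff).

```python
import difflib


def patch_lines(diff_text):
    """Keep only the added/removed lines of a diff.

    The "---"/"+++" file headers are skipped; hash-bearing lines such as
    "index ..." do not start with + or -, so they are dropped as well.
    Caveat: a removed line whose content starts with "-- " is skipped too.
    """
    return [
        line
        for line in diff_text.splitlines()
        if line[:1] in ("+", "-") and not line.startswith(("+++ ", "--- "))
    ]


def diff_of_diffs(old_diff, new_diff):
    """Return the unified diff between the +/- lines of two .diff files."""
    return list(
        difflib.unified_diff(
            patch_lines(old_diff),
            patch_lines(new_diff),
            "old.diff",
            "new.diff",
            lineterm="",
            n=0,
        )
    )
```

An empty result means the patch set adds and removes exactly the same lines in both ranges. Anything else is worth a closer look in a graphical tool, since this sketch discards the file names and the context.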

The rationale for this global check is that range-diff works with single commits, so it will be less helpful when you squash several of them together, which we do very often on the development branch.

Finally, since we rebase frequently, we do not squash commits into their targets at the first opportunity. Instead, we move a commit right after the one it will be squashed into, and we keep it there for a release. In practice, we run git rebase -i --autosquash, but replace all the fixup and squash commands with pick.
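Replacing the commands by hand gets tedious, so it can be automated. Below is a minimal sketch of the text transformation; the function names are hypothetical, and the todo format is assumed to be the one git writes for an interactive rebase.

```python
import re

# Matches "fixup", "squash" and their one-letter forms at the start of a
# line, including the "fixup -C" / "fixup -c" variants.
SQUASHING = re.compile(r"^(?:fixup|squash|f|s)(?:\s+-[cC])?(?=\s)")


def keep_separate(todo):
    """Turn every fixup/squash line of a rebase todo list into a pick."""
    return "\n".join(SQUASHING.sub("pick", line) for line in todo.split("\n"))


def rewrite_file(path):
    """Apply keep_separate() in place to the todo file at `path`."""
    with open(path, encoding="utf-8") as todo_file:
        todo = todo_file.read()
    with open(path, "w", encoding="utf-8") as todo_file:
        todo_file.write(keep_separate(todo))
```

Git passes the path of the todo file to the program named in the GIT_SEQUENCE_EDITOR environment variable, so a small script calling rewrite_file() on its first argument could be used there: autosquash reorders the commits, and the script keeps them unsquashed.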

We do this to troubleshoot new bugs, but incidentally, it also helps the review process. The assumption is that squashing subsequent commits is trivial, and git will always do it correctly, so you can automatically create a squashed copy of the source branch with a script and feed it to range-diff. I think it could even make un-squashing commits much easier if we wanted to do it for documentation purposes.
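Such a script could look like the sketch below. It is only an illustration: the function and branch names are hypothetical, it assumes a clean working tree and a Unix-like system where the true command exists, and it assumes the pending commits still carry their fixup!/squash! subjects so that autosquash can find them.

```python
import os
import subprocess


def squash_plan(source, base, copy):
    """Return the git commands that build a squashed copy of `source`."""
    return [
        # Create (or reset) the temporary branch and check it out.
        ["git", "switch", "--force-create", copy, source],
        # Let autosquash fold the fixup!/squash! commits into their targets.
        ["git", "rebase", "-i", "--autosquash", base],
    ]


def range_diff_command(base, copy, new_base, new_branch):
    """Return the range-diff call comparing the squashed copy with the new branch."""
    return ["git", "range-diff", "--color",
            f"{base}..{copy}", f"{new_base}..{new_branch}"]


def run(commands):
    """Run the commands, accepting the todo list and commit messages as they are."""
    env = dict(os.environ, GIT_SEQUENCE_EDITOR="true", GIT_EDITOR="true")
    for argv in commands:
        subprocess.run(argv, check=True, env=env)
```

Calling run(squash_plan(...)) leaves the original branch untouched, and the output of the command from range_diff_command() can be piped into aha like before.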