UNB/ CS/ David Bremner/ blog/ posts/ Yet another tale of converting Debian packaging to Git

racket (previously known as plt-scheme) is an interpreter/JIT-compiler/development environment with about 6 years of subversion history in a converted git repo. Debian packaging has been done in subversion, with only the contents of ./debian in version control. I wanted to merge these into a single git repository.

The first step is to create a repo and fetch the relevant history.

TMPDIR=/var/tmp
export TMPDIR
ME=`readlink -f $0`
AUTHORS=`dirname $ME`/authors

mkdir racket && cd racket && git init
git remote add racket git://git.racket-lang.org/plt
git fetch --tags racket
git config  merge.renameLimit 10000
git svn init  --stdlayout svn://svn.debian.org/svn/pkg-plt-scheme/plt-scheme/ 
git svn fetch -A$AUTHORS
git branch debian

A couple points to note:

Now a couple complications arose about upstream's git repo.

  1. Upstream releases seperate source tarballs for unix, mac, and windows. Each of these is constructed by deleting a large number of files from version control, and occasionally some last minute fiddling with README files and so on.

  2. The history of the release tags is not completely linear. For example,

rocinante:~/projects/racket  (git-svn)-[master]-% git diff --shortstat v4.2.4 `git merge-base v4.2.4 v5.0`
 48 files changed, 242 insertions(+), 393 deletions(-)

rocinante:~/projects/racket  (git-svn)-[master]-% git diff --shortstat v4.2.1 `git merge-base v4.2.1 v4.2.4`
 76 files changed, 642 insertions(+), 1485 deletions(-)

The combination made my straight forward attempt at constructing a history synched with release tarballs generate many conflicts. I ended up importing each tarball on a temporary branch, and the merges went smoother. Note also the use of "git merge -s recursive -X theirs" to resolve conflicts in favour of the new upstream version.

The repetitive bits of the merge are collected as shell functions.

import_tgz() { 
    if [ -f $1 ]; then 
        git clean -fxd; 
        git ls-files -z | xargs -0 rm -f; 
        tar --strip-components=1 -zxvf $1 ; 
        git add -A; 
        git commit -m'Importing '`basename $1`;
    else
        echo "missing tarball $1"; 
    fi; 
}

do_merge() {
    version=$1
    git checkout -b v$version-tarball v$version
    import_tgz ../plt-scheme_$version.orig.tar.gz
    git checkout upstream 
    git merge -s recursive -X theirs v$version-tarball
}

post_merge() {
    version=$1
    git tag -f upstream/$version
    pristine-tar commit ../plt-scheme_$version.orig.tar.gz
    git branch -d v$version-tarball
}

The entire merge script is here. A typical step looks like

do_merge 5.0
git rm collects/tests/stepper/automatic-tests.ss
git add `git status -s | egrep ^UA | cut -f2 -d' '`
git checkout v5.0-tarball doc/release-notes/teachpack/HISTORY.txt
git rm readme.txt
git add  collects/tests/web-server/info.rkt
git commit -m'Resolve conflicts from new upstream version 5.0'
post_merge 5.0

Finally, we have the comparatively easy task of merging the upstream and Debian branches. In one or two places git was confused by all of the copying and renaming of files and I had to manually fix things up with git rm.

cd racket || /bin/true
set -e

git checkout debian
git tag -f packaging/4.0.1-2 `git svn find-rev r98`
git tag -f packaging/4.2.1-1 `git svn find-rev r113`
git tag -f packaging/4.2.4-2 `git svn find-rev r126`

git branch -f  master upstream/4.0.1
git checkout master
git merge packaging/4.0.1-2
git tag -f debian/4.0.1-2

git merge upstream/4.2.1
git merge packaging/4.2.1-1
git tag -f debian/4.2.1-1

git merge upstream/4.2.4
git merge packaging/4.2.4-2
git rm collects/tests/stxclass/more-tests.ss && git commit -m'fix false rename detection'
git tag -f debian/4.2.4-2

git merge -s recursive -X theirs upstream/5.0
git rm collects/tests/web-server/info.rkt
git commit -m 'Merge upstream 5.0'