
code.mathr.co.uk

gitorious.org will shut down at the end of May, having been acquired by gitlab.com. Rather than migrate my repositories to GitLab, I decided to host them myself:

code.mathr.co.uk

Making git repositories available over "dumb HTTP" is fairly easy:

cd /var/www/my-web-site/some/path
git clone --bare some-repository some-repository.git
cd some-repository.git
cat > hooks/post-update << EOF
#!/bin/sh
exec git update-server-info
EOF
chmod +x hooks/post-update
./hooks/post-update

Now anyone can get your repository:

git clone http://my-web-site/some/path/some-repository.git

The post-update hook ensures that when authorized users push (using SSH, not HTTP), the repository updates are available over "dumb HTTP".
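As a quick sanity check, here is a minimal sketch that repeats the steps above in a throwaway directory and confirms that running the hook produces info/refs, which is the file a "dumb HTTP" client asks for first. The temporary directory, the "demo" repository name and the commit identity are all made up for illustration.

```shell
# Work in a scratch directory (hypothetical; nothing here touches a real web root).
work=$(mktemp -d)
cd "$work"

# Create a tiny repository with one empty commit so there is a ref to advertise.
git init -q demo
git -C demo -c user.name=demo -c user.email=demo@example.org \
  commit -q --allow-empty -m "initial commit"

# Same steps as above: bare clone, install the hook, run it once by hand.
git clone -q --bare demo demo.git
cd demo.git
mkdir -p hooks
cat > hooks/post-update << 'EOF'
#!/bin/sh
exec git update-server-info
EOF
chmod +x hooks/post-update
./hooks/post-update

# info/refs now lists each ref with its commit id.
cat info/refs
```

If info/refs is missing or stale on the server, clones over "dumb HTTP" will fail or miss recent pushes, so this is the first thing worth checking.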

Adding a human-friendly user interface is trickier. I used gitweb.cgi, and based the Apache2 configuration on one of the examples in its manual. I removed the .git suffix from the repositories in the web server directory, and added an AliasMatch directive that matches URLs containing ".git" and maps them to the correct path. Then I enabled the gitweb CGI script, so that "http://code.mathr.co.uk/repo" gives the human-friendly interface. Appending ".git" gives the machine-friendly repository.

Important security note: naive use of AliasMatch opens a directory traversal vulnerability. Put this at the start of your virtual host configuration, and later explicitly allow any directories to which you want the web server to have access (this is Apache 2.2 syntax; on Apache 2.4 the equivalent is "Require all denied"):

<Directory />
  Order allow,deny
  deny from all
</Directory>

Then I needed to migrate my repositories from gitorious.org. The human-friendly gitorious website has a table listing all the repositories owned by my user, but I couldn't find a machine-friendly version. So I scraped it with a little script:

username="claude" # change this line
wget -O - --header "Accept: text/html" "https://gitorious.org/~${username}" |
grep -A 1 'class="repository"' |
grep href |
sed 's|^.*"/\(.*\)/\(.*\)".*$|\1 \2|' |
sort > repositories.txt
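To see what the pipeline does without fetching anything, here is a small sketch that runs the same grep and sed stages over a made-up HTML fragment. The markup is an assumption for illustration only (the real gitorious.org page may have looked different), and the function name extract_repositories is hypothetical.

```shell
# Same filter stages as the script above, reading HTML from standard input.
extract_repositories() {
  grep -A 1 'class="repository"' |
  grep href |
  sed 's|^.*"/\(.*\)/\(.*\)".*$|\1 \2|' |
  sort
}

# Hypothetical sample markup: one link per repository, on the line after the class.
extract_repositories << 'EOF'
<li class="repository">
  <a href="/beta/gamma">gamma</a>
</li>
<li class="repository">
  <a href="/alpha/alpha">alpha</a>
</li>
EOF
# prints:
# alpha alpha
# beta gamma
```

Each output line is "project repository", which is the format the later scripts read. Note that the sed expression relies on the link having no further quoted attributes after the href.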

I wanted gitweb to show descriptions for each repository, so I edited the file to add a category (one word) and a description (the rest of the line). I wrote another script to clone all the repositories from gitorious.org:
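The edited file has four whitespace-separated fields per line, and the shell's read builtin assigns whatever is left over to the last variable, which is why the description can contain spaces. A minimal sketch, with a made-up line and a hypothetical helper name show_fields:

```shell
# Print how read splits each line: three single words, then the rest of the line.
show_fields() {
  while read -r project repository category description
  do
    printf 'project=%s\n' "$project"
    printf 'repository=%s\n' "$repository"
    printf 'category=%s\n' "$category"
    printf 'description=%s\n' "$description"
  done
}

# Hypothetical example line, not one of the real repositories.
printf '%s\n' 'someproject somerepo demos An example description with several words' |
show_fields
# prints:
# project=someproject
# repository=somerepo
# category=demos
# description=An example description with several words
```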

source="https://gitorious.org/"
cloneurl="http://code.mathr.co.uk/"
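# Each line of repositories.txt: project repository category description...
while read -r project repository category description
do
  # Skip this entry if the clone fails, so no metadata is written elsewhere.
  git clone --bare "${source}${project}/${repository}" "${repository}" || continue
  pushd "${repository}"
  # Metadata files that gitweb displays for the repository.
  echo "${cloneurl}${repository}.git" > cloneurl
  echo "${category}" > category
  echo "${description}" > description
  # Keep the "dumb HTTP" metadata current after every push.
  cat > hooks/post-update << EOF
#!/bin/sh
exec git update-server-info
EOF
  chmod +x hooks/post-update
  ./hooks/post-update
  popd
done < repositories.txt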

Note that this flattens the directory structure (each "project/repository" becomes just "repository"), which suited me fine, but your needs may differ. I also manually renamed a few projects after the fact, and chose not to migrate a couple (one empty, one huge and mostly not my code). You can change "source" to "git@gitorious.org:" to clone via SSH, and omit "--bare", to make a local backup from which you can still push to gitorious.org. Look into ssh-agent if you have many repositories and don't feel like entering your passphrase for each one.
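For the ssh-agent suggestion, one possible approach is to run the whole clone script as a child of a short-lived agent, so the passphrase is asked for once and the agent goes away when the script ends. This is only a sketch: the function name run_with_agent and the script name in the usage comment are hypothetical, and it assumes your key is in one of the default locations that ssh-add looks in.

```shell
# Run a command under a temporary ssh-agent.
# ssh-agent starts, exports SSH_AUTH_SOCK to the child shell, and exits with it.
run_with_agent() {
  # ssh-add prompts for the passphrase once; the command only runs if that succeeds.
  ssh-agent sh -c 'ssh-add && exec "$@"' sh "$@"
}

# Usage (hypothetical script name):
#   run_with_agent ./clone-all.sh
```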

The final step was to generate an atom feed aggregating the feeds for each repository that gitweb generates:

http://code.mathr.co.uk/commits.atom

The aggregator itself is at gitweb-aggregator.
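The aggregator needs to know where each per-repository feed lives. gitweb has an "atom" action, so one way to list the feed URLs is to reuse repositories.txt. This sketch is not the aggregator itself: the function name feed_urls is hypothetical, and it assumes gitweb is served at the site root with each project named after its repository (without the .git suffix), as in the setup described above.

```shell
# Base URL of the gitweb instance (assumed to be the site root).
base="http://code.mathr.co.uk/"

# Read "project repository category description..." lines from standard input
# and print the gitweb Atom feed URL for each repository.
feed_urls() {
  while read -r project repository category description
  do
    printf '%s?p=%s;a=atom\n' "$base" "$repository"
  done
}

# Usage:
#   feed_urls < repositories.txt
```

The resulting list could then be fetched and merged by whatever feed tool you prefer.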

Now I just need to sed the internet to update all the soon-to-be-broken links to my old gitorious.org content...