Build IDs for Day10

mtelvers, dra27 and I have been working on a system to build opam packages in a similar way to the docs-ci system: effectively building a per-package binary cache so that builds of the entire opam repository are very fast. It supports building even mutually incompatible packages by dynamically creating the build environment for each package, which lets us generate something akin to opam health check, but much faster.

Currently the cache is a key-value store where the key for a package is a hash of its name and version, the names and versions of all of its dependencies, and some information about the OS. This is great when this information uniquely identifies the output, but that isn't always the case. In particular, the oxcaml opam-repository has several packages where the version number is the upstream version number with `-ox` appended, as they carry patches to make them compatible with oxcaml. If these patches change without the version being bumped, the key stays the same and the current caching mechanism would return a stale build. When we discussed this, David pointed out the idea of the build-id in opam, which would perfectly satisfy our needs. Unfortunately this code is quite deep within the opam codebase, and at the point we need it we don't have an installed opam switch, so we need to pull the code out and insert it into our project.
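To make the failure mode concrete, here is a minimal Python sketch of a key of that shape. The function names, the field layout, the choice of SHA-256 and the package `foo.1.0-ox` are all assumptions for illustration; this is not day10's code, and `content_key` is not a description of how opam computes its build-id.

```python
import hashlib


def metadata_key(name, version, deps, os_info):
    """Hash name/version, dependency names/versions and OS info (assumed layout)."""
    h = hashlib.sha256()
    h.update(f"{name}.{version}\n".encode())
    for dep_name, dep_version in sorted(deps):
        h.update(f"{dep_name}.{dep_version}\n".encode())
    h.update(f"os={os_info}\n".encode())
    return h.hexdigest()


def content_key(name, version, deps, os_info, patches):
    """Like metadata_key, but also folds in the bytes of each patch (hypothetical)."""
    h = hashlib.sha256(metadata_key(name, version, deps, os_info).encode())
    for patch in patches:
        h.update(hashlib.sha256(patch).digest())
    return h.hexdigest()


# Hypothetical package: the patch is edited but the "-ox" version is not bumped.
deps = [("bar", "2.1"), ("baz", "0.3")]
before = metadata_key("foo", "1.0-ox", deps, "debian-12-x86_64")
after = metadata_key("foo", "1.0-ox", deps, "debian-12-x86_64")
print(before == after)  # True: same key, so the old build is reused

v1 = content_key("foo", "1.0-ox", deps, "debian-12-x86_64", [b"old patch"])
v2 = content_key("foo", "1.0-ox", deps, "debian-12-x86_64", [b"new patch"])
print(v1 == v2)  # False: the key changes when the patch does
```

A key that only sees names and versions cannot notice an edited patch, which is why something that also covers the package's contents is needed.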

One of the first challenges was that day10 currently includes the OS details in the hash so that we can test across different distros. This is at odds with the opam build-id, which doesn't include them, so to get as close as possible to the opam hash I split the cache into two layers: a per-OS cache directory, containing entries keyed by hashes based on pure opam metadata. The idea is that these hashes should be identical to opam's build-ids. With that change, the new cache layout looks like:

debian-12-x86_64/123...abc/{build.log,config,...}

where 123...abc should be the same as the build-id you would get from opam with all of the contained packages installed.
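As a rough sketch of the two layers, the path can be assembled from an OS component and an OS-independent hash. The helper names, the `cache` root and the `0123abcd` build-id below are placeholders I've made up, not names from day10:

```python
from pathlib import PurePosixPath


def os_layer(distro, release, arch):
    """First layer: one directory per OS, e.g. debian-12-x86_64."""
    return f"{distro}-{release}-{arch}"


def cache_entry(root, os_id, build_id):
    """Second layer: a directory named after the OS-independent hash."""
    return PurePosixPath(root) / os_id / build_id


build_id = "0123abcd"  # placeholder, not a real build-id
debian = cache_entry("cache", os_layer("debian", "12", "x86_64"), build_id)
ubuntu = cache_entry("cache", os_layer("ubuntu", "24.04", "x86_64"), build_id)

print(debian / "build.log")  # cache/debian-12-x86_64/0123abcd/build.log
print(debian.name == ubuntu.name)  # True: same hash under each OS directory
```

Keeping the OS out of the hash means the same package set gets the same directory name on every distro, so results can be compared across OSes by name alone.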

Now my actual use case for this is to track the state of the oxcaml world day by day, so I need to track both the opam-repository for OCaml and the opam repository for OxCaml. The project currently uses a Makefile for coordinating the builds, but I thought it was time we moved on to a dedicated batch execution process. So I asked Claude to knock me up one of those, using odoc_driver for inspiration. It's very basic right now, simply iterating through the latest versions of every package, but I have got it to check for cache hits and misses, so I should be able to run it tomorrow to see how quickly we can test PRs to oxcaml/opam-repository.
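The hit/miss check itself can be very small. Here is a hedged Python sketch, assuming a cache entry counts as a hit when its directory exists under the per-OS layer; the `classify` function, the package names and the `aaaa`/`bbbb` hashes are invented for the example and the real runner may well do this differently:

```python
import tempfile
from pathlib import Path


def classify(cache_root, os_id, build_ids):
    """Split a {package: build_id} map into cache hits and misses.

    Assumption: an entry is a hit if <cache_root>/<os_id>/<build_id> exists.
    """
    hits, misses = [], []
    for package, build_id in sorted(build_ids.items()):
        if (Path(cache_root) / os_id / build_id).is_dir():
            hits.append(package)
        else:
            misses.append(package)
    return hits, misses


with tempfile.TemporaryDirectory() as root:
    # Pretend foo was built on an earlier run; bar has never been built.
    (Path(root) / "debian-12-x86_64" / "aaaa").mkdir(parents=True)
    hits, misses = classify(
        root, "debian-12-x86_64", {"foo.1.0-ox": "aaaa", "bar.2.1": "bbbb"}
    )

print(hits, misses)  # ['foo.1.0-ox'] ['bar.2.1']
```

Only the misses need building, so the ratio of hits to misses gives a quick estimate of how much work a given PR would trigger.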