Some results from the previous post. This time I've run day10 on 144 or so commits from opam-repository to see how well the cache performs. The results are quite interesting.
First let's talk about the "examination map". This is a map from package name to a list of other packages whose solutions should be recalculated if the package in question is altered. It's built by first recording the packages that the solver asks about during the solution for each package, and then 'inverting' that map across all of the solutions. For example, if both packages 'a' and 'b' ask about package 'c' during their solutions, then altering 'c' means that the solutions for both 'a' and 'b' need to be recalculated. The examination map entry for 'c' would then be 'a'; 'b'.
We can plot the histogram of the sizes of each entry in the examination map:
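As a rough illustration of the inversion described above, here is a minimal Python sketch. It is not day10's actual code: the function names and the shape of the input (a dict from each package to the packages the solver asked about) are assumptions made for the example.

```python
from collections import Counter, defaultdict


def build_examination_map(examined):
    """Invert 'package -> packages the solver asked about' into
    'package -> packages whose solutions must be recalculated'."""
    examination_map = defaultdict(set)
    for package, asked_about in examined.items():
        for dependency in asked_about:
            examination_map[dependency].add(package)
    # Sort each entry so the result is deterministic.
    return {name: sorted(deps) for name, deps in examination_map.items()}


def entry_size_histogram(examination_map):
    """Count how many examination map entries have each size."""
    return Counter(len(deps) for deps in examination_map.values())


# Hypothetical solver traces: 'a' and 'b' both ask about 'c'.
examined = {"a": ["c"], "b": ["c", "d"]}
examination_map = build_examination_map(examined)
histogram = entry_size_histogram(examination_map)
```

Here the entry for 'c' comes out as 'a' and 'b', matching the example in the text, and the histogram is just a count of entry sizes.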
Some interesting features from these data:
dune and its dependencies and associated packages: there are around 350 such packages, and any change to one of them means we need to recalculate most of the solutions.
This last point doesn't mean that we actually recompile 3,800 packages, just that we need to recalculate their solutions, which might then lead to a cache hit of the layer and no actual compilation. However, recalculating the solutions of all of the packages takes (on my computer) around 10,000 seconds of CPU time, or roughly 5 minutes of wall-clock time as I've got 32 threads.
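The step from 10,000 seconds to about 5 minutes can be checked with a quick back-of-the-envelope calculation. This sketch assumes the solves parallelise perfectly across all 32 threads, which is an idealisation.

```python
cpu_seconds = 10_000  # total solver time quoted above
threads = 32          # threads available on the machine

# Assumes perfect scaling: the work is split evenly across threads.
wall_clock_seconds = cpu_seconds / threads
wall_clock_minutes = wall_clock_seconds / 60
```

That works out to a little over 5 minutes, in line with the figure quoted above.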
However, if the package that's changed isn't one of those 350 packages, then the number of solutions that need to be recalculated is dramatically reduced. I ran the logic over the last few weeks of commits to opam-repository, from commit 109398e2fd61803126becd398df0f1eabc9f3ca2 of the 10th September up until commit 3f21ebe342ce440d9c9142ffe1185d8e5a326085 of the 22nd. In this time there were 144 commits (counting only those from git log --first-parent). Of these, only 4 resulted in a full re-solve: the first commit, since obviously we have no cache at that point, the release of OCaml 5.4.0 beta2 by Florian Angeletti, a fix of ocaml-base-compiler for MSVC by David and a fix for BER-OCaml by Jeremy Yallop. Then 25 commits resulted in recalculating solutions for 3,800 packages as they hit dune-adjacent packages, 5 commits resulted in recalculating between 100 and 300 packages, and the remaining 110 commits resulted in recalculating fewer than 100 packages. The majority of those 110 recalculated fewer than 5 packages.
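To make the per-commit logic concrete, here is a hedged sketch of how the set of solutions to recalculate might be derived from the files a commit touches. It assumes the usual opam-repository layout of packages/<name>/<name>.<version>/opam, and that the changed paths come from something like git diff --name-only. The function names are hypothetical and this is not necessarily how day10 does it.

```python
from pathlib import PurePosixPath


def changed_packages(paths):
    """Extract package names from changed file paths, assuming the
    packages/<name>/<name>.<version>/... layout."""
    names = set()
    for path in paths:
        parts = PurePosixPath(path).parts
        if len(parts) >= 3 and parts[0] == "packages":
            names.add(parts[1])
    return names


def solutions_to_recalculate(examination_map, paths):
    """Union the examination map entries of every changed package."""
    stale = set()
    for name in changed_packages(paths):
        stale.update(examination_map.get(name, []))
    return sorted(stale)


# Hypothetical examination map and commit contents.
examination_map = {"c": ["a", "b"], "d": ["b"]}
paths = ["packages/c/c.1.0.0/opam", "README.md"]
stale = solutions_to_recalculate(examination_map, paths)
```

A commit that only touches 'c' marks the solutions for 'a' and 'b' as stale. Files outside packages/ are ignored in this sketch.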
Overall, at a rough estimate, this means that over this period, using this caching strategy gave us a 5x speedup in the solver!
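The rough estimate can be reproduced from the commit counts above. The sketch below assumes a full re-solve covers about 3,800 solutions and that every solution costs roughly the same to compute. The per-commit sizes for the two smaller buckets are illustrative guesses within the quoted ranges, not measured values.

```python
FULL = 3800    # assumed number of solutions in a full re-solve
COMMITS = 144  # commits examined


def estimated_speedup(small_size, medium_size):
    """Ratio of solutions computed without the cache to those with it."""
    without_cache = COMMITS * FULL
    # 4 full re-solves + 25 dune-adjacent commits, then the two smaller buckets.
    with_cache = (4 + 25) * FULL + 5 * medium_size + 110 * small_size
    return without_cache / with_cache


# Guessed bucket sizes: near the bottom and at the top of the quoted ranges.
optimistic = estimated_speedup(small_size=5, medium_size=100)
pessimistic = estimated_speedup(small_size=100, medium_size=300)
```

Under these assumptions the ratio lands between roughly 4.5 and 4.9, which is consistent with the 5x figure. Almost all of the remaining work comes from the 29 commits that trigger a full or near-full recalculation.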