Private Universes of Package Distribution (January 13, 2006)

The Internet allows many kinds of sharing, and one of them is in the form of packages: clumps of content that can be installed and uninstalled, that are updated with new revisions over time, that depend on each other to function and yet potentially interfere with each other. The problem is clearest for Unix distributions like Debian, Fink, and FreeBSD ports, but it shows up for other kinds of content as well, e.g. CTAN for TeX.

How should such systems be built? I'm leaning toward the following principle:

Principle: Packages of content should assume they will only be co-installed with packages from a known group of packages.

Many authors dismiss this approach quickly. However, rejecting it means that packages must be loadable alongside packages that were developed completely independently. Under that opposing approach, it seems inevitable that (1) some combinations of packages will expose new bugs that were not present in any of the packages alone, and (2) some packages will be configured reasonably on their own but in ways that are incompatible with each other. Users of the RPM format are familiar with issue 2 in particular.

By contrast, the above principle does not seem onerous in practice. Debian works that way, with most of its packages being repackaged from foreign code bases. Further, following the principle simplifies other aspects of package distribution, because packages no longer have to be completely bullet-proof against each other. They no longer have to insert extra indirections everywhere simply because of the obscure possibility that some unknown package might conflict in the most horrific way imaginable.
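To make the principle concrete, here is a minimal sketch in Python of a closed universe that resolves dependencies only among its own packages and refuses anything it does not know about. The names `Package`, `Universe`, and `install_plan` are hypothetical and chosen for illustration; they are not taken from the Package Universes Architecture document or from Scala Bazaars.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """A package and the names of the packages it depends on (illustrative model)."""
    name: str
    depends: tuple = ()


class Universe:
    """A known, closed group of packages. Hypothetical, for illustration only."""

    def __init__(self, name, packages):
        self.name = name
        self.packages = {p.name: p for p in packages}

    def install_plan(self, target):
        """Return the target and its dependencies, dependencies first.

        Every name is looked up inside this universe only, so a package
        never has to guard against unknown packages from elsewhere.
        """
        plan = []
        visiting = set()

        def visit(pkg_name):
            if pkg_name in plan:
                return  # already scheduled
            if pkg_name not in self.packages:
                raise LookupError(
                    f"{pkg_name!r} is not in universe {self.name!r}"
                )
            if pkg_name in visiting:
                raise ValueError(f"dependency cycle at {pkg_name!r}")
            visiting.add(pkg_name)
            for dep in self.packages[pkg_name].depends:
                visit(dep)
            visiting.discard(pkg_name)
            plan.append(pkg_name)

        visit(target)
        return plan


# Example universe with made-up package names.
universe = Universe(
    "example",
    [
        Package("libfoo"),
        Package("editor", depends=("libfoo",)),
        Package("orphan", depends=("libbar",)),  # libbar is outside the universe
    ],
)

print(universe.install_plan("editor"))
```

Asking for `editor` yields a plan that installs `libfoo` first, while asking for `orphan` raises `LookupError`, because `libbar` is not part of the universe. The design choice is that the failure happens at planning time, inside one known group, rather than as a surprise conflict after independently developed packages have been mixed.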

For more on the idea, see the Package Universes Architecture document, which describes an abstract package-distribution system based on the principle of repository-specific packages. Scala Bazaars is an evolving implementation used by the Scala open-source community.