The Components Utopia (February 9, 2006)

The components utopia is a mythical components system in which any two components that should be able to work together do in fact work together, without any modification. Such a system is impossible, but the idea is as seductive as a perfect halting-problem detector or a perfect OS scheduler.

Components utopians say things like: different components may have been compiled against different versions of their dependencies, so any decent components system ought to support loading different versions of the depended-on components simultaneously. Non-utopians say things like: we should consider supporting multiple versions, but we are also free to say that if two components depend on different versions of another component, then those two components simply do not work together. Utopians work with a series of ultimatums in a design space that, in the end, is empty. Non-utopians admit that some components do not work together, and thereby explore a rich, non-empty design space.

It is fundamentally impossible to completely isolate two components from each other, such that they cannot possibly cause each other harm, while still allowing them to work together. Components are supposed to work with each other--after all, if I install a library, I want the other programs on my system to use that library.

Thus, you have to leave open some channels of interaction between components, and every such channel is an opportunity for components to harm each other. Even speaking through a perfectly type-checked API leaves open questions like: is 0 a valid argument for this or that method? Even if you use a theorem prover on all of your components, you merely trade these problems for the problem of specifying what properties, exactly, have been proven. Components that should work together sometimes simply are not going to. You have to have some mechanism for dealing with this situation and tweaking one package or the other so that they work together.
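The point about 0 can be made concrete with a small sketch. The two functions below are hypothetical stand-ins for two separately written components; the names are invented for illustration. Both carry type annotations that agree perfectly, yet the annotations say nothing about whether 0 is an acceptable value.

```python
# Hypothetical "library" component: the signature says int, not "non-zero int".
def mean_chunk_size(total_items: int, chunks: int) -> float:
    """Average number of items per chunk."""
    return total_items / chunks


# Hypothetical "client" component: passes along whatever worker count it has.
def split_evenly(total_items: int, workers: int) -> float:
    return mean_chunk_size(total_items, workers)


def try_split(total_items: int, workers: int) -> str:
    """Report whether the two components cooperate for these inputs."""
    try:
        return f"ok: {split_evenly(total_items, workers)}"
    except ZeroDivisionError:
        # The types matched; the unstated contract about 0 did not.
        return "failed: callee does not accept 0"
```

Calling try_split(10, 0) fails even though every call is well typed. Someone has to decide which side is wrong and adjust it, and that decision lies outside anything the type checker knows.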

Components utopians hope that this override mechanism is not needed. They leave it out, thus causing every possible form of conflict to be a show stopper. They notice that different components have been compiled against different versions of each other, and they conclude that their system has to allow different versions of components to be simultaneously installed. They notice that components might sometimes make conflicting changes to a global namespace, and thus conclude that you cannot have any global namespace at all. The list of possible sources of incompatibility is unending, and thus components utopians explore an empty design space. No system can possibly prevent all harms.

Therefore, a components system should include an override mechanism from the beginning! You immediately arrive at a working (though not necessarily convenient) system, because any compatibility problem can be resolved via the human override mechanism. Further, the ultimatums that components utopians face turn into design possibilities. Instead of being obliged to support multiple simultaneous versions of packages, one is merely able to support them. Whether to do so becomes a matter of trade-offs. Is it more hassle to support multiple simultaneous versions, or to manually fix up the components so they can all work with one canonical "current" version? The answers to such questions are not obvious.
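As a rough illustration of what "an override mechanism from the beginning" might look like, here is a minimal sketch of a conflict check that allows only one installed version per package, but lets a human packager vouch for a specific pairing. Everything in it (the Package class, find_conflicts, the overrides set, the package names) is an assumption made for this example, not a description of any real packaging tool.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    name: str
    version: str
    # Maps a dependency name to the version this package was built against.
    requires: dict = field(default_factory=dict)


def find_conflicts(packages, overrides=None):
    """Return unresolved (package, dependency, wanted, installed) tuples.

    `overrides` is a set of (package, dependency) pairs that a human has
    checked or patched by hand; those pairings are accepted as-is.
    """
    overrides = overrides or set()
    installed = {p.name: p.version for p in packages}
    conflicts = []
    for p in packages:
        for dep, wanted in p.requires.items():
            have = installed.get(dep)
            if have == wanted or (p.name, dep) in overrides:
                continue
            conflicts.append((p.name, dep, wanted, have))
    return conflicts


# Hypothetical universe: one canonical libfoo, two clients built against
# different versions of it.
packages = [
    Package("libfoo", "2.0"),
    Package("editor", "1.3", {"libfoo": "2.0"}),
    Package("viewer", "0.9", {"libfoo": "1.0"}),
]
```

Without overrides, find_conflicts(packages) reports that viewer wants libfoo 1.0 while 2.0 is installed. Once a packager has fixed up viewer and records ("viewer", "libfoo") as an override, the same call reports nothing. Supporting two simultaneous versions of libfoo would be another way to clear the conflict, but here it is an option rather than a requirement.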

The package universes architecture incorporates an override mechanism based on human packaging efforts. Packages in Debian/unstable work together because humans do not post a package until they have made any small corrections that are necessary for the package to coexist with existing packages in the repository. Carefully written packages are careful with their use of the filesystem's namespace and are careful in how they update their API, and thus are installable in many Linux distributions with relatively minor repackaging efforts. Sloppily written packages are more difficult to work with, but they can still be posted -- the repackaging effort is simply greater.

The package universes architecture by itself neither specifies nor relies on any of the interesting mechanisms that component system designers have devised--it only insists that there is some notion of installing and uninstalling a package. However, the architecture can very well take advantage of them. It becomes a question of optimization: better component systems reduce the effort of repackaging, but you can get by with a poor one.