This week in Anaconda #2

Posted on October 22, 2010 by Chris Lumens in work.

The surprise controversy this week was that I removed /usr from the list of default mount points in the UI. That doesn’t mean you can’t still use it. It just means you have to type it in manually as opposed to choosing it from a drop down. The rationale is that Fedora in general does a pretty bad job of supporting /usr on its own mount point. We do such a bad job that even the install guide recommends against it.

Well, someone actually read the anaconda changelog (which is probably the most surprising of all) and decided to comment. The whole mail thread was focused on whether or not Fedora as a whole should allow /usr on its own partition so it was really only tangentially about anaconda. We largely stayed out of the conversation and it kind of died without any real conclusion. The more interesting parts of the thread were about per-user /tmp which doesn’t really have anything to do with the initial post.

Personally, I feel that a separate /usr is an historical relic and a niche use case, but again I haven’t done anything to prevent you from using it. And if Fedora one day has proper support, I can always add it back into the drop down as a default suggestion.

There is only tsort

dlehman dropped a big pile of patches on the list yesterday that I am pretty excited about. This is real deep down internal stuff that no user will ever notice, but is important stuff. Several Fedora releases ago, he led us in a massive storage rewrite that involved getting rid of almost all of our old crufty storage code and replacing it with something much more OO in design. You may remember this as a time of great upheaval and bug reports as we figured out what we were doing. But in the end, I think it’s resulted in a much higher quality partitioning system.

Well one of the main abstractions added in the storage rewrite was the concept of an Action. There are lots of kinds of Actions (as you can see for yourself), but they really come down to creating, deleting, and resizing filesystems and partitions. Despite all the fancy technologies these days, that’s really all the storage code does.

Now, Actions are not done as soon as they are requested. When you tell anaconda to create a /dev/sda1 partition of 500 MB and put /boot on it, it doesn’t go off and do that right away. It creates at least two actions (a create for /dev/sda1 and a create for the /boot filesystem) and adds them to a tree of actions to be performed. It’s only after you do all your other partitioning selections and press “Write to Disk” that the Actions get performed. And of course, it’s not quite that simple. As you can imagine, Actions have some specific order to them. You can’t create /boot before you create /dev/sda1. Creating /boot might also require deleting whatever was there beforehand. This ordering can get quite complex in a hurry, depending on what you’ve got on the disks and what you tell anaconda to do. RAID and encryption add a whole new layer of Actions on top, for instance.
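To make the deferral concrete, here is a minimal sketch in Python 3. It is not anaconda's actual code: the class names, the `register` and `process` methods, and the use of a plain list instead of a tree are all assumptions made for illustration.

```python
class Action:
    """Hypothetical stand-in for a storage action; not anaconda's real class."""

    def __init__(self, description):
        self.description = description
        self.done = False

    def execute(self):
        # A real action would touch the disk here; the sketch only flips a flag.
        self.done = True


class PendingActions:
    """Collects actions and runs them only when explicitly told to."""

    def __init__(self):
        self._actions = []

    def register(self, action):
        # Requesting an action only records it; nothing is executed yet.
        self._actions.append(action)
        return action

    def process(self):
        # The "Write to Disk" moment: run everything that was recorded.
        for action in self._actions:
            action.execute()
        self._actions.clear()


pending = PendingActions()
make_partition = pending.register(Action("create partition /dev/sda1"))
make_boot = pending.register(Action("create /boot filesystem on /dev/sda1"))

requested_only = (make_partition.done, make_boot.done)   # nothing has run yet
pending.process()
after_write = (make_partition.done, make_boot.done)      # now both have run
```

The sketch runs actions in the order they were requested, which is exactly the part that is not good enough in practice and is why a proper sort is needed.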

Figuring this order out requires a sort function. Unfortunately, our existing Action sorting function is pretty rough. We were in a hurry for a lot of the storage rewrite towards the end, so the sort function just sort of grew and grew until it became really hard to follow. Also, as I have implied by calling it a “sort function”, it was really just one giant function instead of using proper OO design to separate out the sorting concepts into each individual Action class. The latter is how you commonly see things designed, especially in Python, where you give every class a __cmp__ method.

dlehman’s patches finally kill the horrors of the old pruneActions and sortActions, moving the logic of action comparison into requires() and obsoletes() methods on each Action class. It’s then a simple matter of using the One True Sort - the topological sort (or tsort) - on all the Actions and running the result. tsort is a very well understood sorting algorithm that converts a dependency graph (in our case, the tree of Actions) into a list in which every Action comes after the ones it requires, so it does exactly what we want.
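For anyone who has not met tsort before, the idea can be sketched in a few lines of Python 3. This is an illustration only and not the code from the patches: the `Action` class here is a toy, the `needs` argument is made up, and `requires()` simply looks the other action up in that list. The sort itself is Kahn's algorithm, one standard way to do a topological sort. Pruning via obsoletes() is left out.

```python
from collections import deque


class Action:
    """Toy action; `needs` is a hypothetical explicit list of prerequisites."""

    def __init__(self, kind, target, needs=()):
        self.kind = kind
        self.target = target
        self.needs = tuple(needs)

    def requires(self, other):
        # True if `other` has to run before this action.
        return other in self.needs

    def __repr__(self):
        return f"{self.kind} {self.target}"


def tsort(actions):
    """Order actions so each one comes after everything it requires."""
    # Count the unmet prerequisites of every action.
    unmet = {a: sum(1 for o in actions if o is not a and a.requires(o))
             for a in actions}
    ready = deque(a for a in actions if unmet[a] == 0)
    ordered = []
    while ready:
        done = ready.popleft()
        ordered.append(done)
        # Anything that was waiting on `done` is one step closer to ready.
        for a in actions:
            if a is not done and a.requires(done):
                unmet[a] -= 1
                if unmet[a] == 0:
                    ready.append(a)
    if len(ordered) != len(actions):
        raise ValueError("actions require each other in a cycle")
    return ordered


destroy_old = Action("destroy format on", "/dev/sda1")
create_part = Action("create partition", "/dev/sda1", needs=[destroy_old])
create_boot = Action("create /boot filesystem on", "/dev/sda1",
                     needs=[create_part])

# Requested in the "wrong" order on purpose.
ordered = tsort([create_boot, create_part, destroy_old])
```

Here `ordered` comes out as destroy, then create partition, then create filesystem, regardless of the order the actions were requested in. The point of the design is that the sort knows nothing about storage; all the storage knowledge lives in each class's `requires()`.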

The patch set introduces many more lines than it removes, but you need to look closer. The majority of the additions are test cases, while the actual code size shrinks. I also expect this to fix up a lot of bugs where we try to make things out of order. It should at least make them much easier to spot.

A crazy new idea

What’s a work week without some big new idea that will change everything around? This time, it’s killing the split media installs. Split media is simply the six or seven disc CD set. You may be wondering what could possibly be wrong with split media and if we’re just bored and looking for code to delete. While it’s true that we’re always looking for code to delete (fewer lines = fewer bugs), split media does have some serious problems.

First, it causes problems for anaconda. When you install packages, they get grouped together into an RPM transaction. Within that transaction, we can check that there aren’t two packages with the same files (a file conflict), that you have enough space to install everything in the set, and so on. Installing from the DVD is one single large transaction. Every use of yum to install on the command line is an individual transaction. CDs complicate the situation. Because each CD only contains some of the packages, each CD ends up being its own transaction.

This is a big problem. It means that if a package on CD#2 has the same file as a package from CD#1, we can’t detect that until we’re processing CD#2. Or, we could be in the middle of installing from CD#5 when you run out of disk space. Oops - nothing we can do about it now except tell you to reboot and pick less stuff next time. There’s a bug for this somewhere, but I cannot find it.
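The difference between one big transaction and one transaction per CD can be sketched as follows, in Python 3. This is a toy model, not how RPM or yum actually implement the check: the `find_conflicts` function, the package names, and the file lists are all invented for illustration.

```python
def find_conflicts(transaction, installed=()):
    """Return (path, first_owner, second_owner) for every path claimed twice.

    `transaction` and `installed` are lists of (package_name, set_of_paths).
    """
    owners = {}
    conflicts = []
    for name, files in list(installed) + list(transaction):
        for path in sorted(files):
            if path in owners and owners[path] != name:
                conflicts.append((path, owners[path], name))
            else:
                owners[path] = name
    return conflicts


# Two made-up packages that both ship the same file.
cd1 = [("foo", {"/usr/bin/foo", "/usr/share/common.dat"})]
cd2 = [("bar", {"/usr/bin/bar", "/usr/share/common.dat"})]

# DVD: everything is in one transaction, so the conflict is found
# before a single package has been installed.
dvd_conflicts = find_conflicts(cd1 + cd2)

# Split media: the CD#1 transaction looks clean on its own...
cd1_conflicts = find_conflicts(cd1)

# ...and the conflict only shows up while processing CD#2,
# after everything from CD#1 is already on disk.
cd2_conflicts = find_conflicts(cd2, installed=cd1)
```

The same conflict is reported in both cases; what changes is when. With the DVD you hear about it up front, with CDs you hear about it halfway through the install. The disk space check suffers from the same timing problem.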

Second, it causes problems for rel-eng. The above transaction stuff gets even more annoying when you consider that certain packages have scripts that run at the end of the transaction and expect everything needed for the script to run to have been installed. A prime example of this is the kernel and dracut. In order to make the initrd so you can boot after installation, every program needed to access your root filesystem must have been installed as part of the same transaction as the kernel and dracut. This includes LVM (default partitioning does LVM, after all), perhaps RAID tools, perhaps encryption tools, perhaps multipath tools, and so on. If all that stuff isn’t on CD#1, you are left with a system that doesn’t boot.

The fix is to have the distribution composing tools force certain packages to be on CD#1. And any time you have a hard-coded list in a program, you’re asking for maintenance troubles in the future. There’s also a cascading problem here where forcing some packages onto CD#1 makes that disc larger than the 700 MB CD size limit. Hopefully some packages can spill over to CD#2, but this may not be possible due to package dependencies. In that case, the only option is to remove some packages from the release media.
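A rough sketch of that pinning logic, in Python 3, shows where the trouble comes from. It is not the real compose tooling: the `split_media` function, the greedy fill strategy, and every package size below are assumptions made up for the example, and dependencies between packages are ignored entirely.

```python
CD_CAPACITY_MB = 700


def split_media(packages, pinned, capacity=CD_CAPACITY_MB):
    """Greedily fill discs, putting pinned packages on the first one.

    `packages` is a list of (name, size_mb); `pinned` is the hard-coded set
    of names that must land on CD#1.
    """
    first = [p for p in packages if p[0] in pinned]
    rest = [p for p in packages if p[0] not in pinned]
    if sum(size for _, size in first) > capacity:
        # The cascading problem: the forced packages alone overflow CD#1.
        raise ValueError("pinned packages do not fit on CD#1")

    discs = [[]]
    used = 0
    for name, size in first + rest:
        if size > capacity:
            raise ValueError(f"{name} does not fit on any disc")
        if used + size > capacity:
            # Current disc is full; start the next one.
            discs.append([])
            used = 0
        discs[-1].append(name)
        used += size
    return discs


# Sizes are invented purely to exercise the function.
packages = [("bigapp", 400), ("kernel", 100), ("dracut", 50),
            ("lvm2", 50), ("games", 300)]
pinned = {"kernel", "dracut", "lvm2"}

discs = split_media(packages, pinned)
```

Even in this simplified form, the `pinned` set is a list somebody has to remember to update every time a new tool becomes necessary for booting.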

We always hit these problems at the very last minute, too. It’s just time for split media to go. We currently offer the complete DVD, a single live install CD, and a minimal boot.iso that can be used to network install. My hope is that the split media user base is getting smaller all the time as they look at these other methods that require fewer discs.


As a followup to last week’s edition, it’s worth mentioning that I committed my series of install.img-killing patches. So, all rawhide trees and F15 trees will ship without that file. Don’t panic if you don’t see an install.img.