This week in anaconda #1

Posted on October 15, 2010 by Chris Lumens in work.

This is my first installment of This Week in Anaconda, where I try to explain what we’ve been up to and why things are the way they are. I hope you find it interesting, if not exactly entertaining. Hopefully I’ll get better with this over time.

anaconda is ugly and unusable

The most exciting thing to happen this week in anaconda land was the fedora-devel thread where it was basically suggested we just switch to the Ubuntu installer. I’m not going to get too into this here because I think it’s already been discussed to death.

Basically, there’s no way we will be switching to anyone else’s installer. First, we have a whole team working on anaconda that represents decades of experience on this code base. Why would we just throw that away? Second, Fedora uses RPMs whereas Ubuntu uses .debs, so switching installers would still involve a lot of work to get the new one working with yum. Third, anaconda is an extremely capable installer. I don’t think most people understand this. We install on everything from netbooks to mainframes. You can install off a DVD, NFS, URL, or hard drive source. You can do RAID, LVM, iSCSI, multipath, FCoE, and a whole lot more. Oh, and you can completely automate the entire process. Fitting all that stuff into someone else’s code is not exactly a trivial task. I estimate it would take at least a full year to get it up to speed, and what do people do in the meantime? Just not install with any of that support?

I’ll give you the point that anaconda is ugly. I’ve thought it needs a UI overhaul for years now, and we might finally be getting one. My recent changes to run in full screen mode (it’s 2010, what took so long?) really show off how much work we have ahead of us. But this is all fixable stuff. It doesn’t require throwing out the whole code base just to make it look and feel nicer.

One point that came up was why we support all these installation paths in the same program. Well, whatever we do it’s not going to involve maintaining one installer for enterprise users, one installer for laptop users, one installer for livecds, and so on. Maintaining RHEL5, RHEL6, F14, and rawhide (which is at least two, perhaps even three installers) is hard enough without adding a multiplier.

To be fair, anaconda already does kind of do multiple experiences from the same code base. If you do a livecd install, you don’t get any package selection. If you choose to only use basic storage devices, you don’t get any of the filter UI. If you don’t choose custom partitioning, you don’t get a bootloader screen. If you do a text install, you don’t get much of anything. Please don’t do a text install - use VNC instead. See, there’s another thing anaconda supports that I doubt many other installers do as well. We’ve also got install classes, which allow us to decide which steps to skip or show. anaconda searches for install classes in the tree it’s installing, so we can leave it up to the spin to decide.
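To make the install class idea a bit more concrete, here is a minimal Python sketch of how a product could declare which steps to skip. This is not anaconda’s actual code: the class names, the step names, and the find_install_class helper are all made up for illustration.

```python
# Hypothetical sketch of the install class idea. None of these names
# are taken from anaconda's real code.

ALL_STEPS = ["language", "storage_filter", "partitioning", "bootloader",
             "package_selection", "install"]


class BaseInstallClass:
    """Default product: show every step."""
    name = "base"
    skip_steps = frozenset()

    def steps(self):
        # Keep the global ordering, drop whatever this class skips.
        return [s for s in ALL_STEPS if s not in self.skip_steps]


class LiveSpinInstallClass(BaseInstallClass):
    """A live image has nothing to choose, so hide package selection."""
    name = "live-spin"
    skip_steps = frozenset({"package_selection"})


def find_install_class(available, wanted):
    """Return an instance of the class named `wanted`, else the base class."""
    for cls in available:
        if cls.name == wanted:
            return cls()
    return BaseInstallClass()


if __name__ == "__main__":
    found = [BaseInstallClass, LiveSpinInstallClass]
    print(find_install_class(found, "live-spin").steps())
```

The point of the design is that the step list lives in one place and each product only states what it wants to leave out, so a spin can change the experience without touching the installer itself.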

So we’ve got a lot of capabilities here. We just need to come up with a new UI design, implement it, and tell distributors what they can do to customize the experience to the product.

No more install.img patches

I dropped a huge set of patches on the list this week that will really change installation all around. Understanding why is going to take a little back story. For as long as I can remember, anaconda has been broken up into two parts: the loader, which is a C-based program in an initrd, and stage2, which is the familiar graphical interface. stage2 has been shipped in various forms over the years, but right now it’s in install.img.

The reason for this split has been space savings. The initrd gets loaded by the kernel at bootup and sits in memory. It only contains enough stuff to find and download the much larger stage2 image. How it finds stage2 can be modified by the askmethod, stage2=, and repo= parameters in addition to booting off the DVD or CD#1. In total, it makes for some very complicated and error-prone code.
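As a rough illustration of the kind of decision the loader has to make, here is a small Python sketch that picks a stage2 source from the boot parameters. The real loader is written in C and handles far more cases. The function names and the order in which the parameters are checked are assumptions for the example, not a description of the actual code.

```python
# Hypothetical sketch: choose where to look for stage2 based on the
# kernel command line. The precedence order below is an assumption.

def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of key -> value.

    Bare flags such as 'askmethod' map to True.
    """
    args = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        args[key] = value if sep else True
    return args


def stage2_source(cmdline):
    """Return a (method, location) tuple describing where to find stage2."""
    args = parse_cmdline(cmdline)
    if "askmethod" in args:
        return ("ask", None)            # prompt the user for a source
    if "stage2" in args:
        return ("stage2", args["stage2"])
    if "repo" in args:
        return ("repo", args["repo"])   # look for stage2 inside the repo
    return ("media", None)              # fall back to the boot DVD/CD


if __name__ == "__main__":
    print(stage2_source("quiet repo=http://example.com/fedora/14/os"))
```

Even this toy version has four branches, and each one leads to its own fetching and error handling in the real thing.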

Over time, the initrd has grown and grown. This is because we try to use system components wherever possible and things like NetworkManager are necessarily much larger than our own special-purpose networking code. Also, software just tends to get larger over time. Add to this the complexity of never having the full system available when we boot, and you can see why we have wanted to merge the initrd and stage2 images together for a long time.

Well I finally did it. Basically, I first found a bunch of stuff we’re doing that is unnecessary and removed it to save space. This includes shipping a 30 MB locale-archive instead of a 100 MB one, removing icon archives, and removing certain X plugins. Then, I just merged everything that was left in install.img into the initrd. Of course, the code to do all that was much larger but it did involve removing more lines than I added.

So what does this mean for the user? First, the install.img is gone and the initrd.img is much larger. Instead of 30 MB, it’s more like 100 MB. Some tftp implementations may have trouble with this. That’s about the only downside I can see. The overall set of stuff to download is significantly smaller which should make for faster downloads and reduced memory requirements. It gets rid of the obnoxious stage2= parameter that you never should have been exposed to in the first place. And most importantly, it means less code which means fewer bugs. It frees up developer time to focus on more important stuff, and it begins to make me think about getting rid of the loader entirely since we now have python in the initrd.

One final note - I originally wanted to move translations out of the initrd entirely and into their own language-specific images, but I was unsuccessful in that. The splitting out patch was easy but fetching was difficult. If you do a pxeboot and don’t give askmethod or repo=, anaconda uses the Fedora mirror system, but we don’t have that mirror information until much later, at package selection. We may also need to bring up the network to fetch the language image, but we need to be able to prompt for the network in the user’s native language. Oh well. I’m hoping to get some new ideas for this after pushing the first patch set.

F14 blocker hunt

F14 is almost over which means it’s time to panic over blocker bugs. In the past week, everyone on the team has been putting in a lot of time knocking bugs off the list. We luckily haven’t had too many this release but we also haven’t done much besides just move files around.

akozumpl cranked through a couple of annoying X-related bugs, including one where hitting the up arrow tried to run gnome-screenshot. Oops. That one was new, while the other bug was caused by not cherry-picking a fix from rhel6-branch onto master. Process is hard.

I did a quick fix to make sure to bring up the network before saving a bug report via any method. Back when we used python-meh for this, we could prompt just before save-to-bugzilla or save-to-scp. But using report meant a loss of some coupling, so now we can only prompt before even asking to save. This wasn’t a very exciting patch.

As usual, there was a ton of storage stuff. dlehman and pjones worked on a whole series of patches to a pile of components to allow anaconda to set the UUID on mpaths and RAID devices. bcl fixed rescanning disks when going backwards from the upgrade screen, which is the sort of bug we see a whole lot of. And finally, dlehman also made sure we add RAID containers before RAID members.