"Linux is bloated." That's the reaction many have when they see the six CD-ROM set of Red Hat Linux Deluxe 7.1 alongside the single CD of Windows 2000 Professional, its closest proprietary competitor. This is a misconception; a typical GNU/Linux distribution includes much more functionality than a Windows distribution. (But just try to outmarket Microsoft.) Even so, I've figured out a slick way to reduce the binary footprint of a distribution:
For example, Red Hat for a given architecture has two or three CDs of binaries and one CD of SRPMs (source packages). Why not take advantage of the fact that operating system distributions containing software licensed under the GNU General Public License will ship with source code and a compiler? Ship binaries only of the kernel, the compiler, essential system libraries, and a bare-bones userland compressed with UPX so that it decompresses itself at runtime. Ship the rest of the distribution only as tightly compressed source code, which saves one CD right off the bat. This also allows one set of CDs to work to some extent on multiple architectures. When installing the OS, copy this minimal binary distribution to the destination partition. While the files are being copied, compile the basic packages necessary for a working command line system; don't optimize beyond -O at this stage.
When you restart your box, you'll have a barely usable system. The compiler will generate the rest of the system in the background, using unused CPU time à la distributed.net. Until it's done, you can play around on the command line, go to work, or let your box sit overnight (assuming a recent Athlon system; you do not want to build Linux from scratch on a 486) while the applications that you told the installer you want first get built. If you try to start an app that isn't compiled yet, it will be pushed to the front of the queue. By the time everything's done, every package will be compiled and optimized especially for your processor's microarchitecture, producing a faster overall system. If you told the installer that you don't want to keep source around, the source will then be deleted.
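The "pushed to the front of the queue" behaviour can be sketched as a small priority queue. This is only an illustration in Python, not anything an existing installer does: the class name BuildQueue, its methods, and the package names are all hypothetical, and it assumes that several bumped apps are served in the order they were requested.

```python
import heapq
import itertools


class BuildQueue:
    """Hypothetical queue of packages still waiting to be compiled."""

    FRONT, NORMAL = 0, 1  # lower number is compiled sooner

    def __init__(self, packages):
        # `packages` is the order the user chose in the installer.
        self._counter = itertools.count()  # tie-breaker keeps insertion order
        self._heap = []
        self._live = {}    # package name -> its currently valid heap entry
        self.built = set()
        for name in packages:
            self._push(name, self.NORMAL)

    def _push(self, name, priority):
        entry = [priority, next(self._counter), name]
        self._live[name] = entry
        heapq.heappush(self._heap, entry)

    def request(self, name):
        """User tried to start `name`. Return True if it is already built;
        otherwise move it to the front of the queue and return False."""
        if name in self.built:
            return True
        if name in self._live:
            self._live[name][2] = None      # mark the old entry as stale
            self._push(name, self.FRONT)
        return False

    def next_package(self):
        """Return the next package to compile, or None if nothing is left."""
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            if name is not None:            # skip stale entries
                del self._live[name]
                return name
        return None

    def mark_built(self, name):
        self.built.add(name)
```

A background builder would loop on next_package() whenever the CPU is idle and call mark_built() after each successful compile; a launcher wrapper would call request() before starting an app.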
The general concept for this already exists as the Linux From Scratch distribution. All that remains is to automate the build process and put the distribution on CDs; an effort to do this is underway (http://alfs.linuxfromscratch.org). (In January 2002, I discovered that Gentoo Linux (http://www.gentoo.org/) and Sorcerer GNU Linux (http://sorcerer.wox.org/) had accomplished this.)
As the others mentioned, another way to eliminate bloat is to eliminate unnecessary packages. Do you really need fifteen text editors, when most of the features you need can be found in Nano or Emacs? There are versions of Slackware that fit on 15 floppies (thanks Gone Jackal). Even if you don't want to go that far, you don't need workstation packages in a server distribution or server packages in a workstation distribution (unless the workstation is the server). Why include an office suite and its associated clip art when all you want is the basic userland, Apache, MySQL, Perl, and the Everything Web System? If you're building a firewall for NAT, you need even less; xerces told me about Linux Router Project and Tom's Really Tiny Boot Disk.
Is it bloat to include multiple packages of the same purpose (KDE vs. Gnome, sendmail vs. exim, bison vs. yacc, shell fun such as ksh zsh bash tcsh ash sash), or is that just including a wide set of flavors for users? I've always thought the large variety of choices is one of the things that makes Linux fun to use.
Perhaps it is bloat to include the source. (Ok, so it's a legal requirement. Does that make it not bloat?)
Of course, while this would be very neat, it isn't really possible on small-memory machines, and on high-end machines the whole install typically takes less than 5 minutes anyway, so perhaps it's just silly. I still think it'd be a cute trick.
Also, when you say the binaries are smaller, are you considering the binaries for every architecture, or just your favorite one? Wouldn't it be cool to have a distribution with minimal binaries for every architecture, and then source, and an automated compilation process for the rest?
Of course, this wouldn't work for packages that can't have their compilation process automated, but then, I don't think that really applies to most things that have RPMs.
As to the CD whirring all night or the HD being full, the obvious solution is to copy the source you need from the CD-ROM, compile it, and delete it before going on to the next package. I don't think this would be a big deal or a significant performance penalty.
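A rough sketch of that copy, compile, delete loop, in Python. Everything here is an assumption made for illustration: the function name build_one_at_a_time, the layout of one source directory per package on the CD, and the build callback, which stands in for whatever actually runs the compiler and installs the result.

```python
import shutil
import tempfile
from pathlib import Path


def build_one_at_a_time(cd_root, packages, build):
    """Copy each package's source off the CD, build it, then delete the copy.

    Only one package's source is on the hard disk at any moment.
    `build` is a caller-supplied function taking the source directory.
    """
    results = {}
    for name in packages:
        workdir = Path(tempfile.mkdtemp(prefix=f"build-{name}-"))
        try:
            src = workdir / name
            shutil.copytree(Path(cd_root) / name, src)  # CD -> hard disk
            results[name] = build(src)                  # compile and install
        finally:
            # Free the disk space before touching the next package,
            # even if the build raised an exception.
            shutil.rmtree(workdir, ignore_errors=True)
    return results
```

The CD is read once per package, in one sequential copy, so the drive does not have to spin for the whole compile.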
The argument about instant usability, and about how slow this would be on many machines, is a very good point. It costs less than a dollar to produce and mail two CD-ROMs anyway, so what's the savings?