April 2006
Part I of aggregate optimization and tuning covered the basic steps needed to recompile four key pieces of system software: the Linux kernel, the Perl interpreter, the bash interpreter and the GNU Compiler Collection (gcc). This second part examines tuning and installing the Linux kernel on a Debian distribution, along with general ideas about test cases.
Tuning the Linux kernel is well documented on the internet [ 1 ]. Here, only the process itself and a few examples of what was changed are addressed.
The following packages are needed to recompile a supported Debian source tree [ 2 ].
Plus, of course, the source tree that will be used [ 3 ].
The basic process is:
cd /usr/src/linux-source-2.6.xx
make-kpkg clean
make menuconfig
vi Makefile
fakeroot make-kpkg --initrd --revision=custom.x.x kernel_image
dpkg -i ../linux-image-2.6.xx-j01_custom.x.x_i386.deb
When editing the Makefile, an extra version string can be added
(in this case -j01) in order to keep the different kernels
distinct. Also note the custom.x.x revision, which is used by
Debian and should be incremented as well. If there are any problems
with the kernel image, it can be removed later via dpkg.
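The extra version described above lives on the EXTRAVERSION line near the top of the kernel Makefile. As a minimal sketch, assuming GNU sed and a suffix that contains no slashes, the edit could be scripted instead of done in vi. The helper name set_extraversion is made up for illustration and is not part of make-kpkg.

```shell
#!/bin/bash
# Sketch only: set EXTRAVERSION in a kernel Makefile without an editor.
# set_extraversion is a hypothetical helper name, not a make-kpkg feature.
set_extraversion() {
    local makefile=$1 suffix=$2
    # Rewrite the whole "EXTRAVERSION = ..." line in place (GNU sed -i).
    sed -i "s/^EXTRAVERSION *=.*/EXTRAVERSION = ${suffix}/" "$makefile"
}

# Example, run from inside the source tree:
#   set_extraversion Makefile -j01
#   grep '^EXTRAVERSION' Makefile
```

Checking the result with grep before building is cheap insurance, since the suffix ends up in the package name.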
Some of the changes removed unused filesystems, entire subsystems that are not needed (such as hamradio) and drivers that were built in by default but not required. The result was a smaller kernel (about half the size); however, the ultimate goal is to see how, if at all, compiled-in optimization helps.
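One rough way to see how much was trimmed is to count the options left in the .config before and after running menuconfig. The sketch below assumes the usual .config layout, where enabled options read CONFIG_NAME=y (built in) or CONFIG_NAME=m (module). The helper name config_summary is hypothetical.

```shell
#!/bin/bash
# Sketch only: summarize a kernel .config so two configurations can be compared.
# config_summary is a hypothetical helper name.
config_summary() {
    local cfg=$1
    local built_in modules
    built_in=$(grep -c '^CONFIG_.*=y$' "$cfg")   # options compiled into the kernel
    modules=$(grep -c '^CONFIG_.*=m$' "$cfg")    # options built as loadable modules
    echo "built-in=$built_in modules=$modules"
}

# Example:
#   config_summary /boot/config-$(uname -r)   # the running kernel's config on Debian
#   config_summary .config                    # the trimmed configuration
```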
As with the scope of the software, the test cases will be limited by both scope and type. The problem to tackle is not what the cases should do but writing them. There are many high-quality load creation tools available, but since the scope is limited and there is specific software to test, the cases will have to be written from scratch.
There are a variety of subsystems that could be tested, but for testing compiler optimization (and to keep the testing for a small series like this down :) two areas relative to processor usage and speed can be examined: context switches and calculations.
Some subsystems, such as virtual memory, are already taking a hit because the recompiled software uses a larger area of memory for the optimizations. There is a paradox of sorts when optimizing software with methods like unrolling loops: is it possible to reach a point of diminishing returns? The answer, of course, is certainly. The very reason this series has a limited focus is to avoid running into any brick walls and to see only whether optimization truly helps, even a little.
Handling context switches is not too difficult: to create a false load, a simple matter of insane recursion can be used. Calculations might prove a little more difficult. One approach is to use a simple calculation that requires a lot of work: division. Division is classically carried out by repeated shifting and subtraction, which makes it one of the more expensive arithmetic operations. The best approach is simple: combine the recursion with division in one routine. On each recursive hop, the function saves and changes context, deals with a division problem, then calls itself until the base case is reached.
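The recursion-plus-division idea can be sketched in bash, one of the interpreters recompiled for this series. This is only an illustration of the shape such a routine might take, not the actual test from the series; the function name recurse_divide, the divisor 3 and the starting values are arbitrary assumptions.

```shell
#!/bin/bash
# Sketch only: a recursive load routine that does one integer division per hop.
# recurse_divide, its arguments and the divisor are hypothetical choices.
recurse_divide() {
    local depth=$1 value=$2
    if [ "$depth" -le 0 ]; then
        echo "$value"    # base case: print the final quotient
        return
    fi
    # One division, then one more recursive hop.
    recurse_divide $((depth - 1)) $((value / 3))
}

# Example: five hops starting from 1000000
#   recurse_divide 5 1000000     # prints 4115
# Timing a deeper run:
#   time recurse_divide 1000 1000000 > /dev/null
```

The depth should stay modest, since bash can run out of stack on very deep recursion. Timing the same call under the stock and the recompiled interpreter gives a crude comparison.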
With a lighter, optimized kernel, the Perl and bash interpreters and an optimized compiler ready to go, the next part of the series will tackle what the tests look like and how they turned out.