October 2005

Different Kernel Designs Overview

Kernel terminology gets tossed about quite a bit. One of the more common topics regarding operating system kernels is the overall design, in particular how the kernel is structured. Generally, there are three major types of kernels: monolithic, microkernel and hybrid/modular.

Monolithic

A monolithic kernel is one single program that contains all of the code necessary to perform every kernel-related task. Most UNIX and BSD kernels are monolithic by default. Recently, more UNIX and BSD systems have been adding the modular capability that is popular in the Linux kernel. The Linux kernel started off monolithic; however, it gravitated towards a modular/hybrid design for several reasons. The advantages of the monolithic kernel hinge on these points:

  • Since there is less software involved, it is faster.
  • As it is one single piece of software, it should be smaller in both source and compiled forms.
  • Less code generally means fewer bugs, which can translate to fewer security problems.

Those points depend on how well the software is written in the first place. It can be assumed that a stable kernel that has modular capability added to it will, of course, grow both in raw code size and in internal communication overhead.

Most work in the monolithic kernel is done via system calls. These are interfaces, usually kept in a tabular structure, that access some subsystem within the kernel, such as disk operations. Essentially, programs make the calls and a checked copy of the request is passed through the system call to the subsystem. Hence, the request does not have far to travel at all.
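The table-driven dispatch described above can be sketched in a few lines of Python. This is a toy model rather than how any real kernel is written: the call numbers, the handler names (sys_read, sys_write) and the argument names are all assumptions made for illustration.

```python
import copy

# Hypothetical subsystem handlers; real kernels implement these in C.
def sys_read(args):
    return f"read {args['count']} bytes from fd {args['fd']}"

def sys_write(args):
    return f"wrote {len(args['data'])} bytes to fd {args['fd']}"

# The "tabular structure": call number -> (handler, required argument names).
SYSCALL_TABLE = {
    0: (sys_read, {"fd", "count"}),
    1: (sys_write, {"fd", "data"}),
}

def syscall(number, request):
    """Check the request, copy it once, and hand the copy to the subsystem."""
    if number not in SYSCALL_TABLE:
        raise ValueError(f"unknown system call {number}")
    handler, required = SYSCALL_TABLE[number]
    missing = required - set(request)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    checked = copy.deepcopy(request)  # the single checked copy of the request
    return handler(checked)
```

A call such as syscall(1, {"fd": 1, "data": "hello"}) goes through one check and one copy before it reaches the handler, which is the short trip the paragraph above refers to.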

The disadvantages of the monolithic kernel are the converse of its advantages. Modifying and testing monolithic systems takes longer than it does for their microkernel counterparts. When a bug surfaces within the core of the kernel, the effects can be far-reaching. Also, patching monolithic systems can be more difficult (especially for source patching).

Microkernel

The microkernel architecture is very different from the monolithic. The microkernel performs only the most fundamental tasks, such as accessing some (not necessarily all) of the hardware, managing memory and coordinating message passing between processes. Some systems that use microkernels are QNX and the HURD. In the case of QNX and the HURD, user sessions can be entire snapshots of the system itself, referred to as views. The very essence of the microkernel architecture illustrates some of its advantages:

  • Maintenance is generally easier. Patches can be tested in a separate instance, then swapped in to take over a production instance.
  • Rapid development time: new software can be tested without having to reboot the kernel.
  • More persistence in general: if one instance goes haywire, it is often possible to substitute it with an operational mirror.

Again, all of these points make certain assumptions about the code itself. Assuming the code is well formed, they should stand reasonably well.
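The swap-in and mirror-substitution ideas from the list above can be sketched as follows. This is a minimal illustration under stated assumptions: the Server and Supervisor classes and the instance names (fs-a, fs-b, fs-c) are hypothetical and do not correspond to any QNX or HURD interface.

```python
class Server:
    """Hypothetical user-space server, e.g. a file system service."""
    def __init__(self, name, healthy=True):
        self.name = name
        self.healthy = healthy

    def handle(self, request):
        if not self.healthy:
            raise RuntimeError(f"{self.name} has failed")
        return f"{self.name} handled {request}"

class Supervisor:
    """Routes requests to the active instance and keeps a mirror on standby."""
    def __init__(self, active, mirror=None):
        self.active = active
        self.mirror = mirror

    def handle(self, request):
        try:
            return self.active.handle(request)
        except RuntimeError:
            if self.mirror is None:
                raise
            # Substitute the failed instance with the operational mirror.
            self.active, self.mirror = self.mirror, None
            return self.active.handle(request)

    def swap_in(self, patched):
        """Replace the active instance with one that was tested separately."""
        old, self.active = self.active, patched
        return old
```

Neither operation touches the kernel itself in this model, which is why no reboot is involved.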

Most microkernels use a message passing system of some sort to handle requests from one server to another. The message passing system generally operates on a port basis with the microkernel. As an example, if a request for more memory is sent, a port is opened with the microkernel and the request is sent through it. Once within the microkernel, the steps are similar to system calls.
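The memory request example can be sketched as a toy port-based exchange. Everything here is an assumption made for illustration: the Microkernel class, the "alloc" operation and the message fields are hypothetical, and real systems such as QNX or Mach define their own message formats.

```python
from collections import deque

class Microkernel:
    """Toy microkernel that owns memory and exchanges messages over ports."""
    def __init__(self, free_pages):
        self.free_pages = free_pages
        self.requests = {}  # port number -> queue of pending requests
        self.replies = {}   # port number -> queue of replies
        self._next_port = 1

    def open_port(self):
        port = self._next_port
        self._next_port += 1
        self.requests[port] = deque()
        self.replies[port] = deque()
        return port

    def send(self, port, message):
        if port not in self.requests:
            raise KeyError(f"no such port: {port}")
        self.requests[port].append(dict(message))  # copy on the way in

    def dispatch(self):
        """Process every pending request, much like a system call would."""
        for port, queue in self.requests.items():
            while queue:
                self.replies[port].append(self._handle(queue.popleft()))

    def _handle(self, message):
        if message.get("op") == "alloc":
            pages = message["pages"]
            if pages <= self.free_pages:
                self.free_pages -= pages
                return {"ok": True, "pages": pages}
            return {"ok": False, "pages": 0}
        return {"ok": False, "error": "unknown op"}

    def receive(self, port):
        queue = self.replies[port]
        return queue.popleft() if queue else None
```

A client opens a port, sends {"op": "alloc", "pages": 5}, and reads the reply from the same port once the kernel has dispatched it. The extra queueing and copying is the longer trip that a direct system call avoids.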

The microkernel does have disadvantages, however. A few examples are:

  • Larger running memory footprint
  • More software for interfacing is required, so there is a potential for performance loss (note, however, that the QNX system is extraordinarily fast).
  • Messaging bugs can be harder to fix due to the longer trip messages have to take versus the one-off copy in a monolithic kernel.
  • Process management in general can be very complicated.

The disadvantages of microkernels are highly context-dependent. As an example, microkernels work well for small, single-purpose (and critical) systems: if not many processes need to run, the complications of process management are effectively mitigated.

Modular/Hybrid Kernels

Many traditionally monolithic kernels are now at least adding (if not actively exploiting) the module capability. The best known of these kernels is the Linux kernel. A modular kernel can have parts that are built into the core kernel binary and parts that are separate binaries loaded into memory on demand. It is important to note that a module containing tainted code has the potential to destabilize a running kernel. Many people confuse modular kernels with microkernels on this point. With a microkernel, it is possible to write a driver in a completely separate memory space and test it before going live. When a kernel module is loaded, by contrast, it accesses the monolithic portion's memory space and adds what it needs to it, thereby opening the door to possible pollution. A few advantages of the modular kernel are:

  • Faster development time for drivers that can operate from within modules. No reboot is required for testing (provided the kernel is not destabilized).
  • On-demand capability versus spending time recompiling a whole kernel for things like new drivers or subsystems.
  • Faster integration of third-party technology (related to development, but pertinent in its own right).

Modules generally communicate with the kernel using a module interface of some sort. The interface is generalized (although particular to a given operating system), so it is not always possible to use modules; device drivers often need more flexibility than the module interface affords. Essentially, it amounts to two system calls instead of one, and the safety checks that only have to be done once in the monolithic kernel may now be done twice.
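A generalized module interface might look something like the following sketch. The hook names (name, init, exit), the Kernel class and the NullDriver module are assumptions chosen for illustration and are not the Linux module API.

```python
class Kernel:
    """Toy kernel whose device table is shared with every loaded module."""
    REQUIRED_HOOKS = ("name", "init", "exit")

    def __init__(self):
        self.modules = {}
        self.devices = {}  # device path -> handler; shared kernel state

    def load_module(self, module):
        # Interface check performed before the module touches kernel state.
        for hook in self.REQUIRED_HOOKS:
            if not hasattr(module, hook):
                raise TypeError(f"module lacks required hook: {hook}")
        if module.name in self.modules:
            raise ValueError(f"module already loaded: {module.name}")
        module.init(self)  # module code now runs against shared kernel state
        self.modules[module.name] = module

    def unload_module(self, name):
        module = self.modules.pop(name)
        module.exit(self)

    def register_device(self, path, handler):
        if path in self.devices:
            raise ValueError(f"device already registered: {path}")
        self.devices[path] = handler

    def unregister_device(self, path):
        self.devices.pop(path, None)

class NullDriver:
    """Hypothetical driver module that discards whatever is written to it."""
    name = "null"

    def init(self, kernel):
        kernel.register_device("/dev/null", lambda data: 0)

    def exit(self, kernel):
        kernel.unregister_device("/dev/null")
```

Because init receives the kernel object itself, nothing in this model stops a faulty module from overwriting entries it does not own, which mirrors the pollution risk of sharing the monolithic portion's memory space.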

Some of the disadvantages of the modular approach are:

  • With more interfaces to pass through, there is a greater possibility of bugs (which implies more security holes).
  • Maintaining modules can be confusing for some administrators when dealing with problems like symbol differences.
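The symbol problem mentioned in the last point can be illustrated with a small check. This is a sketch of the general idea only: the symbol names are borrowed from Linux for familiarity, while the version tags and the unresolved_symbols helper are hypothetical.

```python
def unresolved_symbols(kernel_exports, module_imports):
    """Return the imports the running kernel cannot satisfy, with the reason."""
    problems = {}
    for symbol, wanted_version in module_imports.items():
        if symbol not in kernel_exports:
            problems[symbol] = "missing"
        elif kernel_exports[symbol] != wanted_version:
            problems[symbol] = "version mismatch"
    return problems

# Symbols the running kernel exports, with made-up version tags.
kernel_exports = {"printk": "v2", "kmalloc": "v1"}
# Symbols a module was built against.
module_imports = {"printk": "v1", "kmalloc": "v1", "old_api": "v1"}

problems = unresolved_symbols(kernel_exports, module_imports)
```

A module built against a different kernel shows up here with a non-empty result, which is the kind of mismatch an administrator then has to track down.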

Summary

Many of the advantages and disadvantages of any given design depend heavily on the context of the system itself. To generalize and say the monolithic kernel is always faster than its micro or modular counterpart is an oversimplification. There are many ways to illustrate this; one good way is to look at tuning factors.

A microkernel that is designed for a specific platform or device is only ever going to have what it needs to operate. A monolithic kernel, while initially loaded with subsystems that may not be needed, can be tuned to the point where it is as fast as or faster than the one that was specifically designed for the hardware, while remaining more general in purpose.

Another context to keep in mind is the purpose. Not all servers are general purpose, even though they can be. A small web server, for example, does not necessarily need massive in-machine redundancy. If it is small, it should be easy to duplicate on another system or even just restore quickly from backups. A database of medical records, however, is a completely different matter. If the product itself does not have clustering and failover capability, or you cannot afford it, using a modular or microkernel (server-based) system might be a better option.

In summary, there are lots of different kernel types and lots of different situations in which they can be employed. Hopefully this article has shed some light on the types and a few of the arguments surrounding them.