Thursday, July 30, 2009

KERNEL THREAD

A kernel thread consists of a set of registers, a stack, and a few corresponding kernel data structures. When kernel threads are used, the operating system keeps a descriptor for each thread belonging to a process and schedules all of the threads itself. Unlike processes, all threads within a process share the same address space. As with processes, when a kernel thread makes a blocking call, only that thread blocks. Almost all modern machines support kernel threads, most often via the POSIX threads interface, "pthreads". Some dedicated parallel machines, however, support kernel threads poorly or not at all. For example, the Blue Gene/L microkernel does not support pthreads.
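The following is a minimal sketch of the points above, not a definitive implementation. It assumes CPython 3.8 or later on a mainstream platform, where `threading.Thread` is backed by a native (kernel) thread and `threading.get_native_id()` returns the ID the kernel assigned to it. The names `run_workers`, `worker`, and `shared` are hypothetical and chosen for illustration only.

```python
import threading


def run_workers(n=4):
    """Start n threads that all write into one dictionary owned by the process."""
    shared = {}                     # lives in the single address space all threads share
    barrier = threading.Barrier(n)  # keeps every worker alive until all have reported

    def worker(index):
        # Record the kernel-assigned thread ID under this worker's own key.
        shared[index] = threading.get_native_id()
        # Blocking here suspends only the calling thread; the others keep running.
        barrier.wait()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared


if __name__ == "__main__":
    for index, native_id in sorted(run_workers().items()):
        print(f"worker {index} ran as kernel thread {native_id}")
```

No copying or message passing is needed for the main thread to see what the workers wrote, because all of them operate on the same memory. The barrier is only there so that the workers overlap in time, which guarantees that the kernel cannot reuse one thread ID for two of them.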

The purported advantage of kernel threads over processes is faster creation and context switching. On shared-memory multiprocessor architectures, the kernel can dispatch the threads of one process on several processors, which leads to automatic load balancing within a node. For parallel programming, threads allow different parts of the parallel program to communicate by directly accessing each other's memory, which allows very efficient, fine-grained communication.


Kernel threads share a single copy of the entire address space, including regions such as global data that may cause conflicts if used by multiple threads simultaneously. Threads can also share data unintentionally, which leads to corruption and race conditions. To avoid this, programs must often be modified either to lock common data structures or to access separate copies of them. Several very widely used language features are unsafe with threads, such as global and static variables, or the idiom of returning a reference to a static buffer. This makes kernel threads very difficult to use, especially in large existing codebases with many global variables, because most implementations of kernel threads cannot assign each thread a private set of global variables.
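The two remedies mentioned above, locking common data and giving each thread its own copy, can be sketched as follows. This is an illustrative example only, assuming CPython's `threading` module as a stand-in for pthreads; the function names `count_with_lock` and `count_with_private_copies` are hypothetical.

```python
import threading


def _run_all(threads):
    """Start every thread, then wait for all of them to finish."""
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def count_with_lock(n_threads=4, increments=10_000):
    """Remedy 1: every update to the shared total happens while holding a lock."""
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(increments):
            with lock:       # only one thread at a time may read-modify-write total
                total += 1

    _run_all([threading.Thread(target=worker) for _ in range(n_threads)])
    return total


def count_with_private_copies(n_threads=4, increments=10_000):
    """Remedy 2: each thread works on its own copy; results are merged at the end."""
    partial = [0] * n_threads    # one slot per thread, never written by another thread

    def worker(index):
        local = 0                # private to this thread's stack frame
        for _ in range(increments):
            local += 1
        partial[index] = local   # single write to this thread's own slot

    _run_all([threading.Thread(target=worker, args=(i,)) for i in range(n_threads)])
    return sum(partial)


if __name__ == "__main__":
    print(count_with_lock())
    print(count_with_private_copies())
```

The locked version keeps the original shared variable but serializes every access to it. The private-copy version avoids contention entirely, at the cost of restructuring the code so that each thread knows which copy is its own, which is exactly the kind of modification that is painful in a large codebase built around global variables.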

Kernel threads are considered "lightweight," and one would expect the number of threads to be limited only by address space and processor time. Since every thread needs only a stack and a small data structure describing the thread, in principle this limit should not be a problem. In practice, however, we found that many platforms impose hard limits on the maximum number of pthreads that can be created in a process. Table 2 in Section 4 shows the practical limitations on pthreads on several stock systems.
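To make the "in principle" bound concrete, the arithmetic can be sketched as below. The address-space and stack sizes are assumptions picked purely for illustration (they are not measurements from Table 2), and `max_threads_by_address_space` is a hypothetical helper, not part of any library.

```python
KIB = 1024
MIB = 1024 ** 2
GIB = 1024 ** 3


def max_threads_by_address_space(address_space_bytes, stack_bytes, descriptor_bytes=0):
    """Upper bound on thread count if stacks and descriptors were the only cost."""
    per_thread = stack_bytes + descriptor_bytes
    return address_space_bytes // per_thread


if __name__ == "__main__":
    # Assumed: 3 GiB of usable address space and 8 MiB stacks.
    print(max_threads_by_address_space(3 * GIB, 8 * MIB))   # 384
    # Assumed: same address space with 64 KiB stacks.
    print(max_threads_by_address_space(3 * GIB, 64 * KIB))  # 49152
```

Under these assumed numbers the bound is driven almost entirely by the stack size chosen per thread, which is why the address-space limit alone would not explain a hard cap imposed by the platform.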

In particular, operating system kernels tend to see kernel threads as a special kind of process rather than a unique entity. For example, in the Solaris kernel, threads are called "lightweight processes" (LWPs). Linux actually creates kernel threads using a special variation of fork called "clone," and until recently gave each thread a separate process ID. Because of this heritage, in practice kernel threads tend to be closer in memory and time cost to processes than to user-level threads, although recent work has made some progress in closing the gap, including K42 [5], the Native POSIX Thread Library (NPTL), and the Linux O(1) scheduler.
