Linux Journal, Issue 9, January 1995

Unix Systems for Modern Architectures
Curt Schimmel
Addison-Wesley, 1994, ISBN 0-201-63338-8
"What is involved in a multi-processor version of Linux?" has almost become a "Frequently Asked Question" in the Linux newsgroups. The answer is contained in Curt Schimmel's UNIX Systems for Modern Architectures.

Schimmel can speak from experience on this topic. He has worked on Unix systems at AT&T Bell Laboratories and at Silicon Graphics, Inc. and has offered tutorials on symmetric multi-processor Unix systems at USENIX and UKUUG. This book is an outgrowth of those tutorials.

At first glance, the book seems to offer too much detail about hardware for a programmer. But as one proceeds, one sees that understanding the subtleties of hardware-cache-bus-memory interactions is essential to "doing" a kernel for a multi-processor system.

After a brief (17 page) description of Unix processes, the next 130 pages are devoted to discussing uni-processor cache systems. I was surprised and delighted to find how hard it can be to get the right results. Fortunately, folks seem to have got this right on the systems I've used.

With this foundation well established, the remainder of the book deals with the new domain of multi-processor systems.

The keys to any such system are protection of shared data and efficient interprocess communication. Mutual exclusion mechanisms are cast in three forms--short term, medium term, and long term. We are shown how uni-processor implementations of Unix depended on a single-threaded kernel and interrupt masking to protect shared data, and more importantly, we are shown why these methods are inappropriate for a multi-processor system.

Schimmel shows how one can build locks for all three levels of mutual exclusion (and points out where they are needed in a typical Unix kernel). Although the master/slave scheme is straightforward to implement, it has much the flavor (and bottlenecks) of a uni-processor system. The more promising symmetric multi-processor scheme is not as easy to do right. The essence of the problem is finding the right granularity (or size) for the critical sections. Granularity that is either too large or too small can harm system performance. We are shown the analysis that leads to good designs.
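To give a feel for what a short-term lock looks like, here is a minimal sketch of a spin lock written with C11 atomics. It is not taken from the book or from any real kernel; the names (spinlock_t, spin_lock, counter_increment and so on) are hypothetical, and a production lock would also have to deal with interrupts, back-off, and fairness.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; not the book's code or any kernel's API. */
typedef struct {
    atomic_flag held;   /* set while some processor owns the lock */
} spinlock_t;

void spin_init(spinlock_t *l)
{
    atomic_flag_clear(&l->held);
}

/* Try once to take the lock; returns true on success. */
bool spin_trylock(spinlock_t *l)
{
    /* test-and-set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is ours. Suitable only for short critical sections. */
void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        ;   /* spin */
}

void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* One lock guarding one small piece of shared data (fine granularity). */
spinlock_t counter_lock = { ATOMIC_FLAG_INIT };
long counter;

long counter_increment(void)
{
    spin_lock(&counter_lock);
    long value = ++counter;     /* the critical section */
    spin_unlock(&counter_lock);
    return value;
}
```

A single lock around every kernel data structure would be simple but would serialize the processors, while a lock per field would spend most of its time acquiring and releasing. That trade-off is the granularity question the book analyzes.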

The book concludes with more memory access and caching issues, this time with multi-processor systems. Some recent RISC chips have memory models that allow stores and loads to be re-ordered from what the programmer intended in order to gain performance. We are shown the mechanisms these chips provide to force the correct results when implementing locks and accessing data in critical sections. Even when memory requests are issued in the order they were programmed, cache consistency is a serious issue in multi-processor systems. The final chapters of the book cover the interactions that a serious system designer must address.
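The re-ordering problem can be sketched in portable C11 rather than in any particular chip's barrier instructions. In this hedged example (the names payload, ready, publish, and try_consume are hypothetical, not from the book), a release store and an acquire load play the role of the hardware mechanisms: the writer's data store may not move after the flag store, and the reader's data load may not move before the flag load.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; names are not from the book. */
int payload;          /* ordinary shared data */
atomic_bool ready;    /* flag announcing that payload is valid; starts false */

/* Writer: store the data, then publish it. The release ordering keeps
 * the payload store from being re-ordered after the flag store. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: check the flag, then read the data. The acquire ordering keeps
 * the payload load from being re-ordered before the flag load. */
bool try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;   /* nothing published yet */
    *out = payload;
    return true;
}
```

With plain, unordered stores and loads, a reader on another processor of a weakly ordered machine could see the flag set and still read a stale payload.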

This book is written as a textbook, with questions and references at the end of each chapter. Selected questions have answers provided in an appendix. Another appendix summarizes a dozen popular chips found in Unix systems.

Randolph Bentson is a computer science consultant located in Seattle. He has worked with a number of multi-processor Unix systems, including Hewlett-Packard HP-UX, Sequent Dynix, and Denelcor HEP-UPX. (He wrote the single-threaded bootstrap code for the last of these.) His Ph.D. in computer science involved implementing a functional programming language for a multi-processor system. He can be reached at bentson@grieg.holmsjoen.com, his Linux system at home.