Part I Introduction.- 1 Introduction to HPC Operating Systems.
Part II Lightweight Kernels.- 2 Overview: The Birth of Lightweight Kernels.- 3 Sandia Line of LWKs.- 4 Hitachi HI-UX/MPP series.- 5 Blue Gene Line of LWKs.
Part III Unix/Linux based Systems.- 6 Overview: The Rise of Linux.- 7 Cray Compute Node Linux.- 8 SCore.- 9 NEC Earth Simulator and the SX-Aurora TSUBASA.- 10 ZeptoOS.- 11 K Computer.- 12 Argo.
Part IV Multi-Kernels.- 13 A New Age: An Overview of Multi-Kernels.- 14 FusedOS.- 15 Hobbes: A Multi-Kernel Infrastructure for Application Composition.- 16 NIX.- 17 IHK/McKernel.- 18 mOS for HPC.- 19 FFMK: An HPC OS based on the L4Re Microkernel.- 20 HermitCore.
About the Authors: Dr. Balazs Gerofi is a research scientist at the RIKEN Center for Computational Science, where he works on system software research and development for high performance computing. He actively participates in the design and development of the Post-K supercomputer, Japan's next-generation flagship supercomputer after the K Computer. Balazs earned his M.Sc. degree in computer science from the Vrije Universiteit Amsterdam and his Ph.D. degree in computer science from The University of Tokyo. His research interests cover operating systems, high performance computing, cloud computing, and fault-tolerant computing. Balazs is a member of the IEEE Computer Society and the Association for Computing Machinery (ACM).
Dr. Yutaka Ishikawa is the leader of the Post-K computer development project at the RIKEN Center for Computational Science, Japan, which aims to deploy the next Japanese flagship supercomputer around 2021. Ishikawa received his Ph.D. degree in electrical engineering from Keio University. From 1987 to 2001, he was a member of AIST (the former Electrotechnical Laboratory). From 1993 to 2001, he was the chief of the Parallel and Distributed System Software Laboratory at the Real World Computing Partnership. He led the development of the cluster system software called SCore, which was used in several large PC cluster systems around 2004. He was an associate professor at The University of Tokyo from 2002 to 2006 and a professor there from 2006 to 2014. From 2006 to 2008, he co-led a project to design a commodity-based supercomputer called the T2K Open Supercomputer. As a result, three universities, Tsukuba, Tokyo, and Kyoto, obtained their respective supercomputers based on those specifications. From 2010 to 2014, he was also the director of the Information Technology Center at The University of Tokyo. He led the design and implementation of HPCI, the High Performance Computing Infrastructure in Japan, from 2010 to 2012.
Dr. Rolf Riesen is the lead software architect for the multi-operating system (mOS) project at Intel Corp. The mOS team is creating an OS for use in supercomputers and other high-end HPC systems. Rolf has 25 years of experience in researching, developing, and deploying software for massively parallel processors. His career began as a key member of the Sandia National Laboratories and University of New Mexico team that created the lightweight kernel and the Portals message passing interface that broke the teraflops barrier in 1997 with the Intel-powered ASCI Red supercomputer. Over the years, Rolf's code and research ideas have directly contributed to specific systems on the TOP500 list, stretching over a period of almost 20 years, from SUNMOS on an nCUBE 2 to the Catamount OS on the Cray/Sandia Red Storm system. After teaching for 2 years at the University of New Mexico, he joined IBM Research in Dublin, Ireland, where he focused on simulation and fault tolerance for extreme scale systems. Now, at Intel, he is using his expertise to guide a team that combines a lightweight OS kernel with Linux. Rolf has over 50 peer-reviewed publications and is an active member of various program committees. He is also a subject area editor for the journal Parallel Computing.
Dr. Robert W. Wisniewski is an ACM Distinguished Scientist, the chief software architect for Extreme Scale Computing, and a senior principal engineer at Intel Corporation. He is the lead architect for Intel's cohesive and comprehensive software stack that leverages OpenHPC and is responsible for the software for Aurora, the world's largest announced supercomputer. He has published over 74 papers in the areas of high performance computing, computer systems, and system performance, filed over 56 patents, and given over 53 external invited presentations. Before coming to Intel, he was the chief software architect for Blue Gene Research and manager of the Blue Gene and Exascale Research Software Team at the IBM T.J. Watson Research Center. There, he was an IBM master inventor and led the software effort on Blue Gene/Q, which was the fastest machine in the world on the June 2012 TOP500 list and occupied 4 of the top 10 positions.