Describing current research in performance science and engineering, this book helps real-world users of parallel computer systems better understand both the performance vagaries that arise in scientific applications and the practical means of improving performance. Notable experts in the field cover performance monitoring, performance analysis, performance modeling, automatic performance tuning, and application tuning. The book also provides an overview of modern computer architecture and includes examples from solid mechanics, astrophysics, quantum chromodynamics, molecular dynamics, and environmental science.
David Bailey is a chief technologist in the High Performance Computational Research Department at the Lawrence Berkeley National Laboratory. Dr. Bailey has published several books and numerous research studies on computational and experimental mathematics. He has received the ACM Gordon Bell Prize, the IEEE Sidney Fernbach Award, and the MAA Chauvenet and Merten Hasse Prizes.
Robert Lucas is the director of computational sciences in the Information Sciences Institute and a research associate professor in computer science in the Viterbi School of Engineering at the University of Southern California. Dr. Lucas has many years of experience working with high-end defense, national intelligence, and energy applications and simulations. His linear solvers are the computational kernels of electrical and mechanical CAD tools.
Samuel Williams is a researcher in the Future Technologies Group at the Lawrence Berkeley National Laboratory. Dr. Williams has authored or co-authored thirty technical papers, including several award-winning papers. His research interests include high-performance computing, auto-tuning, computer architecture, performance modeling, and VLSI.
Introduction. Parallel Computer Architecture. Software Interfaces to Hardware Counters. Measurement and Analysis of Parallel Program Performance using TAU and HPCToolkit. Trace-Based Tools. Large-Scale Numerical Simulations on High-End Computational Platforms. Performance Modeling: The Convolution Approach. Analytic Modeling for Memory Access Patterns Based on Apex-MAP. The Roofline Model. End-to-End Auto-Tuning with Active Harmony. Languages and Compilers for Auto-Tuning. Empirical Performance Tuning of Dense Linear Algebra Software. Auto-Tuning Memory-Intensive Kernels for Multicore. Flexible Tools Supporting a Scalable First-Principles MD Code. The Community Climate System Model. Tuning an Electronic Structure Code. Bibliography. Index.