Extreme Scaling and Performance across Diverse Architectures with Salman Habib and Rajeev Thakur

This TechTalk will provide an introduction to basic issues in performance and scalability for large-scale applications on parallel supercomputers.

Because future supercomputing architectures, on the way to exascale, will follow diverse paths, it is important to articulate principles for designing applications that can run at extreme scale and high performance on a variety of systems. In the near term, automated solutions are unlikely to provide more than incremental improvements, although they remain an important research direction. Developers of new applications should therefore be cognizant of future hardware possibilities and follow a design approach that allows for rapid deployment and optimization, including the use of multiple algorithms targeted to different architectures. This approach will be presented using a concrete example, the Hardware/Hybrid Accelerated Cosmology Code (HACC) framework. HACC runs efficiently on all current supercomputer architectures and scales to the largest of today’s systems.


Why code ‘ports’ are difficult across architectures, if performance is a major criterion

Differences in future architectures and what they mean for code design and optimization

The key role played by understanding of the domain science as well as the algorithmic implementation

The importance of layering the code to allow for easy implementation across architectures

Salman Habib and Rajeev Thakur

Salman Habib, Argonne National Laboratory and the University of Chicago

Salman Habib holds a joint appointment in the High Energy Physics and Mathematics and Computer Science Divisions at Argonne National Laboratory, and is a senior member of the Kavli Institute for Cosmological Physics and senior fellow of the Computation Institute at the University of Chicago. He received his Ph.D. in physics from the University of Maryland in 1988. After 20 years at Los Alamos National Laboratory, he moved to Argonne in 2011. His research has covered a wide range of interests, listed here from large to small scales: cosmology, astrophysics, accelerator physics, condensed matter physics, atomic and quantum optics, and particle physics. For the last two and a half decades, Habib has been very interested in the intelligent application of parallel supercomputers to physics problems, which has led to algorithm and code development in a variety of fields and on a variety of platforms. Habib is a member of the APS, AMS, IEEE/CS, and the ACM.

Rajeev Thakur, Argonne National Laboratory; SIGHPC

Rajeev Thakur is the Deputy Director of the Mathematics and Computer Science Division at Argonne National Laboratory, where he is also a Senior Computer Scientist. He is a Senior Fellow in the Computation Institute at the University of Chicago and an Adjunct Professor in the Department of Electrical Engineering and Computer Science at Northwestern University. He received a Ph.D. in Computer Engineering from Syracuse University. His research interests are in high-performance computing in general, and in particular parallel programming models, runtime systems, communication libraries, and scalable parallel I/O. He is a co-author of the book Using MPI-2: Advanced Features of the Message-Passing Interface, which has also been translated into Japanese, and of the upcoming book Using Advanced MPI: Modern Features of the Message-Passing Interface, to be published by MIT Press. He serves on the Steering Committee of ACM SIGHPC as an elected Member-At-Large and was Technical Program Chair of the SC12 conference.