- Allen Malony (Chair)
- Boyana Norris
- Hank Childs
- Douglas Toomey (Earth Sciences)
- Seyong Lee (Oak Ridge National Laboratory)
From the advent of the message-passing architecture in the early 1980s to the recent dominance of accelerator-based heterogeneous architectures, high-performance computing (HPC) hardware has gone through a series of changes. At the same time, HPC runtime systems have been adapted to harness this growth in computational capabilities. Specifically, modern HPC runtime systems have become active entities capable of making dynamic decisions during the execution of an application. These dynamic decisions can improve performance, reduce energy consumption, and increase the overall utilization of the underlying HPC hardware. However, a runtime system needs insight into the application and the underlying hardware to make efficient decisions. This dissertation identifies information gained from modeling the memory architecture as critical to efficient decision-making within the runtime system. After outlining the research problems associated with dynamic adaptation in HPC runtimes, it explores different modeling approaches for gathering insight into the memory architecture of modern HPC hardware.
This research takes the form of four major projects: (1) application- and machine-agnostic approaches to dynamically adapting the HPX runtime system, (2) modeling memory contention in a heterogeneous system where processors share the same memory, (3) understanding and modeling the interplay between memory access patterns and the modern cache hierarchy to statically predict the memory transactions between the last-level cache (LLC) and system memory of modern Intel processors, and (4) an exploration of the similarities and differences between Intel CPUs and NVIDIA and AMD GPUs to pave the way for modeling LLC-device memory transactions in GPUs.