Scalable Observation, Analysis, and Tuning for Parallel Portability in HPC

Chad Wood
Date and time: Thu, Dec 9 2021 - 3:00pm
University of Oregon
  • Allen Malony, Chair
  • Hank Childs
  • Boyana Norris
  • Stephanie Majewski, Physics

For general productivity, it is desirable that high-performance computing (HPC) applications be portable to new architectures and optimizable for new workflows and input types without costly code interventions or algorithmic rewrites. Parallel portability programming models offer the potential for both high performance and productivity; however, they come with a multitude of runtime parameters that can significantly affect execution performance. Selecting the set of those parameters that lets HPC applications perform well in different system environments and on different input data sets is non-trivial.

This dissertation maps out a vision for addressing this parallel portability challenge, and then demonstrates the plan through an effective combination of observability, analysis, and in situ machine learning techniques. It first investigates a platform for general-purpose observation in HPC contexts, then describes support for using that platform in human-in-the-loop performance understanding and analysis, and concludes by applying the lessons learned to provide online automated tuning of HPC applications that use parallel portability frameworks.