A Threshold-Based Analysis of the Code Quality of High-Performance Computing Software

Bosco Ndemeye
Date and time: 
Fri, Dec 3 2021 - 2:00pm
Location: 
Zoom
Speaker(s):
Bosco Ndemeye
University of Oregon
Host/Committee: 
  • Boyana Norris (Chair)
  • Michal Young
  • Stephen Fickas

Many popular metrics used to quantify the quality or complexity of a codebase (e.g., cyclomatic complexity) were developed in the 1970s or 1980s, when source code sizes were significantly smaller than they are today and before a number of modern programming language features were introduced. Thus, many of the thresholds that researchers suggested for deciding whether a given function is lacking in a given quality dimension need to be updated. In pursuit of this goal, we study a number of open-source high-performance codes, each of which has been in development for more than 15 years (a characteristic which we take to imply good design), in order to score them in terms of their source code quality and to relax the above-mentioned thresholds. First, we employ the LLVM/Clang compiler infrastructure and introduce a Clang AST tool to gather AST-based metrics, as well as an LLVM IR pass for those metrics based on a source code's static call graph. Second, we perform statistical analysis to identify reference thresholds for 22 code quality and call-graph-related metrics at a fine-grained level.
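
As an illustration of what deriving reference thresholds from measured data can look like, the sketch below takes per-function values of one metric (for example, cyclomatic complexity) and reads thresholds off the empirical distribution. This is only a minimal sketch under stated assumptions: the abstract does not say which statistical method is used, so the percentile approach, the 70/80/90 cut points, and the names `reference_thresholds` and `classify` are all hypothetical and not taken from the talk.

```python
from statistics import quantiles


def reference_thresholds(values, percentiles=(70, 80, 90)):
    """Map each requested percentile to the metric value at that percentile.

    `values` holds one metric value per function. The percentile choice is
    an assumption made for illustration, not the method from the talk.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(values, n=100, method="inclusive")
    return {p: cuts[p - 1] for p in percentiles}


def classify(value, thresholds):
    """Label one function's metric value relative to three thresholds."""
    low, mid, high = (thresholds[p] for p in sorted(thresholds))
    if value <= low:
        return "low"
    if value <= mid:
        return "moderate"
    if value <= high:
        return "high"
    return "very high"
```

With such a helper, the thresholds would be computed once per metric over all functions in the studied codebases, and `classify` would then place an individual function relative to that reference distribution.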