# Directed Research Project

## Evaluating the Efficacy of Wavelet Compression for Turbulent-Flow Data

We explore the ramifications of using wavelet compression on turbulent-flow data from scientific simulations. As upcoming I/O constraints may significantly hamper the ability of scientific simulations to write full-resolution data to disk, we feel this study enhances the understanding of exascale scientists with respect to potentially applying wavelets in situ. Our approach repeats existing analyses with wavelet-compressed data, using quantitative evaluations. The data sets we select are large, including one with a 4,096 cubed grid.
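
To make the idea of lossy wavelet compression concrete, the following is a minimal sketch on a one-dimensional signal. The abstract does not say which wavelet family or thresholding rule the study uses; the orthonormal Haar basis, the keep-the-largest-coefficients rule, and the function names `haar_forward`, `haar_inverse`, and `keep_largest` are all assumptions chosen for brevity.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """Multi-level orthonormal Haar transform (length must be a power of two)."""
    out = [float(x) for x in signal]
    n = len(out)
    while n > 1:
        half = n // 2
        # Pairwise scaled sums (coarse part) and differences (detail part).
        avg = [(out[2 * i] + out[2 * i + 1]) / SQRT2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / SQRT2 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def haar_inverse(coeffs):
    """Invert haar_forward by undoing the levels from coarsest to finest."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / SQRT2)
            merged.append((a - d) / SQRT2)
        out[:2 * n] = merged
        n *= 2
    return out


def keep_largest(coeffs, keep):
    """Zero every coefficient except the `keep` largest in magnitude."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


# Hypothetical sample: a piecewise-constant signal compresses very well.
signal = [4, 4, 4, 4, 1, 1, 1, 1]
approx = haar_inverse(keep_largest(haar_forward(signal), keep=2))
max_error = max(abs(a - s) for a, s in zip(approx, signal))
```

Repeating an analysis on `approx` instead of `signal`, and comparing the two results quantitatively, mirrors the evaluation strategy described above on a toy scale.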

## BGPInspector: A Real-time Extensible Border Gateway Protocol Monitoring Framework

The Internet often experiences disruptions that affect its overall performance. Disruptive events include global-scale incidents such as large-scale power outages, undersea cable cuts, and Internet worms. They also include IP-prefix-level anomalies such as prefix hijacking and route leaks. All such events can cause the Internet to deviate from its normal state of operation. It is therefore important to monitor and detect abnormal events, and to do so at both granularities. Current solutions mostly focus on detecting certain types of events or anomalies and ignore the others.

## Programming in Second Order Classical Sequent Calculus

Two important lines of work take advantage of the correspondence between logic and programming languages. Languages based on classical sequent calculus are used to study control effects, evaluation strategies, and duality. At the same time, higher-order logic is used to model generic programming and abstraction. We combine these two approaches. The calculus $\mu \tilde \mu_S^2$ is defined with a type system corresponding to second-order classical sequent calculus, for which we prove type safety and strong normalization for multiple evaluation strategies.

## Examining the Automated Inference of Tweet Topics

The increasing volume of information exchanged over online social networks (e.g., Twitter, Facebook) has led to growing interest in recent years in techniques for automatically inferring the topic of individual posts or tweets. The short length of tweets, the lack of a well-defined set of topics, and the use of acronyms are some of the reasons that make topic inference for tweets challenging.

## Identifying Optimization Opportunities within Kernel Execution in GPU Architectures

Tuning codes for GPGPU architectures is challenging because few performance tools can pinpoint the exact causes of execution bottlenecks. While profiling applications can reveal execution behavior on a particular architecture, the abundance of collected information can also overwhelm the user. Moreover, performance counters provide cumulative values but do not attribute events to code regions, which makes identifying performance hot spots difficult.

## A Visualization Pipeline for Large-Scale Tractography Data

We present a novel methodology for clustering and visualizing large-scale tractography data sets. Tractography data sets are very large, containing up to hundreds of millions of tracts, which makes visualizing and understanding the data very difficult. Our method reduces and simplifies the data to create coherent groupings and visualizations. Our input is a collection of tracts, from which we derive metrics and perform a k-means++ clustering.
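
As an illustration of the clustering step, the following is a minimal sketch of k-means++ seeding followed by a nearest-center assignment. The abstract does not specify which metrics are derived from each tract, so the two-component feature vectors below are hypothetical, as are the function names `kmeans_pp_seeds` and `assign`. The sketch assumes the input contains at least `k` distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers using k-means++ seeding.

    The first center is uniform at random; each later center is drawn with
    probability proportional to its squared distance from the nearest
    center chosen so far.
    """
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Return, for each point, the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        for p in points
    ]


# Hypothetical per-tract metrics, e.g. (length, mean curvature).
features = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (10.0, 10.0), (10.1, 10.0), (10.0, 10.1),
]
seeds = kmeans_pp_seeds(features, k=2, rng=random.Random(0))
labels = assign(features, seeds)
```

A full k-means++ clustering would continue with the usual alternating update and assignment steps until the labels stop changing; only the seeding differs from plain k-means.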

## Exploring Tradeoffs Between Power and Performance for a Scientific Visualization Algorithm

Power is becoming a major design constraint in the world of high-performance computing (HPC). This constraint affects the hardware being considered for future architectures, the ways it will run software, and the design of the software itself. Within this context, we explore tradeoffs between power and performance. Visualization algorithms themselves merit special consideration, since they are more data-intensive in nature than traditional HPC programs like simulation codes. This data-intensive property enables different approaches for optimizing power usage.

## External Facelist Calculation with Data Parallel Primitives

External facelist calculation on three-dimensional unstructured meshes is used in scientific visualization libraries to efficiently render the results of operations such as clipping, interval volumes, and material boundaries. With this study, we consider the external facelist algorithm on many-core architectures. We introduce four different approaches: three based on hashing and one based on sorting. Each of the algorithms consists entirely of data-parallel primitive operations, in an effort to achieve portable performance across different architectures.
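
The core idea behind both families of approaches can be sketched serially: a face is external exactly when it belongs to a single cell. The sketch below is not the authors' implementation and does not use data-parallel primitives; it is restricted to tetrahedral cells, and the function names `external_faces_hash` and `external_faces_sort` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def canonical_faces(tets):
    """Yield every triangular face of every tetrahedron as a sorted vertex-id triple."""
    for tet in tets:
        for face in combinations(tet, 3):
            yield tuple(sorted(face))


def external_faces_hash(tets):
    """Hashing variant: count occurrences of each face, keep those seen once."""
    counts = Counter(canonical_faces(tets))
    return sorted(face for face, n in counts.items() if n == 1)


def external_faces_sort(tets):
    """Sorting variant: sort all faces, then keep runs of length one."""
    faces = sorted(canonical_faces(tets))
    external = []
    i = 0
    while i < len(faces):
        j = i
        while j < len(faces) and faces[j] == faces[i]:
            j += 1
        if j - i == 1:  # a face with no duplicate has no neighboring cell
            external.append(faces[i])
        i = j
    return external


# Two tetrahedra glued along the face with vertices 1, 2, 3.
tets = [(0, 1, 2, 3), (1, 2, 3, 4)]
by_hash = external_faces_hash(tets)
by_sort = external_faces_sort(tets)
```

In a data-parallel setting, the counting and run-detection steps would presumably be expressed with primitives such as map, sort, and reduce-by-key rather than explicit loops.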

## Automated Selection of Numerical Solvers

Many complex problems rely on scientific and engineering computing for solutions. High-performance computing depends heavily on linear algebra for large-scale data analysis, modeling and simulation, and other applied problems. Linear algebra provides the building blocks for a wide variety of scientific and engineering simulation codes. Sparse linear system solution often dominates the execution time of such applications, prompting the ongoing development of highly optimized iterative algorithms and high-performance parallel implementations.

## Avatar Idealization in Video Games

Video games give players the freedom to become another character with a different physical appearance and personality when in a virtual world. Do gamers choose a character they find similar to themselves or their ideal selves? Do identity and personality factors like gender and self-esteem affect this choice? We surveyed 570 players of the popular first-person shooter game Team Fortress 2 on demographics and game preferences, and asked them to describe their actual self, their ideal self, and their chosen avatar with a set list of adjectives.