We consider the problem of efficient particle advection in a distributed-memory parallel setting, focusing on four popular parallelization algorithms. The performance of each of these algorithms varies with the workload. Our research focuses on two important questions: (1) which parallelization techniques perform best for a given workload, and (2) what are the unsolved problems in parallel particle advection?
In situ visualization is increasingly necessary to address I/O limitations on supercomputers. However, in situ visualization can take multiple forms. In this research we consider two popular forms: in-line and in-transit in situ. With the increasing heterogeneity of supercomputer designs, efficient and cost-effective use of resources is extremely difficult for in situ visualization routines. This difficulty is compounded by these multiple forms and by the unknown performance of visualization algorithms run in situ at large scale.
The advent of cloud computing as a means of offering virtualized computing and storage resources has radically transformed how modern enterprises run their business and has also fundamentally changed how today's large cloud providers operate. For example, as these large cloud providers offer an increasing number of ever-more bandwidth-hungry cloud services, they end up carrying a significant fraction of today's Internet traffic.
Distributed denial-of-service (DDoS) attacks continue to threaten the availability and integrity of critical Internet infrastructure upon which society relies more heavily than ever before. The extremely high volume and distributed nature of modern DDoS attacks render traditional "edge-defense" solutions (either victim-side or attack-source-side) less effective.
Scientific and engineering applications are dominated by linear algebra and depend on scalable solutions of sparse linear systems. For large problems, preconditioned iterative methods are a popular choice. High-performance numerical libraries offer a variety of preconditioned Newton-Krylov methods for solving sparse problems. However, the selection of a well-performing Krylov method remains the user's responsibility.
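As an illustration of what one such Krylov/preconditioner pairing looks like, here is a pure-Python sketch of conjugate gradient (CG) with a Jacobi (diagonal) preconditioner on a tiny symmetric positive-definite system. This is not any particular library's implementation; production libraries (PETSc, Trilinos, and others) offer many such pairings, and choosing among them is the selection problem the abstract refers to.

```python
# Sketch: Jacobi-preconditioned conjugate gradient on a small SPD
# system, in pure Python.  A is a dense list-of-lists here purely
# for readability; real solvers operate on sparse formats.

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pcg(A, b, tol=1e-10, max_iter=200):
    """Conjugate gradient with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]    # Jacobi preconditioner
    x = [0.0] * n
    r = b[:]                                        # residual of x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]      # preconditioned residual
    p = z[:]
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if dot(r, r) ** 0.5 < tol:                  # converged
            break
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x
```

Swapping the preconditioner (e.g., incomplete factorization instead of Jacobi) or the Krylov method (e.g., GMRES for nonsymmetric systems) changes convergence behavior dramatically, which is why the choice matters.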
Power consumption is widely regarded as one of the biggest challenges to reaching the next generation of high performance computing. On future supercomputers, power will be a limited resource. This constraint will affect the performance of both simulation and visualization workloads. Understanding how a particular application behaves under a power limit is critical to making better use of the limited power. In this research, we focus specifically on visualization and analysis applications, which are an important component of HPC.
Sorting and hashing are canonical index-based methods to perform searching, and are prevalent sub-routines in many scientific visualization and data analysis algorithms. With the emergence of highly parallel, many-core architectures, these algorithms must be reformulated to exploit the increased available data parallelism and instruction-level parallelism. Data-parallel primitives (DPP) provide an efficient way to design an algorithm for scalable, platform-portable parallelism. This dissertation proposes the design of platform-portable, index-based search techniques using DPP.
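As a sketch of what such a reformulation looks like, the following runs a batch of lower-bound searches over a sorted array in lockstep: every query performs the same fixed number of rounds, and each round is a uniform "map and gather" over the whole query set. That branch-uniform, fixed-round structure is the kind that maps onto data-parallel primitives; the list comprehensions below stand in for DPP map/gather operations.

```python
# Sketch: lockstep binary search for a batch of queries over sorted
# data.  Each loop iteration is one data-parallel round in which all
# queries compute a midpoint, gather data[mid], and shrink their
# interval independently -- a map/gather pattern suited to DPP.

def lower_bound_batch(data, queries):
    """For each query, index of the first element >= query."""
    n = len(data)
    rounds = max(1, n.bit_length())          # enough rounds to converge
    lo = [0] * len(queries)
    hi = [n] * len(queries)
    for _ in range(rounds):
        mid = [(l + h) // 2 for l, h in zip(lo, hi)]
        # "Gather" step: compare the fetched key against each query.
        go_right = [l < h and data[m] < q
                    for l, h, m, q in zip(lo, hi, mid, queries)]
        # "Map" step: update both interval bounds from the old state.
        lo, hi = ([m + 1 if g else l
                   for g, m, l in zip(go_right, mid, lo)],
                  [m if (not g and l < h) else h
                   for g, m, l, h in zip(go_right, mid, lo, hi)])
    return lo
```

Because every query runs the same number of rounds regardless of where it lands, the work maps cleanly onto wide SIMD/SIMT hardware, unlike an early-exit scalar binary search.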
In the past few years, evaluation on adversarial examples has become a standard procedure for measuring the robustness of deep learning models. The literature on adversarial examples for neural networks has largely focused on image data, which are represented as points in continuous space. However, a vast proportion of machine learning models operate on discrete input, and thus demand a similar rigor in understanding their vulnerabilities and robustness. We study the robustness of neural network architectures for textual and graph inputs through the lens of adversarial input perturbations.
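To illustrate what a discrete input perturbation looks like (as opposed to the continuous pixel nudges used on images), here is a toy greedy word-substitution attack against a hypothetical bag-of-words sentiment scorer. The weight table, synonym table, and classifier are all stand-in assumptions, not any model from the dissertation; real attacks constrain substitutions to preserve semantics and fluency.

```python
# Toy discrete adversarial attack: greedily substitute words to flip
# a hypothetical bag-of-words sentiment score.  Positive score means
# "positive" label; the attack tries to drive it to <= 0.

WEIGHTS = {"great": 2.0, "good": 1.0, "fine": 0.2,
           "bad": -1.0, "awful": -2.0}            # assumed toy model
SYNONYMS = {"great": ["good", "fine"], "good": ["fine"]}  # assumed table

def score(tokens):
    return sum(WEIGHTS.get(t, 0.0) for t in tokens)

def greedy_attack(tokens, max_edits=3):
    """Apply the single substitution that lowers the score most,
    until the predicted label flips or the edit budget is spent."""
    tokens = list(tokens)
    for _ in range(max_edits):
        if score(tokens) <= 0:                    # label already flipped
            break
        best = None                               # (drop, index, word)
        for i, t in enumerate(tokens):
            for s in SYNONYMS.get(t, []):
                cand = tokens[:i] + [s] + tokens[i + 1:]
                drop = score(tokens) - score(cand)
                if drop > 0 and (best is None or drop > best[0]):
                    best = (drop, i, s)
        if best is None:                          # no improving edit
            break
        _, i, s = best
        tokens[i] = s
    return tokens
```

The search space here is combinatorial rather than a continuous gradient direction, which is why attacks and defenses for text and graphs need different machinery than the image case.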
This thesis makes two major contributions: it introduces a novel method for analysis of artificial neural networks and provides new models of the nematode Caenorhabditis elegans nervous system. The analysis method extracts neural network motifs, or subnetworks of recurring neuronal function, from optimized neural networks. The method first creates models for each neuron relating network stimulus to neuronal response, then clusters the model parameters, and finally combines the neurons into multi-neuron motifs based on their cluster category.
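The clustering stage of this pipeline can be sketched as follows. The per-neuron model fitting is assumed to have already happened; each neuron is represented by a hypothetical 2-D parameter vector (e.g., a fitted gain and threshold), those vectors are clustered with plain Lloyd's k-means, and neurons are then grouped by cluster label as candidate motif members. The specific model parameters and the choice of k-means are illustrative assumptions, not the thesis's exact method.

```python
# Sketch of the clustering stage of motif extraction: cluster each
# neuron's fitted model parameters, then group neurons by cluster
# label.  The 2-D (gain, threshold) vectors are assumed inputs.

import random

def dist2(p, q):
    """Squared Euclidean distance between parameter vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns a cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center for each neuron's parameters.
        labels = [min(range(k), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: recenter each cluster at its members' mean.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(xs) / len(members)
                                   for xs in zip(*members))
    return labels

def group_by_label(neuron_ids, labels):
    """Group neuron ids sharing a cluster label (motif candidates)."""
    groups = {}
    for nid, label in zip(neuron_ids, labels):
        groups.setdefault(label, []).append(nid)
    return groups
```

In the full method, these per-cluster groups would then be combined with connectivity information to assemble multi-neuron motifs.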