The significant increase in the scale and complexity of networked systems, from online retail platforms to computer networks, together with progress in machine learning techniques supported by the rapid development of software and hardware components, creates a unique dynamic. There is a pull from networked systems for automated and scalable methods to handle the challenges of managing, scheduling, and monitoring such complex systems, while there is a push from the machine learning side to solve exactly these problems.
Exploratory visualization and analysis of time-dependent vector fields, or flow fields, generated by large scientific simulations is increasingly challenging on modern supercomputers. Traditional time-dependent flow visualization uses an Eulerian representation of the vector field and requires both high spatial and high temporal resolution to be accurate.
We consider the problem of efficient particle advection in a distributed-memory parallel setting, focusing on four popular parallelization algorithms. The performance of each of these algorithms varies with the workload. Our research focuses on two important questions: (1) which parallelization technique performs best for a given workload, and (2) what are the unsolved problems in parallel particle advection?
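Whatever the parallelization strategy, the per-particle kernel is a numerical integration of positions through the vector field. The sketch below is an illustrative serial building block, not one of the four parallel algorithms studied here: a midpoint (RK2) advection loop through an assumed analytic rotation field, with the function names `advect_rk2` and `rotation` being hypothetical.

```python
import numpy as np

def advect_rk2(seed, velocity, dt, steps):
    """Trace a particle from `seed` through `velocity` using the
    midpoint (second-order Runge-Kutta) method."""
    path = [np.asarray(seed, dtype=float)]
    for _ in range(steps):
        p = path[-1]
        k1 = velocity(p)              # slope at current position
        k2 = velocity(p + 0.5 * dt * k1)  # slope at the midpoint
        path.append(p + dt * k2)
    return np.array(path)

def rotation(p):
    """Analytic test field: rigid rotation about the origin."""
    return np.array([-p[1], p[0]])

# 628 steps of dt=0.01 is roughly one full revolution (2*pi).
path = advect_rk2([1.0, 0.0], rotation, dt=0.01, steps=628)
```

In a distributed-memory setting, the design question is how to partition work around this kernel: over particles, over data blocks, or a hybrid, each with different communication and load-balance trade-offs.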
In situ visualization is increasingly necessary to address I/O limitations on supercomputers. However, in situ visualization can take multiple forms; in this research we consider two popular ones: in-line and in-transit. With the increasing heterogeneity of supercomputer designs, efficient and cost-effective use of resources is extremely difficult for in situ visualization routines. This difficulty is further compounded by in situ's multiple forms and by the unknown performance of various visualization algorithms when executed in situ at large scale.
The advent of cloud computing as a means of offering virtualized computing and storage resources has radically transformed how modern enterprises run their business and has also fundamentally changed how today's large cloud providers operate. For example, as these large cloud providers offer an increasing number of ever-more bandwidth-hungry cloud services, they end up carrying a significant fraction of today's Internet traffic.
Distributed denial-of-service (DDoS) attacks continue to threaten the availability and integrity of critical Internet infrastructure upon which society relies more heavily than ever before. The extremely high volume and distributed nature of modern DDoS attacks render traditional "edge-defense" solutions (either victim-side or attack-source-side) less effective.
Scientific and engineering applications are dominated by linear algebra and depend on scalable solutions of sparse linear systems. For large problems, preconditioned iterative methods are a popular choice. High-performance numerical libraries offer a variety of preconditioned Newton-Krylov methods for solving sparse problems. However, selecting a well-performing Krylov method remains the user's responsibility.
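To make the user's choice concrete, here is a minimal sketch of one such preconditioned Krylov solve using SciPy: GMRES with an incomplete-LU preconditioner on an assumed 1-D Poisson test matrix. The test system and tolerances are illustrative assumptions, not taken from any particular library discussed here.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Illustrative sparse system: 1-D Poisson (tridiagonal) matrix.
n = 200
A = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Incomplete-LU factorization wrapped as a LinearOperator
# so GMRES can apply it as a preconditioner.
ilu = spla.spilu(A)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)

x, info = spla.gmres(A, b, M=M)  # info == 0 signals convergence
```

Swapping GMRES for BiCGSTAB or CG, or ILU for a different preconditioner, can change performance dramatically depending on the matrix; this is precisely the selection burden the abstract refers to.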
Power consumption is widely regarded as one of the biggest challenges to reaching the next generation of high performance computing. On future supercomputers, power will be a limited resource, and this constraint will affect the performance of both simulation and visualization workloads. Understanding how a particular application behaves under a power limit is critical to making better use of the limited power. In this research, we focus specifically on visualization and analysis applications, which are an important component of HPC workloads.
Sorting and hashing are canonical index-based methods for searching, and are prevalent subroutines in many scientific visualization and data analysis algorithms. With the emergence of highly parallel, many-core architectures, these algorithms must be reformulated to exploit the increased available data parallelism and instruction-level parallelism. Data-parallel primitives (DPPs) provide an efficient way to design an algorithm for scalable, platform-portable parallelism. This dissertation proposes the design of platform-portable, index-based search techniques using DPPs.
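To illustrate the flavor of a DPP-style index-based search, the sketch below runs many binary searches in lock step over a sorted index using only bulk array operations (comparable to map/scatter primitives), here expressed with NumPy. The function name `dpp_lower_bound` and the test data are illustrative assumptions, not the dissertation's actual design.

```python
import numpy as np

def dpp_lower_bound(sorted_keys, queries):
    """Data-parallel lower-bound search: every query advances one
    binary-search step per iteration via whole-array operations."""
    lo = np.zeros(len(queries), dtype=np.int64)
    hi = np.full(len(queries), len(sorted_keys), dtype=np.int64)
    while np.any(lo < hi):
        active = lo < hi
        mid = np.where(active, (lo + hi) // 2, 0)
        go_right = sorted_keys[mid] < queries
        # Only active lanes update their search interval.
        lo = np.where(active & go_right, mid + 1, lo)
        hi = np.where(active & ~go_right, mid, hi)
    return lo

keys = np.array([2, 3, 5, 7, 11, 13])
idx = dpp_lower_bound(keys, np.array([5, 6, 1, 14]))
```

Because every step is a bulk map over all queries, the same formulation ports across CPUs and GPUs wherever the underlying primitives are implemented, which is the platform-portability argument behind DPP designs.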