Directed Research Project

Learning Electronic Health Records through Hyperbolic Embedding of Medical Ontologies

Unplanned intensive care unit (ICU) readmissions and in-hospital mortality are two important metrics for evaluating the quality of hospital care. Identifying patients at higher risk of ICU readmission or of mortality can not only protect those patients from potential dangers, but also reduce the high costs of healthcare. In this work, we propose a new method that incorporates information from patients' Electronic Health Records (EHRs) and uses hyperbolic embeddings of a medical ontology (i.e., ICD-9) in the prediction model.
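As an illustration of why hyperbolic space suits a hierarchy such as ICD-9, the sketch below computes geodesic distance in the Poincaré ball, one common model of hyperbolic space. The abstract does not say which model or library the project uses, so the choice of the Poincaré ball and the name `poincare_distance` are assumptions made for this example only.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit ball.

    Assumption: the Poincare ball model; the abstract does not name a model.
    """
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    # Squared Euclidean distance between the two points.
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(arg)


if __name__ == "__main__":
    # The same angular separation costs far more distance near the boundary.
    print(poincare_distance((0.1, 0.0), (0.0, 0.1)))
    print(poincare_distance((0.9, 0.0), (0.0, 0.9)))
```

Distances grow quickly as points approach the boundary of the ball, which is what leaves room for the many leaf codes of a tree-like ontology to sit far apart from one another while staying close to their shared ancestors.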

Cedar: A Reconfigurable Data Plane Telemetry System

Modern network telemetry systems rely on programmable switches to perform the required operations within the data plane in order to scale with the rate of network traffic. These systems create a stream processing pipeline for all telemetry operations and statically map a subset of operations to switch resources. There are two inherent restrictions in this approach: first, the fraction of operations in the data plane decreases with the number and complexity of telemetry tasks.

Scheduling DDoS Cloud Scrubbing in ISP Networks via Randomized Online Auctions

While both Internet Service Providers (ISPs) and third-party Security Service Providers (SSPs) offer Distributed Denial-of-Service (DDoS) mitigation services through cloud-based scrubbing centers, it is often beneficial for ISPs to outsource part of the traffic scrubbing to SSPs to achieve lower economic cost and better network performance. To explore this potential, we design an online auction mechanism whose central challenge is the switching cost of using different winning bids over time.

Exploiting the Matching Information in the Support Set for Few Shot Event Classification

Existing event classification (EC) work primarily focuses on the traditional supervised learning setting, in which models are unable to extract event mentions of new or unseen event types. Few-shot learning has not been investigated in this area, although it enables EC models to extend their operation to unobserved event types. To fill this gap, we investigate event classification under the few-shot learning setting. We propose a novel training method for this problem that extensively exploits the support set during the training process of a few-shot learning model.

Improving Cross-Domain Performance for Relation Extraction via Dependency Prediction and Information Flow Control

Relation Extraction (RE) is one of the fundamental tasks in Information Extraction and Natural Language Processing. Dependency trees have been shown to be a very useful source of information for this task. Current deep learning models for relation extraction have mainly exploited this dependency information by guiding their computation along the structures of the dependency trees. One potential problem with this approach is that it might prevent the models from capturing important context information beyond syntactic structures and cause poor cross-domain generalization.

Towards Intelligent Defense against Application-Layer DDoS with Reinforcement Learning

Application-layer distributed denial-of-service (L7 DDoS) attacks, which exploit application-layer requests to overwhelm functions or components of victim servers, have become a major and growing threat to today’s Internet. However, because the traffic from an L7 DDoS attack appears entirely legitimate at the transport and network layers, it is difficult to detect and defend against such an attack with traditional DDoS solutions.

Exploiting Domain Structure with Hybrid Generative-Discriminative Models

Machine learning methods often face a tradeoff between the accuracy of discriminative models and the lower sample complexity of their generative counterparts. This inspires a need for hybrid methods. We present the graphical ensemble classifier (GEC), a novel combination of logistic regression and naive Bayes. By partitioning the feature space based on known independence structure, GEC is able to handle datasets with a diverse set of features and achieve higher accuracy than a purely discriminative model from less training data.

Exploring Codata: The Relation to Object-Orientation

Functional languages are known to enjoy an elegant connection to logic: lambda-calculus corresponds to natural deduction. Unfortunately, the same cannot be said for object-oriented languages. Type systems have been designed to capture all the fancy features present in current object-oriented languages. We believe, however, that the logical foundation of object-orientation has not yet been fully explored. Our goal is to describe how objects arise naturally in logic.

Efficient Point Merging Using Data Parallel Techniques

We study the problem of merging three-dimensional points that are nearby or coincident.

We introduce a fast, efficient approach that uses data parallel techniques for execution in various shared-memory environments. Our technique incorporates a heuristic for efficiently clustering spatially close points together, which is one reason our method performs well against other methods.
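To make the problem concrete, here is a minimal serial sketch of merging nearby three-dimensional points by bucketing them into a uniform grid. The abstract does not describe its clustering heuristic or its data parallel primitives, so this grid approach is a stand-in rather than the authors' method, and the names `merge_points` and `cell_size` are hypothetical.

```python
import math
from collections import defaultdict


def merge_points(points, cell_size):
    """Merge 3D points that fall in the same grid cell.

    Returns one centroid per occupied cell, ordered by cell index.
    Illustrative stand-in only: not the heuristic from the abstract.
    """
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer cell index along each axis; floor handles negatives correctly.
        key = (
            math.floor(x / cell_size),
            math.floor(y / cell_size),
            math.floor(z / cell_size),
        )
        buckets[key].append((x, y, z))
    merged = []
    for key in sorted(buckets):
        group = buckets[key]
        n = len(group)
        # Replace every point in the cell with the cell's centroid.
        merged.append(tuple(sum(axis) / n for axis in zip(*group)))
    return merged


if __name__ == "__main__":
    pts = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (5.0, 5.0, 5.0)]
    print(merge_points(pts, cell_size=1.0))
```

A known limitation of this simple grid is that two points lying just either side of a cell boundary are not merged even when they are very close, so a practical method needs some extra handling for neighboring cells.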

Understanding the Impact of Dynamic Power Capping on Application Progress

Electrical power has become an important design constraint in high-performance computing (HPC) systems. On future HPC machines, power is likely to be a budgeted resource and thus managed dynamically. Power management software needs to reliably measure application performance at runtime in order to respond effectively to changes in application behavior. Execution time tells us little about how the science in the application is progressing toward an application-defined end goal.

