Over the last two centuries of human thought, the discovery of biological evolution has profoundly changed -- and continues to change -- our perception of the living world. Evolutionary history is difficult to study directly because evolution operates on timescales far longer than a human lifetime. Computational phylogenetic ancestral reconstruction tackles this problem by inferring evolutionary history from contemporary genomes, using probabilistic Markov models of molecular sequence evolution.
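One way to make such a Markov model concrete is the Jukes-Cantor (JC69) substitution model; the abstract does not commit to a particular model, so the sketch below is illustrative only, with function names of my own choosing. It computes JC69 transition probabilities and the posterior over a single ancestral site given two descendant nucleotides:

```python
import math

def jc69(d):
    """JC69 transition probabilities for a branch of length d
    (expected substitutions per site): returns (p_same, p_diff)."""
    e = math.exp(-4.0 * d / 3.0)
    return 0.25 + 0.75 * e, 0.25 - 0.25 * e

def p(a, b, d):
    """Probability that state a evolves to state b along a branch of length d."""
    same, diff = jc69(d)
    return same if a == b else diff

def ancestral_posterior(x, y, d1, d2, states="ACGT"):
    """Posterior over the ancestral state of one site, given descendant
    states x and y at branch lengths d1 and d2, under the uniform
    JC69 stationary distribution as the prior."""
    weights = {a: 0.25 * p(a, x, d1) * p(a, y, d2) for a in states}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

For example, `ancestral_posterior('A', 'A', 0.1, 0.1)` assigns most of the posterior mass to `A`, as expected when both descendants agree across short branches.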
The design of successful assistive technologies requires careful personalization for the individual user, as well as rapid, low-cost cycles of product development and testing. My research brings two modern software engineering models to bear on these challenges: Personal and Contextual Requirements Engineering (PC-RE) and Agile Software Development. We adapt these models to the domain of assistive mobile navigation for the blind.
The Internet has become a core component of our lives and businesses. Its reliability and availability are of paramount importance. Many types of malware impact the availability of the Internet, including network worms, botnets, and viruses. Detecting such attacks is a critical component of defending against them. This dissertation focuses on detecting and understanding self-propagating network worms, a type of malware with a proven record of disrupting the Internet.
The Internet has become an indispensable resource for today's society. It is at the center of today's business, entertainment, and social world. However, the core of our identities on the Internet, the IP addresses that are used to send and receive data throughout the Internet, are insecure. Attackers today are able to send data purporting to be from nearly any location (IP spoofing), and to reroute data destined for victims to the attackers themselves (IP prefix hijacking). Victims of these attacks may experience denial of service, misplaced blame, and theft of their traffic.
Programmers design, write, and understand programs with a high-level structure in mind. Existing programming languages capture this structure poorly because they must also encode low-level implementation details. To address this problem we introduce Twig, a programming language that allows domain-specific logic to be encoded alongside low-level functionality. Twig is based on a simple, formal calculus that is amenable to both human and machine reasoning. Users may introduce rules that rewrite expressions, allowing for user-defined optimizations.
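The idea of user-defined rewrite rules can be illustrated with a generic term-rewriting sketch; Twig's actual calculus is not shown in the abstract, so the representation below (nested tuples for expressions, rules as functions that either return a rewritten expression or `None`) is purely illustrative:

```python
def rewrite(expr, rules):
    """Apply rewrite rules bottom-up, repeating until no rule fires.
    Expressions are nested tuples such as ('add', x, 0); leaves are
    atoms. Each rule returns a rewritten expression or None."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite(e, rules) for e in expr[1:])
    changed = True
    while changed:
        changed = False
        for rule in rules:
            out = rule(expr)
            if out is not None and out != expr:
                expr = rewrite(out, rules)
                changed = True
    return expr

def add_zero(e):
    """A user-defined optimization: x + 0  =>  x."""
    if isinstance(e, tuple) and e[0] == 'add' and e[2] == 0:
        return e[1]
    return None
```

With this rule installed, `rewrite(('add', ('add', 'x', 0), 0), [add_zero])` simplifies all the way down to `'x'`, showing how a domain-specific identity becomes an automatic optimization.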
Data mining, also referred to as knowledge discovery in databases (KDD), is the non-trivial extraction of implicit, previously unknown, and potentially useful information from data. What counts as "useful" depends on the user as well as on the domain in which the data mining system is deployed. Therefore, the role of domain knowledge in the discovery process is essential. However, previous research has made limited attempts to build data mining systems that are capable of incorporating domain knowledge in a principled manner.
Compilation encompasses many steps. Parsing turns the input program into a more manageable syntax tree. Verification ensures that the program makes some semblance of sense. Finally, code generation transforms the internal abstract program representation into an executable program. Compilers strive to produce the best possible programs, and optimizations are applied at nearly every level of compilation.
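One classic optimization applied to the syntax tree between parsing and code generation is constant folding. The sketch below is a minimal illustration, not any particular compiler's implementation; it uses a hypothetical tuple-based AST with `('+', l, r)` and `('*', l, r)` nodes and integer leaves:

```python
def fold(node):
    """Constant folding over a tiny tuple-based AST.
    Subtrees whose operands are both integer constants are
    evaluated at compile time; everything else is left intact."""
    if not isinstance(node, tuple):
        return node                      # leaf: constant or variable name
    op, l, r = node[0], fold(node[1]), fold(node[2])
    if isinstance(l, int) and isinstance(r, int):
        return l + r if op == '+' else l * r
    return (op, l, r)                    # partially folded subtree
```

For instance, `fold(('+', ('*', 2, 3), 4))` collapses to the constant `10`, while `('+', 'x', ('*', 2, 5))` folds only the constant subtree, yielding `('+', 'x', 10)`.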
Designs of human-computer systems intended for time-critical multitasking can benefit from an understanding of the human factors that support or limit multitasking performance, and a detailed account of the human-machine interactions that unfold in a given task environment. An integrated, computational cognitive model can test and provide such an understanding of the human factors related to multitasking, and reveal the dynamic interactions that occur in the task at the level of hundreds of milliseconds.
Information Extraction is the process of automatically transforming written natural language (i.e., text) into structured information, such as a knowledge base. Because natural language is inherently ambiguous, this transformation process is highly complex. Moreover, as Information Extraction moves from the analysis of scientific documents to Internet textual content, we can no longer rely completely on the assumption that the content of the text is correct.
Machine learning and data mining have provided plenty of tools for extracting knowledge from data. Yet, such knowledge may not be directly applicable to target applications and might need further manipulation: The knowledge might contain too much noise, or the target application may use a different representation or terminology.