Recent Publications

See Google Scholar for a complete publication list.

An FPGA Accelerator for Genome Variant Calling

In genome analysis, it is often important to identify variants relative to a reference genome. However, identifying variants that occur with low frequency can be challenging, as it is computationally intensive to do so accurately. LoFreq is a widely used program that is adept at identifying low-frequency variants. This paper presents an FPGA-based accelerator for LoFreq. The accelerator targets virus analysis, which is more challenging than human genome analysis because the characteristics of the data to be analyzed are fundamentally different. It achieves speedups of up to 120X on the core computation of LoFreq and up to 32.4X across the entire program.

Understanding Transparent Superpage Management

Superpages (2 MB pages) can reduce the address translation overhead for large-memory workloads in modern computer systems. We clearly outline the sequence of events in the life of a superpage and explore the design space of when and how to trigger and respond to those events. We provide a framework that enables better understanding of superpage management and the trade-offs involved in different design decisions. Quicksilver, our novel superpage management system, is designed around the insights obtained from this framework.

Virtflex: Automatic Adaptation to NUMA Topology Change for OpenMP Applications

Advances in PCI-Express and optical interconnects are making rack-scale computers possible, but these computers will undoubtedly exhibit Non-Uniform Memory Access (NUMA) latencies. Ideally, a hypervisor for rack-scale computers should be able to dynamically reconfigure a virtual machine’s processing and memory resources, i.e., its NUMA topology, to satisfy each application’s evolving demands. Unfortunately, current hypervisors lack support for such dynamic reconfiguration. To that end, this paper introduces Virtflex, a multilayered system for enabling unmodified OpenMP applications to adapt automatically to NUMA topology changes. Virtflex provides a novel NUMA page placement reset mechanism within the guest OS and a novel NUMA-aware superpage ballooning mechanism that spans the guest OS-hypervisor boundary. The evaluation shows that Virtflex enables applications to adapt efficiently to NUMA topology changes. For example, adding resources incurs an average runtime overhead of only 7.27%.

A Comprehensive Analysis of Superpage Management Mechanisms and Policies

Superpages (2 MB pages) can reduce the address translation overhead for large-memory workloads in modern computer systems. This paper clearly outlines the sequence of events in the life of a superpage and explores the design space of when and how to trigger and respond to those events. This provides a framework that enables better understanding of superpage management and the trade-offs involved in different design decisions. Under this framework, this paper discusses why state-of-the-art designs exhibit different performance characteristics in terms of runtime, latency and memory consumption. This paper illuminates the root causes of latency spikes and memory bloat and introduces Quicksilver, a novel superpage management design that addresses these issues while maintaining address translation performance.

Compigorithm: An Interactive Tool for Guided Practice of Complexity Analysis

It is essential that students learn to write code that is not only correct, but also efficient. To that end, algorithmic complexity analysis techniques, such as Big-O analysis, are typically an important part of courses on algorithm design. However, students often hold fundamental misconceptions about how Big-O analysis works. This paper presents Compigorithm, an interactive tool for helping students practice Big-O analysis. Compigorithm scaffolds student learning by breaking down the analysis process into five concrete steps and walking students through each of these steps. When students make mistakes, they are provided with automated hints and allowed to re-attempt until they get the correct answer. Compigorithm was piloted in an introductory algorithms course and evaluated using a controlled experiment. The experimental group trained by analyzing algorithms using Compigorithm, while the control group analyzed the same algorithms by hand. On the subsequent post-test, the experimental group outperformed the control group by a significant margin (p < 0.00001; Cohen’s d = 0.84).

Theses