Rice Computer Science: <title>Rice Computer Science-Colloquia
[RiceCS]
DEPARTMENT
RESEARCHACADEMICS
PEOPLENEWS
[Rice]
Rice Computer Science
  SEARCH:
  
Rice University
Department of Computer Science
presents

ALIN DOBRA
Cornell University

Processing Aggregate Queries over Continuous Data Streams

Abstract

Data management techniques for data streams has gained much importance recently. The talk addresses techniques for approximately answering aggregate SQL queries over continuous data streams with limited memory. Our method relies on randomizing techniques that compute small "sketches" of the streams that can be used to provide approximate query answers with provable guarantees on the approximation error.

In this talk we give an introduction to sketches and we present our two main contributions. First, we show how existing statistical information about the data (e.g., histograms) can be used to improve the quality of the approximations provided by our algorithms. The key idea is to intelligently partition the domain of the join attributes in a way that provably improves the approximation guarantees. Second, in the presence of multiple queries, intelligent sharing of sketches among concurrent queries can result in dramatic improvements in the utilization of the available sketching space and in the quality of the resulting approximation error. We will discuss some of the novel optimization problems that arise in the context of multi-query sketch sharing and describe our solutions.

(Joint work with Johannes Gehrke, Minos Garofalakis and Rajeev Rastogi)

Alin Dobra is a faculty candidate.

Thursday, April 10 at 4:00 p.m. in DH 1070
Reception preceding the talk at 3:30 p.m. in DH 3076

--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---