Rice University
The Departments of Bioengineering and Computer Science
present
Jeremy Buhler
Dept. of Computer Science and Engineering
University of Washington, Seattle
Searching Biosequences Using Random Projection
Abstract
The last few years have seen explosive growth in the amount of
available genomic DNA sequence, including nearly complete sequencing
of the human and other genomes. This growth brings new urgency to
computational biology and bioinformatics: we must scale up the
computational tools we use to ANNOTATE raw genomic DNA, that is, to
search raw sequence for biologically meaningful features such as genes
and their regulatory sites. The same growth in sequence availability
threatens to overwhelm the working biologist, who must decide which
portions of the sequence are interesting enough to merit further study
in the lab. Hence, as the amount of sequence grows, annotation tools
must become MORE aggressive in discovering potentially interesting
sequence features amid the background noise; otherwise, important but
subtle features will be missed and will therefore be unable to compete
for the investigator's attention.
With these considerations in mind, Buhler has developed new, more
aggressive algorithms in the area of comparative annotation -- finding
features that are approximately repeated multiple times within one or
more genomic sequences. These algorithms exploit a randomized
technique -- random projection -- for efficiently finding similar sets
of substrings in a long string. Using random projection, he devises
annotation algorithms that run in reasonable time yet do not share the
bias, present in many commonly used annotation methods today, against
finding certain types of sequence features.
Specific algorithms covered in this talk will include the
LSH-ALL-PAIRS algorithm for discovering similar subregions of long
sequences (WITHOUT requiring that these regions contain long exact
word matches), as well as the PROJECTION algorithm for finding subtle
regulatory motifs that are intractable for existing search algorithms.
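
As a rough illustration of the random projection idea described above, the
Python sketch below hashes every length-l substring of a sequence by the
characters it carries at k randomly chosen positions; substrings that land in
the same bucket become candidate similar pairs. This is only a minimal sketch
under stated assumptions, not the LSH-ALL-PAIRS or PROJECTION algorithm from
the talk. The function names and the parameters l, k, and trials are
hypothetical and chosen for illustration.

```python
import random
from collections import defaultdict


def bucket_by_projection(seq, l, positions):
    """Group start offsets of all length-l substrings of seq by the
    characters found at the given positions (hypothetical helper)."""
    buckets = defaultdict(list)
    for start in range(len(seq) - l + 1):
        word = seq[start:start + l]
        # The projection keeps only the characters at the chosen positions.
        key = "".join(word[p] for p in positions)
        buckets[key].append(start)
    return buckets


def candidate_pairs(seq, l, k, trials, seed=0):
    """Collect pairs of offsets whose substrings collide under at least
    one of `trials` random projections onto k of the l positions."""
    rng = random.Random(seed)
    pairs = set()
    for _ in range(trials):
        positions = sorted(rng.sample(range(l), k))
        buckets = bucket_by_projection(seq, l, positions)
        for starts in buckets.values():
            for i in range(len(starts)):
                for j in range(i + 1, len(starts)):
                    pairs.add((starts[i], starts[j]))
    return pairs


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))
```

Two substrings that differ in only a few positions collide whenever the chosen
positions happen to avoid every mismatch, so repeating the projection several
times makes it likely that similar pairs are found even when they share no
long exact word. Any pair that collides agrees on the k projected positions
and therefore differs in at most l - k positions.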
Monday, April 9, 2001 at 4:00 in Keck 102
A reception will be held BEFORE the talk at 3:30 outside of Keck 102.
About Jeremy Buhler
Jeremy Buhler was an undergraduate in Computer Science at Rice from 1992 to
1996, during which time Alex Schaffer (now of NIH) encouraged his
interest in bioinformatics research through a collaborative project with Baylor
College of Medicine. He then entered the Ph.D. program in the
Computer Science Department at the University of Washington in
Seattle, where he was first advised by Dick Karp and later by
Martin Tompa. In his research, he has collaborated with UW's
Department of Molecular Biotechnology and, more recently, with the
independent Institute for Systems Biology.
His current interests within bioinformatics include genomic sequence
annotation and inferring models of biological regulatory networks. He
devotes much of his non-academic energy to his cat, Figaro.
Jeremy Buhler is a faculty candidate.