Events & News
Colloquium
Category | Lecture |
Date | Tuesday, August 21, 2012 |
Time | 11:00 am |
Concludes | 12:00 pm |
Location | Gould-Simpson 906 |
Speaker | Titus Brown |
Title | Assistant Professor |
Affiliation | Michigan State University |
Streaming Lossy Compression of Biological Sequence Data Using Probabilistic Data Structures
In recent years, next-generation DNA sequencing capacity has completely outstripped our ability to computationally digest the resulting volume of data. Driven by the need to actually analyze the data, our lab has developed a suite of novel data structures and algorithms for graph compression and data reduction; in addition to being darned efficient on their own, our approaches make use of probabilistic data structures that enable substantially lower memory usage than the best possible exact approach. Using these approaches we have been able to scale de novo data assembly approaches down to cloud computing infrastructure, and we have also completed some of the largest de novo assemblies of metagenomes ever done. Last but not least, these approaches show the way to essentially infinite de novo assembly of environmental microbial data.
Biography
Trained by physicists, with a BA in pure math, a PhD in molecular developmental biology, and lots of open source code to my name, I am currently a biologist trapped in a computer science department. I work at the intersection of big sequence data, novel computer science data structures and algorithms, and biological hypothesis generation & validation.
Blog: http://ivory.idyll.org/blog/
Lab web site: http://ged.msu.edu/