Computational Analysis of High-Throughput Genomics Data

EECS Joint Colloquium Distinguished Lecture Series

Prof. David Haussler
CS Dept, U.C. Santa Cruz

Wednesday, November 10, 1999
Hewlett Packard Auditorium, 306 Soda Hall
4:00-5:00 p.m.


Complete genomic sequences for several key model organisms are now available, and the human genome sequence will be nearly finished by next summer. A new interdisciplinary field, bioinformatics, has emerged, focused on building the tools we need to extract the vital information locked in the datasets generated by DNA sequencing, gene expression measurements from "gene chips", and other kinds of high-throughput genomics technology. Early results from the analysis of these genomics data are greatly accelerating research in basic molecular biology and are having a powerful impact on drug discovery, clinical diagnostics, and many other applied areas. Two statistical techniques that have recently been used in the analysis of genomics data are Hidden Markov Models (HMMs) and Support Vector Machines (SVMs). We will discuss how these methods are used and present a new way of exploiting HMMs and SVMs in combination that shows significant promise.