EECS Joint Colloquium Distinguished Lecture Series

Wednesday, October 15, 2003
Hewlett Packard Auditorium, 306 Soda Hall
4:00-5:00 p.m.

Professor Gene Myers

Electrical Engineering and Computer Sciences Dept.,
UC Berkeley


Efficient Algorithms for Comparing Genomes




We anticipate having the euchromatic portions of the genomes of twelve species of Drosophila in the next two years. A comparative genomics agenda begins with the alignment of these genomes. We present a new algorithm for rapidly finding all locally aligned segments that have a submatch above a given level of identity. We then give an algorithm for combining these local alignments into a global alignment based on an analysis of which parts of the genome are repetitive. Finally, we refine the global alignment by showing how to run the classic dynamic programming algorithm in an irregularly shaped “zone” using only linear space.

We conclude by discussing the shortcomings of such an “identity”-based approach, introducing the idea of “biologically” informed alignments, and sketching possible strategies.


Gene Myers joined the faculty of Computer Science at the University of California, Berkeley at the start of 2003. He was formerly Vice President of Informatics Research at Celera Genomics for four years, where he and his team determined the sequences of the Drosophila, Human, and Mouse genomes using the whole genome shotgun technique that he advocated in 1996. Prior to that, Gene was on the faculty of the University of Arizona for 18 years; he received his Ph.D. in Computer Science from the University of Colorado in 1981. His research interests include design of algorithms, pattern matching, computer graphics, and computational molecular biology. His most recent academic work has focused on algorithms for the central combinatorial problems involved in DNA sequencing, and on a wide range of sequence and pattern comparison problems. Among the tools he has developed are BLAST -- a widely used tool for protein similarity searches, FAKtory -- a system to support DNA sequencing projects, Anrep -- a pattern matching language for applications in molecular biology, and Mac- & PC-Molecule -- a molecular visualization tool for Apple and Wintel computers. He was awarded the IEEE 3rd Millennium Achievement Award in 2000, the Newcomb Cleveland Best Paper in Science award in 2001, and the ACM Kanellakis Prize in 2002. He was voted the most influential person in bioinformatics in 2001 by Genome Technology Magazine and was elected to the National Academy of Engineering in 2003.