| Keynote Speaker - Dr. David Haussler, Howard Hughes Medical Institute
|
Identifying functional elements in the human genome
by tracing the evolutionary history of the bases:
a key challenge for comparative genomics |
| Additional
contributors: Mathieu Blanchette, Adam Siepel, Krishna
Roskin, Ryan Weber, Mark Diekhans, Francesca Chiaromonte,
Ross Hardison, Jim Kent, Eric Green, and Webb Miller |
| |
A statistical estimate based on a simple measure of
conservation between short orthologous segments
in the human and mouse genomes suggests that about
5% of the human genome may be under purifying selection,
but does not definitively tell us which segments
are under selection. To discover these, we will
soon be able to compare the entire reference human
genome to many other reference vertebrate genomes
(and perhaps to even more distantly related genomes).
The bioinformatic grand challenge here is to use
these data to trace the evolutionary history of each
base in the reference human genome back as far as possible.
Then, by analyzing the detailed patterns of molecular
evolution of segments of the genome over a long
period, we might hope to recognize the signatures
of purifying selection in different kinds of functional
elements, and also detect instances of positive
selection leading to new functions. We are trying
to address this grand challenge through the development
of refined sequence alignment methods and new models
for molecular evolution. These models include context
dependencies between substitutions at adjacent bases,
for example in the process of CpG decay, and allow
for different rates of evolution in different parts
of the chromosomes, as has been confirmed in recent
analysis. Gene-finding hidden Markov models for
DNA have been extended to process multiple alignments
instead of single sequences, and to include context-dependent
models of evolution. Finally, a method of assigning
p-values to conserved elements based on models of
molecular evolution has been developed and applied
to predict functional elements from a multiple alignment
of the NISC Comparative Sequencing Program data
in the human CFTR region. While these methods are
still in early development and the results from
them are preliminary, we are encouraged by the potential
of this line of research, if a bit daunted by the
complexity of the problem.
|
| |
|