The quantitative analysis
of biological sequence data is based on methods from statistics coupled
with efficient algorithms from computer science. The mathematical field
of algebra provides a framework for unifying many of the seemingly disparate
techniques used by computational biologists. This book offers an introduction
to this mathematical framework and describes tools from computational
algebra for designing new algorithms that give exact, accurate
results. These algorithms can be applied to biological problems such as aligning
genomes, finding genes and constructing phylogenies.
The first part of this book
consists of four chapters on the themes of Statistics, Computation,
Algebra and Biology. These chapters offer a speedy, self-contained introduction
to the emerging field of algebraic statistics and its applications to
genomics. Specific topics that are discussed include graphical models,
Gröbner bases, maximum likelihood estimation, convex polytopes,
phylogenetic combinatorics, tropical geometry and DNA sequence analysis.
In the second part, the four
themes are combined and developed to tackle real problems in computational
genomics. Written by participants in a graduate seminar at Berkeley,
this part consists of 18 chapters that offer in-depth case studies at the
forefront of research in this area. Topics include parametric inference
(with applications to sequence alignment), Markov chains and hidden
Markov models (with emphasis on the EM algorithm), and new methods for
phylogeny and comparative genomics. Also included are descriptions of
software with examples.
As the first book in this
exciting and dynamic area, it will be welcomed as a text for self-study
or for advanced undergraduate and beginning graduate courses.