biosketch:
Ewan Birney trained as a biochemist at Oxford
University, and did a Ph.D. in gene prediction
with Richard Durbin at the Wellcome Trust Sanger
Institute. He moved to the EBI in 2000 to coordinate
the EBI’s contribution to Ensembl, a joint
project with the Sanger Institute to provide a
comprehensive, automatically generated annotation
for the genomes of higher animals. Ensembl is
widely used by biomedical researchers, serving
around a million pages a week, and has been used
to generate gene sets for several genomes, including
human, mouse, rat and chicken. In a collaboration
with Lincoln Stein at the Cold Spring Harbor Laboratory
(NY, USA), Ewan’s team also produces Reactome
– a knowledgebase of human biological pathways.
Other collaborations include the ENCODE project,
a detailed gene anatomy of a specified region
of the human genome; and the BioSapiens Network
of Excellence. Ewan actively supports the open
source movement: he is co-leader of the open-source
bioinformatics toolkit Bioperl and president of
the Open Bioinformatics Foundation, which supports
the development of several bioinformatics toolkits.
talk title: Genomes
to Systems Biology
abstract: Modern
biology has been revolutionized by the sequencing
of genomes across the tree of life. However, this
immensely rich data has brought its own challenges.
These range from conceptually mundane and yet
critical engineering tasks through to genuine
changes in our scientific understanding of how
life works. In this talk I will present two projects,
Ensembl (www.ensembl.org)
and Reactome (www.reactome.org),
the first of which is focused on analyzing genome
sequence and the second of which is a starting point
for building rich structures representing human
pathways on top of the genome.
biosketch:
Janet Thornton has been Director of the EBI since
October 2001. Her active research group focuses
on using computational approaches to understand
biology (especially proteins) at the molecular
level, and her research combines the use of genomic,
transcriptomic, structural and metabolomic data
with the aim of discovering how molecules interact
to perform their functions, and how these functions
evolved. Under her directorship, the EBI has expanded
into several new research areas and has secured
funding to provide space for its burgeoning staff
base. She works tirelessly to raise awareness
of the need for a stable bioinformatics infrastructure
in Europe. BioSapiens, the European-Union-funded
Network of Excellence that she coordinates, is
enabling bioinformaticians throughout Europe to
work together and with experimental biologists
to annotate genome data. She is a Fellow of the
Royal Society, a Member of the European Molecular
Biology Organization, a Foreign Associate of the
US National Academy of Sciences and a Commander
of the British Empire.
talk title:
From Proteins to Life - Old and New Challenges
abstract: Since
the early days of my research, when 'bioinformatics'
was not yet a recognised discipline and almost
no biologists used computers, the challenge of
understanding how the sequence of a protein determines
its structure, and how each structure performs
its own biological function and works together
with other proteins to orchestrate life, was already
clearly stated. From having only 20 protein structures
when I started, to over 30,000 available in the
Worldwide Protein Databank (wwPDB) today, our
understanding has grown enormously, though the
original challenges still remain. Initially we
struggled even to find words and robust parameters
to describe the structures and to develop computer
tools to display, simplifying where appropriate,
and analyse these beautiful but complex arrangements
of atoms and molecules. New approaches were developed
to validate the structures (PROCHECK) and to compare
molecules quantitatively in three dimensions,
allowing for insertions, deletions and mutations.
Using our tools (Promotif), many analyses of motifs
were published, defining common patterns that
recur in proteins and may be markers of biological
function (e.g. metal binding sites) or structural
motifs that are energetically stable (e.g. β-turns).
From the beginning our approach contrasted with
that common amongst structural biologists who
were determining structures (a process which was
arduous and often took 5 years or more and many
graduate students!). We analysed many structures,
rather than focussing on a single protein or structure,
leading us to seek better ways to store and query
the data and thus to use relational databases
in the mid-80s. We despaired about the lack of
data consistency of the old PDB files and lack
of clarity in defining data items. Repeatedly
we tried to use the approaches common in physics
and chemistry to model structures, but were continuously
forced towards a more heuristic data-driven approach
by the complexity, size and subtlety of these
biological molecules and their interactions. Ultimately
this led us to develop a heuristic classification
of protein structure domains (CATH), partly to
organise the data and make it manageable, but
also to better understand how proteins evolve
to perform their functions. Today we focus increasingly
on understanding higher order complexes and especially
the relationship between structure and function.
The progress towards improvements
in handling, analysing and understanding the structural
data is mirrored in other types of data now available
to biologists (such as transcriptome and metabolome
data), even in other branches of science, like
astronomy, where stars are classified, or chemistry,
where molecular databases are essential. At the
EBI we are tackling all these issues for the core
biomolecular data resources we host, seeking to
improve data validation, quality, accessibility
and integration.
My initial studies have led me down
paths that were only distant dreams, when starting
out as an undergraduate physicist. Today we not
only consider structural and biophysical data,
but are drawn in to look at other high-throughput
data, such as expression data, metabolic data
and biological pathways and networks. Our goals
have broadened and become increasingly ambitious
in trying to use these data, not just to understand
the molecules, but also to understand more
about complex biological systems, such as bacterial
evolution, catalysis, the molecular basis of diseases
and ageing.
In this award lecture, which I am
honoured to have been asked to present,
I shall look back over the major challenges and
developments we have faced in structural bioinformatics,
acknowledging the many scientists with whom I
have had the pleasure to collaborate and look
forward to our current interests and future challenges.
See http://www.ebi.ac.uk/Thornton
for references and summary of current research
biosketch:
Howard Cash, President of Gene Codes and Gene
Codes Forensics, Inc. Howard Cash was
born in Detroit, studied musical composition and
conducting at the University of Pennsylvania and,
after a period as Assistant Conductor with the
Pennsylvania Opera Theater, Psychoacoustics at
Stanford.
He has been at the forefront of
commercial bioinformatics development since 1984.
He joined IntelliGenetics where some of the seminal
biotech software tools were developed including
the "IG-Suite" set of DNA and protein
analysis modules and the "Stratagene"
expert system for clone management. In 1988, he
founded Gene Codes Corporation where he remains
as President and CEO. He designed and developed
the “Sequencher” program used in thousands
of academic and commercial DNA sequencing labs
in forty-four countries.
In 1997, Governor John Engler appointed
him to the Michigan State Commission on Genetics,
Privacy and Progress. The commission
recommended legislation on a host of issues related
to genetic information and privacy and Cash chaired
the committee on Property Rights, Ownership,
Collection, Use and Storage [POCUS].
All recommendations that have come from the thirteen-member
commission have been signed into State law.
Shortly after 9-11, Cash was asked
to put his company at the disposal of the New
York City Office of Chief Medical Examiner and
to develop new software for DNA analysis and data
handling for the purpose of identifying the remains
of those killed at the World Trade Center.
A new corporation called Gene Codes Forensics,
Inc. was formed to focus exclusively on this project.
It has been a daunting task from a technical standpoint,
and has also raised ethical and legal issues involving
jurisdiction, family rights and genetic privacy.
The Mass-Fatality Identification System ("M-FISys,"
pronounced like emphasis) was created
and remains the most advanced tool in the world
for combining DNA technologies for human identification
including autosomal Short Tandem Repeat [STR]
analysis, mitochondrial sequence profiling and
forensic SNP matching.
In January 2005, following the Boxing
Day earthquake and ensuing tsunami, Cash and
Gene Codes Forensics were enlisted to
help identify those killed in Thailand.
Information technology tools developed for 9-11
are a tremendous advantage in the response to
this disaster, and political challenges have proven
to be greater than scientific ones.
Among many awards, Cash has received
the Arthur Andersen/MTC "Leading Edge Technology
Award," the prestigious Ernst and Young “Entrepreneur
of the Year” award for S.E. Michigan, the
“Person of the Year” award from Genome
Technology magazine, "Medal for Extraordinary
Service to Humanity" from the Bear Search
and Rescue Foundation, and in 2005, the Merlanti
Prize for "Best Practices in Business Ethics."
Cash has served on several boards,
including the Hot Springs Music Festival,
9-11 WVFA and CEBOS Corporation.
He is a member of the HUGO Council.
talk title: Biology
of Life and Death: Disaster, DNA and the Information
Science of Human Identification
abstract: I have
been working professionally in bioinformatics
since joining IntelliGenetics in 1984. That same
year, the remains of a U.S. serviceman from the
Vietnam War were interred in the Tomb of the Unknown
Soldier at Arlington National Cemetery. My interest
in the scientific niche of DNA forensics began
in the 1990s with speculation that the remains
of that soldier might be identified. Air Force
Lt. Michael J. Blassie was identified in June
1998 through mitochondrial DNA [mtDNA] testing
and returned to his family for burial in St. Louis.
By the time the Vietnam Unknown was
identified, I was at Gene Codes Corporation. There
we developed tools for mtDNA profiling which became
standards at major forensic biology centers such
as the Armed Forces DNA Identification Laboratory
[AFDIL], the FBI laboratory and the Institute
for Forensic Medicine in Innsbruck. The community
of forensic users was small, but the analysis
functions created to support sequencing for comparison
to a reference mtDNA sequence had other applications
including comparative genomics and clinical HIV
genotyping.
Forensics was a tiny part of our
work until Sept 11, 2001. When the World Trade
Center towers fell, it was initially thought that
five- to ten-thousand people might have been killed,
though the final number of fatalities is now believed
to be 2,749. Because of the sheer mechanical violence
of the collapse, nearly 20,000 samples were delivered
to the Medical Examiner's office for identification.
In most cases, DNA was the only possible way to
identify the remains, and existing DNA profile
matching tools were not designed to handle a problem
of this scale. Because we had both the domain
experience and the engineering capacity, we were
asked by the City of New York to make available
essentially all of our technical resources to
meet their needs for DNA profile information management.
The Mass-Fatality Identification
System, or "M-FISys" (pronounced like
"emphasis") was developed on a brutally
accelerated timeline using Extreme Programming
methodologies and close collaboration with forensic
biologists in the NYC Office of Chief Medical
Examiner [OCME]. Programming began in early November,
driven by constantly evolving priorities and needs
of the agency's front line scientists, the WTC
DNA Identification Unit. By December 12, 2001,
only 105 identifications had been made using DNA
methodologies. The next day, when the first version
of M-FISys was delivered to the OCME, 55 matches
were found that would be confirmed as new identifications
by Dr. Charles Hirsch, the city's Chief Medical
Examiner.
M-FISys continued its rapid development,
combining mtDNA sequencing with autosomal Short
Tandem Repeat [STR] analysis and more recently
autosomal SNP profiling and Y-STR typing. Since
persons would be identified either to direct references
(such as DNA recovered from a victim's toothbrush)
or familial profiles, both direct matching and
complex kinship analysis had to be supported.
As metadata errors were discovered (e.g., toothbrushes
brought in from the wrong family member, family
donors reporting erroneous blood relationships,
or commingled remains with multiple profiles)
we experienced a continuous race to implement
data QC tools to catch errors before they could
result in a misidentification. Badly degraded
samples were tested and retested with ever more
sensitive assays. The efforts were exhaustive
and it was not until February 2005 that the Medical
Examiner declared that every scientifically reasonable
attempt had been made to identify each bone and
tissue sample. 1,594 victims and 10,769 individual
remains had been identified; 9,728 of those remains
(90.33%) could be identified by no means other
than DNA typing.
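As a rough illustration of the direct matching and data QC described above, the following Python sketch compares STR profiles at the loci typed in both samples and flags loci that show more than two alleles, a possible sign of commingled remains. It is not the M-FISys implementation: the function names, the minimum number of shared loci, and the allele values are all assumptions made for this example, and kinship analysis is not covered.

```python
from typing import Dict, FrozenSet, List

# Hypothetical representation: STR locus name -> set of alleles (repeat counts).
Profile = Dict[str, FrozenSet[float]]


def direct_match(sample: Profile, reference: Profile, min_loci: int = 3) -> bool:
    """Return True if both profiles agree at every locus typed in both.

    Degraded samples often yield partial profiles, so only shared loci are
    compared. min_loci is an illustrative threshold, not a forensic standard.
    """
    shared = set(sample) & set(reference)
    if len(shared) < min_loci:
        return False
    return all(sample[locus] == reference[locus] for locus in shared)


def possibly_commingled(sample: Profile) -> List[str]:
    """List loci with more than two alleles, which one person cannot carry."""
    return [locus for locus in sorted(sample) if len(sample[locus]) > 2]


# Illustrative data only: a partial profile from remains and a direct reference.
remains: Profile = {
    "D3S1358": frozenset({15, 16}),
    "vWA": frozenset({17}),
    "FGA": frozenset({21, 24}),
}
toothbrush: Profile = {
    "D3S1358": frozenset({15, 16}),
    "vWA": frozenset({17}),
    "FGA": frozenset({21, 24}),
    "TH01": frozenset({6, 9.3}),
}
mixed: Profile = {"D3S1358": frozenset({14, 15, 16})}

print(direct_match(remains, toothbrush))
print(possibly_commingled(mixed))
```

A real system would weigh a match statistically (for example with population allele frequencies) rather than return a plain yes or no; the sketch only shows the shape of the comparison.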
We believed that the close of the
World Trade Center effort meant a much needed
respite and a return to a normal work schedule
for my staff, but our rest was short lived.
The earthquake and tsunami in South Asia on Dec
26, 2004 presented new challenges in human identification,
and once again our phones began to ring. The DNA
analysis tools created to respond to a terrorist
attack would be applicable to a natural disaster,
but new analytical functions would be needed to
address new and different laboratory challenges
and to interact with the local systems.
This keynote address will cover
some of the software engineering methodologies,
the design and computational strategies, and the
startlingly intense geopolitical pressures that
characterized the efforts to apply dispassionate
scientific methods to terrible human tragedy.
It also highlights just one of the ways that the
field we all work in can dramatically impact organizations,
society and individuals.
biosketch:
Peter Hunter completed an engineering degree in
1971 in Theoretical and Applied Mechanics at the
University of Auckland, New Zealand, a Master
of Engineering degree in 1972 (Auckland) on solving
the equations of arterial blood flow and a DPhil
(PhD) in Physiology at the University of Oxford
in 1975 on finite element modeling of ventricular
mechanics. His major research interests since
then have been modelling many aspects of the human
body using specially developed computational algorithms
and an anatomically and biophysically based approach
which incorporates detailed anatomical and microstructural
measurements and material properties into the
continuum models. The interrelated electrical,
mechanical and biochemical functions of the heart,
for example, have been modelled in the first 'physiome'
model of an organ. As the current co-Chair of
the Physiome & Bioengineering Committee of
the International Union of Physiological Sciences
he is helping to lead the international Physiome
Project which aims to use computational methods
for understanding the integrated physiological
function of the body in terms of the structure
and function of tissues, cells and proteins. He
established the first undergraduate biomedical
engineering program in New Zealand in 2000 and
the Bioengineering Institute in 2001. He is currently
Director of the Bioengineering Institute at the
University of Auckland and Director of Computational
Physiology at Oxford University.
talk title: Computational
Physiology and the IUPS Physiome Project
abstract: The International
Union of Physiological Sciences (IUPS) Physiome
Project is an internationally collaborative open-source
project to provide a public domain framework for
computational physiology, including the development
of modeling standards, computational tools and
web-accessible databases of models of structure
and function at all spatial scales [1,2,3]. It
aims to develop an infrastructure for linking
models of biological structure and function across
multiple levels of spatial organization and multiple
time scales. The levels of biological organisation,
from genes to the whole organism, include gene
regulatory networks, protein-protein and protein-ligand
interactions, protein pathways, integrative cell
function, tissue and whole heart structure-function
relations. The whole heart models include the
spatial distribution of protein expression.
The project requires the creation
of web-accessible databases of mathematical models
of structure and function at spatial scales which
encompass nano-scale molecular events to the meter
scale of the intact heart and torso, a range of
10^9, and temporal scales from Brownian motion
(microseconds) to a human lifetime (10^9 s), a range
of 10^15. Clearly this cannot be represented by
one model but rather a hierarchy of models and
modeling approaches such as stochastic models
of ion channels and receptors for ligand binding
calculations, ordinary differential equation lumped
cell models, and partial differential equation
continuum models at the tissue and organ levels.
It also requires the model parameters at one scale
to be linked to detailed models of structure and
function at a smaller spatial scale – hence
the need for "multi-scale modeling."
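The two ranges quoted above follow directly from the exponents of the extremes. A minimal Python check, assuming the nano scale is taken as 10^-9 m, a microsecond as 10^-6 s, and a human lifetime as roughly 10^9 s (the variable and function names are introduced here only for illustration):

```python
# Base-10 exponents of the extremes named in the abstract (assumed values).
NANOMETRE_EXP = -9     # nano-scale molecular events, about 10^-9 m
METRE_EXP = 0          # intact heart and torso, about 1 m
MICROSECOND_EXP = -6   # Brownian motion, about 10^-6 s
LIFETIME_EXP = 9       # human lifetime, about 10^9 s


def orders_of_magnitude(small_exp: int, large_exp: int) -> int:
    """Number of decades spanned between two powers of ten."""
    return large_exp - small_exp


spatial_range = orders_of_magnitude(NANOMETRE_EXP, METRE_EXP)
temporal_range = orders_of_magnitude(MICROSECOND_EXP, LIFETIME_EXP)

print(f"spatial range: 10^{spatial_range}")    # prints "spatial range: 10^9"
print(f"temporal range: 10^{temporal_range}")  # prints "temporal range: 10^15"
```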
The long term challenge for the
Physiome Project is to build a modeling framework
in which the effect of a gene mutation can be
modeled all the way from its effect on protein
structure and function to how the altered properties
of the protein affect a cellular process such
as signal transduction, and how the changed properties
of that process alter the function of tissues
and organs. There will be many other benefits
from this integrative framework. Understanding
how model parameters are affected by individual
variation, by embryological growth, by ageing
and by disease, for example, will bring benefits
to the design of medical devices, the diagnosis
and treatment of disease and the development of
new drugs.
References
[1] Hunter, P.J., Robbins, P.
and Noble, D. The IUPS Human Physiome Project.
European Journal of Physiology. 445 (1), 1-9,
2002.
[2] Hunter, P.J. and Borg, T.K.
Integration from proteins to organs: The Physiome
Project. Nature Reviews Molecular and Cell Biology.
4, 237-243, 2003.
[3] Crampin, E.J., Halstead,
M., Hunter, P.J., Nielsen, P.M.F., Noble, D.,
Smith, N.P. and Tawhai, M. Computational physiology
and the Physiome Project. Exp. Physiol. 89,
1-26, 2004.
biosketch:
Jill P. Mesirov is associate director and chief
informatics officer of the Broad Institute of
MIT and Harvard where she directs the Bioinformatics
and Computational Biology Organization. She is
also adjunct professor of bioinformatics at Boston
University.
Mesirov is a computational scientist
who has spent many years working in the area of
high performance computing on problems that arise
in science, engineering, and business applications.
Her current research interest is computational
biology with a focus on algorithms and analytic
methodologies for pattern recognition and discovery
with applications to cancer genomics, genome analysis
and interpretation, and comparative genomics.
In addition, Mesirov is committed to the development
of practical, accessible software tools to bring
these methods to the general biomedical research
community.
Mesirov came to the Whitehead Institute/MIT
Center for Genome Research, now part of the Broad
Institute, in 1997 from IBM where she was manager
of computational biology and bioinformatics in
the Healthcare/Pharmaceutical Solutions Organization.
Before joining IBM in 1995, she was Director of
Research at Thinking Machines Corporation for
ten years.
Mesirov is a trustee of the Institute
for Defense Analyses, Vice Chair of the Interoperable
Informatics Infrastructure Consortium (I3C) and
a member of review committees for the Department
of Energy’s Argonne and Los Alamos National
Laboratories. She is a fellow of the American
Association for the Advancement of Science, and
serves on numerous academic and corporate scientific
advisory and journal editorial boards.
talk title: Gene
Expression Analysis: A Knowledge-based Approach
abstract: DNA microarrays
now make it possible to capture the expression
pattern of all the genes in the genome in a single
experiment. Genome-wide expression analysis is
at the heart of global genomic approaches to biomedical
research and appears in over 1000 published papers
a year. The challenge that now faces us is not
obtaining these molecular profiles, but interpreting
them to gain a better understanding of underlying
biological processes.
We will describe how prior biological
knowledge can be incorporated into a robust, quantitative
approach for analyzing mRNA profile data and used
to shed light on the mechanisms of disease.
biosketch:
Satoru Miyano, Ph.D., is a Professor at the Human
Genome Center, Institute of Medical Science, University
of Tokyo. He obtained his Ph.D. in Mathematics from
Kyushu University in 1984. His research group
is developing computational methods for inferring
gene networks from microarray data and other biological
data, e.g., protein-protein interactions, promoter
sequences. The group has also developed a software
tool called Genomic Object Net for modeling and
simulation of various biological systems. This
software is now commercialized as Cell Illustrator.
Currently, his research group is working intensively
to develop the gene network of human endothelial
cells by knocking down hundreds of genes. With
these technological achievements, his research
is now heading toward the creation of
Systems Pharmacology.
talk title: Computational
Challenges for Gene Networks
abstract: Gene
networks play a central role in systems biology.
This talk presents two computational approaches
related to gene networks.
First, computational methods for
estimating gene networks from microarray gene
expression data are presented. We consider microarray
data obtained by various perturbations such as
gene disruptions, shocks, drug responses,
time-course measurements, etc. The idea is to
combine the Bayesian network approach with nonparametric
regression, where genes are regarded as random
variables and the nonparametric regression enables
us to capture both linear and nonlinear relationships
between genes. As a criterion for choosing good
networks, we defined an information criterion
called the BNRC (Bayesian network and Nonparametric
Regression Criterion) score. Naturally, the sole
use of microarray data has limitations on gene
network estimation. To improve the biological
accuracy of estimated gene networks, we built a
general framework that extends this method to
use other genome-wide biological information
such as sequence information on promoter regions
and protein-protein interactions. The problem
of finding optimal Bayesian networks is known
to be computationally intractable. We also developed
an algorithm for searching and enumerating optimal
and suboptimal Bayesian networks in feasible time
on supercomputers. Computational experiments with
this search algorithm have provided evidence
of the biological rationality of our computational
strategy. Gene networks were then applied to the
search for drug target genes. By exploring gene
networks estimated from microarray data based
on gene disruptions and drug doses, a novel drug
target gene was identified and validated. For
this purpose, we developed software for visualizing
and analyzing gene networks, which played an important
role in the discovery. This suggests that our gene
network approach can be an effective strategy for
finding drug target genes.
Second, a software tool for modeling
and simulating gene networks, based on
the notion of Petri nets, is presented.
An important challenge is to create a software
platform with which scientists in biology and medicine
can comfortably model and simulate dynamic causal
interactions and processes in the cell(s). To
this end, we developed the software Cell Illustrator
(http://www.gene-networks.com),
which uses the notion of Hybrid Functional Petri
Net with extension (HFPNe) as its architecture.
Cell Illustrator has a biology-oriented GUI, and
very complex biological processes can be modeled
as if with a drawing tool. Further, we can
create a personalized visualization of simulation
by developing an XML document for animation. Its
effectiveness has been demonstrated by modeling
various biological processes.
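For readers unfamiliar with Petri nets, the following Python sketch shows the basic idea on a toy gene-expression example: places hold tokens (here, molecule counts) and a transition fires only when its input places hold enough tokens. This is a plain discrete Petri net written for illustration; it is far simpler than the HFPNe architecture named above and does not reflect how Cell Illustrator is implemented. The class, place and transition names are assumptions.

```python
from typing import Dict

Marking = Dict[str, int]  # place name -> number of tokens


class Transition:
    """A discrete Petri net transition with weighted input and output arcs."""

    def __init__(self, name: str, inputs: Marking, outputs: Marking) -> None:
        self.name = name
        self.inputs = inputs    # tokens consumed from each input place
        self.outputs = outputs  # tokens produced in each output place

    def enabled(self, marking: Marking) -> bool:
        """A transition is enabled when every input place has enough tokens."""
        return all(marking.get(place, 0) >= n for place, n in self.inputs.items())

    def fire(self, marking: Marking) -> Marking:
        """Return the new marking after firing; the old marking is not modified."""
        if not self.enabled(marking):
            raise ValueError(f"transition {self.name!r} is not enabled")
        new_marking = dict(marking)
        for place, n in self.inputs.items():
            new_marking[place] = new_marking.get(place, 0) - n
        for place, n in self.outputs.items():
            new_marking[place] = new_marking.get(place, 0) + n
        return new_marking


# Toy model: translation keeps the mRNA and adds one protein;
# degradation removes one mRNA.
translation = Transition("translation", {"mRNA": 1}, {"mRNA": 1, "protein": 1})
degradation = Transition("mRNA_degradation", {"mRNA": 1}, {})

state = {"mRNA": 1, "protein": 0}
state = translation.fire(state)
state = translation.fire(state)
state = degradation.fire(state)

print(state)                       # prints {'mRNA': 0, 'protein': 2}
print(translation.enabled(state))  # prints False
```

Hybrid and functional extensions add continuous quantities and arbitrary rate functions on top of this token-passing core, which is what makes them suitable for the dynamic simulations described in the abstract.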
biosketch:
Pavel Pevzner holds the Ronald R. Taylor Chair
in Computer Science. He joined the UCSD faculty
in 2000, following five years in the University
of Southern California's Mathematics and Computer
Science departments. From 1992-95, he was an associate
professor at Pennsylvania State University. From
1990-92 Pevzner was a postdoctoral researcher
at USC. He received his Ph.D. in 1988 from the
Moscow Institute of Physics and Technology. Pevzner
is the author of the book "Computational
Molecular Biology: An Algorithmic Approach"
(MIT Press, 2000) and also "Introduction
to Bioinformatics Algorithms", co-authored
with Neil Jones (MIT Press, 2004). He is an executive
editor of the "Journal of Computational Biology,"
and co-founder of the International Conference
on Research in Computational Biology (RECOMB).
talk title: Transforming
Mice into Men: Fragile versus Random Breakage
Models of Chromosome Evolution
abstract: Despite
some differences in appearance and habits, men
and mice are genetically very similar. In a pioneering
paper, Nadeau and Taylor (1984) estimated that
surprisingly few genomic rearrangements (about
200) have happened since the divergence of human
and mouse 75 million years ago. The genomic sequences
of human and mouse provide evidence for a larger
number of rearrangements than previously thought
and shed some light on previously unknown features
of mammalian evolution. In particular, they provide
evidence for extensive re-use of breakpoints from
the same relatively short regions and reveal a
great variability in the rate of micro-rearrangements
along the genome. Our analysis also implies the
existence of a large number of very short "hidden"
synteny blocks that were invisible in comparative
mapping data and were ignored in previous studies
of chromosome evolution. These results suggest
a new model of chromosome evolution that postulates
that breakpoints are chosen from relatively short
fragile regions that have much higher propensity
for rearrangements than the rest of the genome.
biosketch:
Gunnar von Heijne has a long-standing interest
in membrane proteins, and has in particular contributed
to the understanding of their membrane assembly
and topology. In addition to experimental molecular
biology work, he has also taken part in the development
of widely used bioinformatics prediction methods
such as SignalP, TargetP, TopPred and TMHMM. He
has published around 240 scientific articles,
and is listed in the ISI ‘Highly Cited’ database.
talk title: Membrane
Proteins in vivo and in silico
- Getting the Best of Two Worlds
abstract: Membrane
protein research has gained a lot of momentum
in recent years: high-resolution structures are
being produced at an increasing rate, membrane
proteomics is coming on line, and membrane proteins
are recognized as drug targets of major importance.
Bioinformatics has always been an integral part
of the developments in the field, and today provides
the tools necessary to identify the membrane complement
of proteomes and to predict topologies and –
in lucky cases – full 3D models of membrane
proteins.
As in so many other areas, much
is to be gained from a tighter integration between
bioinformatics and experimental studies of membrane
proteins. In our own work, we are reaching both
towards proteome-wide studies of membrane proteins
and towards a quantitative understanding of the
cellular processes underlying the integration
of proteins into biological membranes; in both
cases, experimental and theoretical approaches
must be combined to push forward.
References
Hessa, T., Kim, H., Bihlmaier,
K., Lundin, C., Boekel, J., Andersson, H., Nilsson,
I.M., White, S.H., and von Heijne, G. (2005)
Recognition of transmembrane helices by the
endoplasmic reticulum translocon. Nature 433,
377-381.
Daley, D.O., Rapp, M., Granseth,
E., Melén, K., Drew, D., and von Heijne,
G. (2005) Global topology analysis of the Escherichia
coli inner membrane proteome. Science, in press.