Photo courtesy of Princeton University, Office of Communications.
Overton Prize: Olga Troyanskaya
By Justin Mullins, Freelance Science Writer, London, United Kingdom
In the spring of 1997, Olga Troyanskaya was working on a degree in
computer science and biology at the University of Richmond,
Virginia, when she contacted Steven Salzberg, then at Johns Hopkins
University, about a summer internship in his lab devoted to
computational biology. "He took a chance on me, a random student from
another school, and was tremendously inspirational," she says. She
spent the following two summers working in Salzberg's
laboratory, first at Johns Hopkins and then at The Institute for
Genomic Research.
And so began the career of one of the most
promising young researchers in bioinformatics, and a deserving
winner of this year's Overton Prize. "She is one of these forces of
nature, full of energy," says Alfonso Valencia, chair of the ISCB
awards committee.
Troyanskaya herself talks with infectious
enthusiasm about her work. "I've always been fascinated by the
problems of biology," she says. "I was just better at computer
science and math than the wet lab research. And it seemed to me that
there had to be a lot you could contribute with computer science
that you couldn't do with experimental techniques alone."
From the University of Richmond, Troyanskaya moved to Stanford
University to complete a PhD in biomedical informatics, under the
supervision of Russ Altman, a bioinformatician, and David Botstein,
a geneticist. "I wanted a setup that was close to real biological
problems, and I got exactly that. I learned a great deal from both
of them," she says.
In 2003, she moved to Princeton
University as an assistant professor in the Department of Computer
Science and the Lewis-Sigler Institute for Integrative Genomics. "I
am fortunate that the computer science department appreciates the
impact of computing in biology, and that I have many wonderful
colleagues at both the department and in the Institute. I found
several amazing collaborators, and this allowed me to begin a number
of interesting projects."
One of the key problems she focuses
on is making better use of the vast but unwieldy biological datasets
in databases around the world. "So instead of focusing on one study,
we can take the entirety of published data. That allows you to ask
very specific questions in a data-driven way and to develop novel
biological hypotheses," she says.
An important goal is to
predict the function of genes or proteins. There have been many
experimental approaches to determine what genes do and how they are
controlled inside the cell. But this work tends to produce datasets
that are large and noisy. Troyanskaya's approach is to develop new
ways for extracting useful information from these datasets using
techniques from computer science such as machine learning and data
mining.
"Computation by itself is often not enough to
discover new biology but it can direct experimental work," she says.
And she has set up a wet lab to help test and validate the
hypotheses that the computer science helps generate. In 2009, for
example, she used this approach to identify 109 new proteins
involved in mitochondrial biogenesis in yeast.
This combined
approach is one of the things that sets Troyanskaya apart, says
Valencia. "She is one of the first to have come from the
computational side and then moved into the experimental area to
combine both," he says.
Understanding the function of
individual genes is only a small part of a much bigger story. Many
genes and proteins play multiple roles within a cell as parts of
various networks of biological processes. Mapping out these networks
and understanding how they work and interact with each other is yet
another strand of her research. "She has made important
contributions to systems biology," says Valencia.
Evaluating and validating
computational predictions requires broad collaboration to develop
standards and methods for reaching a consensus about the results. To this
end, Troyanskaya is collaborating with the curators of model
organism databases and members of the Gene Ontology Consortium.
Another problem that many researchers face is handling the data
avalanches currently being generated. So Troyanskaya, in
collaboration with Princeton colleagues Kai Li and Moses Charikar,
is looking at ways to better search and visualize these huge
datasets, something that is challenging because of high noise levels
and the enormous volume of the data. "We are developing better ways
to do this," she says.
The awards committee was also
impressed by Troyanskaya's service for the community. She is
involved in the Society's two official journals, PLoS
Computational Biology and Bioinformatics. She also helps run
conferences, organizing them and chairing tracks and program
committees. "That is something that is very much appreciated," says
Valencia. "We are lucky to have her."
And there is surely
more to come. Troyanskaya points to numerous questions that are
driving her research forward. She wants to know, for example, how we
can predict which genes are involved in kidney disease, to
understand their function and their clinical role on a molecular
level. She works on these questions in close collaboration with
experimental researchers, such as Matthias Kretzler and his group
from the University of Michigan, Ann Arbor. And she is passionate
about finding ways to ask questions in a data-driven way, not just
in a knowledge-driven way that relies on what we already know about
biology. "These are the questions that I'm really interested in,"
she says. "And we really haven't yet harnessed the full potential of
our data collections."
This article is excerpted from the
June 2011 issue of PLoS Computational Biology. To link to
the full journal article, please visit
www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1002081