Sunday Poster Presentations: Methods

53 - New Visualization Tools for Biomolecular Sequence Analysis
Dimitris Anastassiou, Columbia University
We introduce new computational and visual tools for frequency-domain biomolecular sequence analysis, improving upon traditional Fourier analysis performance in distinguishing coding from noncoding regions in DNA sequences. Color maps identify not only the existence of protein coding areas, but also the coding direction and the reading frame for each of the exons, from the phase of the Fourier Transform.
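
As a rough illustration of the underlying idea (not the authors' tool), the sketch below measures the period-3 Fourier component that tends to be stronger in protein-coding DNA. The function names, the window size, and the use of four binary base-indicator sequences are assumptions made for this example; the phase of each coefficient is the quantity the abstract relates to reading frame.

```python
import cmath

BASES = "ACGT"

def period3_coefficient(seq, base):
    """Complex DFT coefficient at frequency 1/3 of the 0/1 indicator sequence for `base`."""
    return sum(
        cmath.exp(-2j * cmath.pi * k / 3)
        for k, ch in enumerate(seq.upper())
        if ch == base
    )

def period3_power(seq):
    """Sum over the four bases of |coefficient|^2, divided by the sequence length."""
    n = len(seq)
    if n == 0:
        return 0.0
    total = sum(abs(period3_coefficient(seq, b)) ** 2 for b in BASES)
    return total / n

def sliding_period3(seq, window=351, step=3):
    """Return (start, power) pairs for windows sliding along the sequence."""
    return [
        (start, period3_power(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]

def coefficient_phase(seq, base):
    """Phase (radians) of the period-3 coefficient; shifting the frame rotates it."""
    return cmath.phase(period3_coefficient(seq, base))
```

Windows with a high `period3_power` are candidate coding regions; shifting a periodic pattern by one position rotates `coefficient_phase` by 2π/3, which is why the phase carries frame information.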

54 - Clustering and Averaging of Images in Single Particle Analysis
Kiyoshi Asai, Yutaka Ueno, Chikara Sato, Electrotechnical Laboratories; Katsutoshi Takahashi, Japan Advanced Institute of Science and Technology Hokuriku
We have been developing a single particle analysis system that estimates 3-D structures from randomly oriented electron-microscopic images. Iterative alignment and bottom-up algorithms, backed by large computational power, are the keys to robust clustering that requires no manually designed reference.

55 - The NOESY Jigsaw: Automated Protein Secondary Structure and Main-chain Assignment from Sparse, Unassigned NMR Data
Chris Bailey-Kellogg, Alik Widge, John J. Kelley, III, Marcelo J. Berardi, John H. Bushweller, and Bruce Randall Donald, Dartmouth College
We present a novel algorithm for high-throughput automated assignment of nuclear magnetic resonance spectra. Our approach identifies secondary structure patterns in unassigned data, and then aligns identified elements against the primary sequence. By deferring the traditional assignment bottleneck, our approach achieves fast, reasonably accurate results using only four spectra.

56 - Improved Techniques for Finding Spots on DNA Microarrays
Jeremy Buhler, Trey Ideker, David Haynor, University of Washington
Dapple is a program for finding and quantitating spots on fluorescent cDNA microarrays. Dapple's spot finder exploits consistent spot morphology to improve its robustness to image artifacts and variations in spot size and placement. The finder can be trained on manually classified examples to identify poor-quality and incorrectly found spots.

57 - Natural Language Processing for Remote Homology Detection
Jeffrey Chang, Russ Altman, Stanford University
Biology is an ideal field for the application of natural language processing. The enormous amount of literature being generated exceeds the capacity of humans to interpret. Thus, we are working on methods to improve automated remote homology detection of protein sequences using unstructured text information available in Medline abstracts.

58 - Comparison of Discrimination Methods for the Classification of Tumors Using Gene Expression Data
Jane Fridlyand, University of California, Berkeley; Sandrine Dudoit, Mathematical Sciences Research Institute, Berkeley; Terence P. Speed, University of California, Berkeley
We compare the performance of different discrimination methods for the classification of tumors based on gene expression profiles. These methods include traditional approaches as well as machine learning techniques. The methods are applied to three recently published datasets: the leukemia (ALL/AML) dataset of Golub et al. (1999), the lymphoma dataset of Alizadeh et al. (2000), and the 60 cancer cell line (NCI 60) dataset of Ross et al. (2000).

59 - Optimal Sequencing by Hybridization in Rounds
Alan M. Frieze, Bjarni Halldorsson, Carnegie Mellon University
We present an algorithm that assumes sequencing-by-hybridization (SBH) chips can be constructed interactively, so that the results of one hybridization experiment can be used to construct the next SBH chip. We present experimental results as well as an algorithmic analysis showing that the algorithm is optimal in an information-theoretic sense.

Poster 60 withdrawn by author.

61 - Development of a Modular Gene Index for the Rational Design of cDNA Microarray Probes
Jason Goncalves, James R. Woodgett, A. Jamie Cuticchia, University of Toronto
We describe modular clustering, a novel method to organize nucleotide sequence information for splice isoform discovery and the rational design of cDNA microarrays. Based on this work, PCR primers will be generated to amplify microarray probes for unique splice variants of unique genes.

62 - New Learning Method to Improve Protein Identification from Peptide Mass Fingerprinting
Robin Gras, Elisabeth Gasteiger, Swiss Institute of Bioinformatics; Bastien Chopard, Department of Computer Science, Switzerland; Markus Müller, Ron D. Appel, Swiss Institute of Bioinformatics
We developed an algorithm to identify proteins by peptide mass fingerprinting. The masses and environmental data are used to search a sequence database. A score ranks the candidates according to how well they match the environmental parameters. The weights of the parameters in the score are computed using a genetic algorithm.

63 - Improving Base Calling Accuracy by Peak Space Equalization
Hong Guo, Mark Welsh, Steve Gold, CuraGen Corporation
DOLPHIN is a trace processor that manipulates raw electrophoretic traces for PHRED basecalling. The peak density for a trace follows an exponential decay, yet PHRED requires evenly spaced peaks to perform basecalling. An algorithm has been developed to equalize the peak spacing. It is shown to improve PHRED scores dramatically.
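
One simple way to picture peak space equalization is to resample the trace so that consecutive peaks land a fixed number of samples apart. The sketch below does this with piecewise-linear interpolation. It is an illustrative assumption, not DOLPHIN's actual algorithm; the function name, the `spacing` parameter, and the premise that peak positions are already known (sorted integer sample indices) are all hypothetical.

```python
def equalize_peak_spacing(trace, peaks, spacing=12):
    """Resample `trace` so consecutive peaks end up `spacing` samples apart.

    `trace` is a list of signal samples; `peaks` is a sorted list of integer
    indices into `trace`. Only the region between the first and last peak is
    kept. Each inter-peak segment is stretched or compressed to `spacing`
    samples using linear interpolation.
    """
    if len(peaks) < 2:
        return list(trace)
    out = []
    for a, b in zip(peaks, peaks[1:]):
        for s in range(spacing):
            pos = a + (b - a) * s / spacing      # fractional position in the input
            lo = int(pos)
            hi = min(lo + 1, len(trace) - 1)
            frac = pos - lo
            out.append(trace[lo] * (1 - frac) + trace[hi] * frac)
    out.append(trace[peaks[-1]])                 # close the final segment
    return out
```

After resampling, the k-th peak sits at output index `k * spacing`, regardless of how the original spacing drifted along the trace.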

64 - Mol4D: A Web-based Application for the Visualization and Animation of Proteins
Jakob Halaska, Stockholm University
Mol4D provides a shortcut to generate molecular visualizations and animations of proteins, ranging from static to simulated structures, while connecting many levels of information into a single multidimensional visualization. We describe this graphical approach of spatiovisual mapping: how biological information originating from multiple sources can be visualized, correlated, and presented in a 3-D or 4-D environment. http://www.biokemi.su.se/Mol4D

65 - An Improved Method for Prediction of Translation Initiation Sites (TIS) in Human cDNAs and ESTs
Artemis G. Hatzigeorgiou, Synaptic Ltd., Greece
The presented method is based on statistics and artificial neural networks. It consists of two modules: one sensitive to the conserved motif before the TIS and one sensitive to the coding/non-coding potential around the TIS. These predictions are integrated in an algorithm that simulates, in a simplified form, the ribosome scanning model. It leads to 95% prediction of TIS on cDNAs and more than 50% on ESTs.

66 - Application of Parallel Techniques for Likelihood Estimation in Linkage Analysis
Ina Koch, Klaus Rohde, Jens G. Reich, Max-Delbrueck-Center for Molecular Medicine, Germany
We implemented two parallel versions of the adaptive simulated annealing (ASA) algorithm. For problems with sufficiently complicated optimization functions, the parallelized versions show a significant improvement in runtime.

67 - Time-frequency Analysis of Protein NMR Data
Christopher James Langmead, Bruce Randall Donald, Dartmouth College
Time-frequency analysis of NMR data exposes behavior orthogonal to the magnetic coherence transfer pathways, thus affording new avenues of NMR discovery. In particular, we demonstrate the heretofore unknown presence of inter-atomic distance information within ^{15}N-edited heteronuclear single-quantum coherence (^{15}N HSQC) data.

68 - Machine Learning Approaches to Basecaller Calibration
Michael Molla, The University of Wisconsin, Madison
"Basecalling" turns a sequencing reaction into "A"s, "G"s, "T"s and "C"s. Sequencing machines already do most of this work for us. However, whenever a new sequencing machine is developed, painstaking expert calibration is required. This is a study of the effectiveness of various machine learning techniques in solving this problem.

69 - APES: Automated Processing and Extraction of Sequences
Jurgen Pletinckx, Algonomics N.V., Belgium; Antoine Janssen; Jan van Oeveren, Keygene N.V., The Netherlands; Philippe Stas; Ignace Lasters, Algonomics N.V.; René van Schaik, Keygene N.V.
APES extracts sound nucleotide sequences from tracing files. Using modified alignment algorithms, flanking nucleotide sequences are localised. Fragments are trimmed and assembled. A Web interface graphically displays sequences, their quality, the location of flanking sequences, and downloadable sequence files (FASTA). APES output is prepared for use by automated annotation tools.

Poster 70 withdrawn by author.

Poster 71 withdrawn.

72 - Contextual Biomedical Image Recognition: Application to White Blood Cell Differentiation
Xubo (Beth) Song, Oregon Graduate Institute of Science and Technology
We present a novel system for automatic recognition of biomedical images, including white blood cell images and microscopic urinalysis images. The system consists of three major steps: image processing and feature extraction, pre-classification based on artificial neural networks, and refined classification by incorporating contextual information.

73 - Automated Image Analysis for Hybridisation Experiments
Matthias Steinfath, Wasco Wruck, Max-Planck-Institute for Molecular Genetics; Henrik Seidel, Schering AG; Hans Lehrach, Uwe Radelof, Max-Planck-Institute for Molecular Genetics; John O'Brien, Royal College of Surgeons in Ireland
An image analysis procedure for hybridisation experiments is presented. The procedure is fully automated, in contrast to semiautomated programs that require user interaction. Two different kinds of experimental data have been analysed: oligonucleotide fingerprinting data and gene expression data. We successfully applied our image analysis software to several hundred images.

74 - Missing Value Estimation Methods for DNA Microarrays
Olga Troyanskaya, Michael Cantor, Orly Alter, Gavin Sherlock, Pat Brown, David Botstein, Robert Tibshirani, Trevor Hastie, Russ Altman, Stanford University
We present a comparative study of several missing value estimation methods in gene microarray data. A Singular Value Decomposition based method, weighted K-nearest neighbors, and row averaging were evaluated. We report results of the comparative experiments, providing recommendations for accurate estimation of missing microarray data under multiple sets of conditions.
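
To make two of the compared estimators concrete, here is a minimal sketch of row averaging and weighted K-nearest-neighbor imputation on a gene-by-array matrix, with `None` marking a missing value. The function names, the root-mean-square distance over shared columns, and the inverse-distance weighting are assumptions chosen for illustration; the abstract does not specify these details.

```python
import math

def row_average_impute(matrix):
    """Fill each missing entry (None) with the mean of the observed values in its row."""
    filled = []
    for row in matrix:
        observed = [v for v in row if v is not None]
        mean = sum(observed) / len(observed) if observed else 0.0
        filled.append([mean if v is None else v for v in row])
    return filled

def knn_impute(matrix, k=2):
    """Fill each missing entry from the k most similar rows that observe that column."""
    filled = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(matrix):
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                # Root-mean-square distance over the columns both rows observe.
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            nearest = sorted(candidates, key=lambda c: c[0])[:k]
            if nearest:  # entries with no usable neighbor stay None
                weights = [1.0 / (d + 1e-9) for d, _ in nearest]
                filled[i][j] = (sum(w * v for w, (_, v) in zip(weights, nearest))
                                / sum(weights))
    return filled
```

Row averaging ignores the other genes entirely, whereas the neighbor-based estimate borrows values from genes with similar expression profiles, so the two can give quite different answers for the same entry.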

75 - Simulation and Characterization of Hybridization Signals in Oligonucleotide Fingerprinting Using Linear Models
Christoph K. Wierling, Michael Janitz, Ralf Herwig, Max-Planck Institut für Molekulare Genetik; Stefan Haas, Theoretische Bioinformatik; Hans Lehrach, Uwe Radelof, Max-Planck Institut für Molekulare Genetik
We present a simulation tool for oligonucleotide fingerprinting, written in the object-oriented programming language Python. Computation of hybridization signals is based on a linear model that takes single mismatches into account. Hybridization characteristics of oligonucleotides used in our experiments with human I.M.A.G.E. clones were derived. Reliability was tested by corresponding simulations.