Prediction of snoRNAs in the Human Genome

Sagara Jun-Ichi1, Asai Kiyoshi2, Nakamura Shugo, Kenmochi Naoya
1jun@ni.aist.go.jp, CBRC; 2, CBRC

Small nucleolar RNAs (snoRNAs) take part in the processing and base modification (2'-O-ribose methylation and pseudouridylation) of precursor ribosomal RNA (pre-rRNA). The box C/D snoRNAs, which direct 2'-O-ribose methylation, and the box H/ACA snoRNAs, which direct pseudouridylation, contain characteristic sequence motifs and are complementary to conserved sequences in mature rRNAs. In eukaryotes, certain snoRNAs are transcribed from typical promoters. In vertebrates, the majority are encoded in introns of protein-coding genes and are released by exonucleolytic cleavage of linearized intron lariats. By contrast, in yeast, nonintronic snoRNA gene clusters are transcribed as polycistronic pre-snoRNA transcripts from which individual snoRNAs are processed.
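As an illustration of how the characteristic motifs and rRNA complementarity of box C/D snoRNAs can be turned into a simple screen, the following is a minimal sketch. It assumes the commonly cited consensus motifs (box C: RUGAUGA, box D: CUGA), strict Watson-Crick pairing, and hypothetical length bounds and guide length; the function names are illustrative and this is not the method used in this work.

```python
import re

# Commonly cited box C/D consensus motifs, written in the DNA alphabet:
# box C = RUGAUGA -> [AG]TGATGA, box D = CUGA -> CTGA.
BOX_C = re.compile(r"[AG]TGATGA")
BOX_D = re.compile(r"CTGA")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def normalize(seq):
    """Upper-case the sequence and map RNA 'U' to DNA 'T'."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def find_cd_candidates(seq, min_len=60, max_len=120):
    """Return (start, end) spans running from a box C to a downstream box D.

    min_len and max_len are hypothetical bounds chosen for illustration.
    """
    seq = normalize(seq)
    hits = []
    for c in BOX_C.finditer(seq):
        # Only look for box D downstream of this box C.
        for d in BOX_D.finditer(seq, c.end()):
            length = d.end() - c.start()
            if length > max_len:
                break
            if length >= min_len:
                hits.append((c.start(), d.end()))
    return hits


def guide_matches_rrna(seq, hit, rrna, guide_len=10):
    """Check whether the guide_len bases just upstream of box D are
    exactly complementary to some stretch of the rRNA sequence.

    G-U wobble pairs and mismatches are ignored in this sketch.
    """
    seq, rrna = normalize(seq), normalize(rrna)
    box_d_start = hit[1] - 4  # box D (CTGA) is 4 bases long
    guide = seq[box_d_start - guide_len:box_d_start]
    return reverse_complement(guide) in rrna
```

A candidate that passes both checks is only a motif-level hit; a real screen would also score terminal stem pairing and allow imperfect complementarity.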

To date, Snoscan*, a computational screening method developed by T. Lowe et al., has used conserved elements of sequence and structure to identify snoRNAs. The method combines probabilistic modeling techniques, namely stochastic context-free grammars (SCFGs) and hidden Markov models (HMMs). Using that approach, they produced an integrated model of snoRNAs based on the sequence features specific to this RNA gene family.
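To make the idea of scoring a sequence with a probabilistic model concrete, the following is a minimal sketch of the forward algorithm for a generic HMM. It is not Snoscan's model: the two states and all probabilities in the toy parameters are hypothetical values chosen only for illustration.

```python
import math


def forward_log_likelihood(seq, start, trans, emit):
    """Log-likelihood of seq under an HMM, via the scaled forward algorithm.

    start[s]    : probability of starting in state s
    trans[r][s] : probability of moving from state r to state s
    emit[s][x]  : probability that state s emits symbol x
    Assumes every symbol has non-zero probability in at least one reachable state.
    """
    if not seq:
        return 0.0
    states = list(start)
    # alpha[s] is proportional to P(symbols so far, current state = s).
    alpha = {s: start[s] * emit[s][seq[0]] for s in states}
    log_scale = 0.0
    for sym in seq[1:]:
        # Rescale at every step to avoid numerical underflow on long sequences.
        total = sum(alpha.values())
        log_scale += math.log(total)
        alpha = {
            s: emit[s][sym] * sum(alpha[r] / total * trans[r][s] for r in states)
            for s in states
        }
    return log_scale + math.log(sum(alpha.values()))


# Hypothetical toy model: a uniform "background" state and a "motif" state
# that favours T, G and A.
TOY_START = {"background": 0.9, "motif": 0.1}
TOY_TRANS = {
    "background": {"background": 0.9, "motif": 0.1},
    "motif": {"background": 0.2, "motif": 0.8},
}
TOY_EMIT = {
    "background": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "motif": {"A": 0.3, "C": 0.05, "G": 0.3, "T": 0.35},
}

score = forward_log_likelihood("TGATGA", TOY_START, TOY_TRANS, TOY_EMIT)
```

Comparing such a score against the log-likelihood under a background-only model gives a log-odds value that can be thresholded to rank candidates.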

In this research, we predict snoRNAs in the human genome using several sequence analysis methods, including statistical methods and probabilistic models. We have also developed a Predicted Human Intron database, built from exons predicted by Gene Decoder**, a gene-finding technology based on HMMs. We present the snoRNA prediction results and the human intron database in the ISMB 2003 poster session.
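Deriving introns from predicted exons amounts to taking the gaps between consecutive exons of a gene. The following is a minimal sketch of that step, assuming exons are given as 0-based half-open (start, end) coordinates on one strand of a single gene; this representation and the function name are assumptions for illustration and do not reflect the actual Gene Decoder output format.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open: [start, end)


def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Return the gaps between consecutive exons of one gene as introns.

    Exons may be given in any order; overlapping or adjacent exons
    produce no intron between them.
    """
    ordered = sorted(exons)
    introns: List[Interval] = []
    if not ordered:
        return introns
    prev_end = ordered[0][1]
    for start, end in ordered[1:]:
        if start > prev_end:
            introns.append((prev_end, start))
        # Track the furthest exon end seen so far to handle overlaps.
        prev_end = max(prev_end, end)
    return introns


# Hypothetical exon coordinates for a three-exon gene.
example_introns = introns_from_exons([(100, 200), (500, 650), (300, 400)])
```

Each resulting interval can then be extracted from the genomic sequence and passed to a snoRNA screen.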


To whom correspondence should be addressed. E-mail: jun@ni.aist.go.jp
*http://rna.wustl.edu/snoRNAdb/
**http://genedecoder.cbrc.jp/