We have been working to establish a comprehensive collection of mouse full-length cDNA clones and an accompanying sequence database, named the RIKEN Mouse Genome Encyclopedia, that covers as many genes as possible. More recently, we have been constructing a higher-level annotation (Functional ANnoTation Of Mouse cDNA; FANTOM) that draws not only on homology searches but also on expression profiles, mapping information and protein-protein interaction data. More than 1,916,592 clones prepared from 267 tissues were end-sequenced and classified into 171,144 clusters, and 60,770 representative clones were fully sequenced. These 60,770 sequences contained 33,409 unique sequences, including more than 18,415 clear protein-encoding genes, and contribute 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding messages and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome.
In this report, we propose a new concept, the "Transcriptional Unit", which comprises both protein-coding and non-coding messages. This term is more computationally defined and less equivocal than "gene" or "locus", whose ambiguous definitions cause confusion. In this analysis, more than 2,000 sense-antisense transcript pairs were discovered, and 41% of all transcripts were found to be alternatively spliced.
The next generation of life science will clearly build on genome-wide information and resources. Based on our cDNA clones, we developed additional systems to explore gene function: a cDNA microarray system on which all of these cDNA clones are printed, a protein-protein interaction screening system, a protein-DNA interaction screening system and so on. The integrated database of all this information is very useful not only for analyzing gene transcriptional networks but also for connecting genes to phenotypes, which facilitates the positional candidate approach. In this talk, the prospects for applying these genome resources will be discussed.
More information is available at the web page: http://genome.gsc.riken.go.jp/