ISMB/ECCB 2013 features two half-day tutorial sessions on Saturday, July 20, 2013, one day prior to the conference scientific program. Tutorials are held on the same day as the second day of the SIGs, Satellite and Junior Principal Investigator meetings.
Computational mass spectrometry-based proteomics
Jürgen Cox
Max Planck Institute of Biochemistry
Martinsried, Germany
Recent revolutionary advances in high-accuracy mass spectrometry-based proteomics are providing a new basis for data-driven systems biology. Comprehensive quantification of whole proteomes is becoming feasible but, because of the inherently high data volumes, places high demands on data processing. We describe algorithms and workflows covering the whole of mass spectrometry data analysis: from intelligent data-driven acquisition, through algorithms for the identification and quantification of proteins, to the statistical analysis of the final expression data for proteins and posttranslational modifications in the context of other omics and pathway data. The focus is on the MaxQuant software: its usage, the theory behind it and its application to complex experimental designs relevant for systems biology. The tutorial is divided into three sections, supplemented by practical exercises that participants can work through on their own laptops as time permits or take home. The outline is as follows:
Cox, J. and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature Biotechnology 26, 1367-1372.
Cox, J. and Mann, M. (2011). Quantitative, high resolution proteomics for data-driven systems biology. Annual Review of Biochemistry 80, 273-299.
Hein, M.Y., Sharma, K., Cox, J. and Mann, M. (2013). Proteomic analysis of cellular systems. In Handbook of Systems Biology, eds. Walhout, Vidal, Dekker. Academic Press.
Note: Please bring your own computer
Encode data access (no controversy)
Osvaldo Graña and David G. Pisano
Bioinformatics Unit, Structural Biology and Biocomputing Programme
Spanish National Cancer Research Centre (CNIO), Madrid
The Encode project was devised to characterize all the functional elements in the human genome. It started with a pilot phase in 2003, in which the consortium researchers studied 1% of the genome. In 2007 they expanded this search to the full genome in a second project called the production phase. In September 2012 the consortium published all the results obtained from this second phase in a series of 30 papers in Nature, Genome Research and Genome Biology. The investigations performed by the Encode consortium involved not only the human genome, but also the mouse, fruit fly and worm genomes. The last two were the focus of the modEncode subproject.
Encode took advantage of the latest sequencing technologies to perform global studies across the genome, including the creation of the most complete annotation set of genes and gene isoforms (coding genes, ncRNAs and pseudogenes) released to date. The experiments also measured transcriptional levels of genes in a strand-specific manner; revealed the structure and remodeling of chromatin based on chemical modifications in histones and the presence of DNase I hypersensitive sites; characterized the locations and nature of thousands of binding sites for dozens of transcription factors; and determined the targets of certain RNA-binding proteins, DNA methylation levels, copy number variation of DNA segments, transcription start sites and the subcellular localization of RNAs, among many other findings. All of these experiments were performed on a battery of more than 147 cell types, divided into three tiers.
While the Encode investigations were taking place, huge amounts of data were continuously generated. The entity designated as the data coordination center for human and mouse data was the UCSC Genome Browser. This tutorial focuses on accessing and analyzing the information generated by Encode. We will show how the information is structured, what can be done online, and how to download the data to analyze it locally. Attendees will have the opportunity to work with bulk data from the different experiments, select pieces of Encode data from specific regions in the genome, check Encode annotations for particular regions of interest, transfer Encode annotations from one genome assembly to another, and learn how to analyze data from their own experiments jointly with Encode information. We will review some of the methods used to analyze the data in the projects, and attendees will have the opportunity to analyze the project data or their own data with the same tools.
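As an illustration of selecting data from a specific genomic region, the following is a minimal Python sketch. It assumes the downloaded data is in BED-style tab-separated interval format (chromosome, 0-based start, exclusive end, optional name). The records, the names `Interval`, `parse_bed_lines` and `select_region`, and the query region are all hypothetical and are not taken from the tutorial material or from any Encode file.

```python
from typing import Iterable, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive
    name: str


def parse_bed_lines(lines: Iterable[str]) -> List[Interval]:
    """Parse BED-style lines, skipping blank, comment, track and browser lines."""
    intervals = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else "."
        intervals.append(Interval(fields[0], int(fields[1]), int(fields[2]), name))
    return intervals


def select_region(intervals: Iterable[Interval], chrom: str,
                  start: int, end: int) -> List[Interval]:
    """Return intervals on `chrom` that overlap the half-open range [start, end)."""
    return [iv for iv in intervals
            if iv.chrom == chrom and iv.start < end and iv.end > start]


# Made-up records standing in for a downloaded annotation file.
EXAMPLE_BED = [
    "track name=example",
    "chr1\t100\t200\tpeak_a",
    "chr1\t500\t650\tpeak_b",
    "chr2\t150\t300\tpeak_c",
]

hits = select_region(parse_bed_lines(EXAMPLE_BED), "chr1", 150, 520)
print([iv.name for iv in hits])  # ['peak_a', 'peak_b']
```

A linear scan like this is adequate for small files; for genome-wide bulk data an indexed or sorted approach would likely be preferable.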
Note: Please bring your own computer