ISMB2010 - Workshops

Workshops

Workshop 1: Amazon Cloud Computing
Workshop 2: Bioinformatics Core Facilities
Workshop 3: Integrative Genomic Analysis of TCGA Data
Workshop 4: Evidence Codes: Assay Tracking, ECO and OBI
Workshop 5: Where and How to Get Published

Workshop 1: Amazon Cloud Computing

Organizer(s): Chris Dagdigian, Principal Consultant - BioTeam Inc.

http://www.bioteam.net

Date: Sunday July 11, 2010

Start Time: 10:45 a.m. - 12:40 p.m.

Room: 311

Block 1: Amazon Web Services Overview

Block 2: Informatics in the cloud: the good, bad & ugly

Block 3: Best Practices & Lessons Learned

Block 4: Live pipeline/workflow demo

Amazon Web Services ("AWS") is the current leader in the segment of cloud computing focused on providing infrastructure as a service ("IaaS"). Based on our own consulting work and training requests, we know that this is an area of great interest for life science organizations. We have also spent years working on AWS and know quite well what works and what does not. We have many lessons learned, war stories, and real-world anecdotes to share. The main focus will be on explaining the Amazon framework and then moving on to a discussion of which applications and codes lend themselves well to the cloud. We will also discuss the problems and the cases in which it does not make sense to use the cloud. Finally, we will end with a live demo showing a real scientific workflow operating in Amazon and provide details on how it was designed, deployed, and operated.

Workshop 2: Bioinformatics Core Facilities

Date: Sunday July 11, 2010

Start Time: 2:30 p.m. - 4:25 p.m.

Room: 311

Topic 1. Analysis of Large Data sets

This topic will explore practical aspects of analyzing large data sets, including quality control, study design, and integration of disparate data sets.

Moderator – David Sexton, Vanderbilt University, USA

Speaker 1 - Dawei Lin, UC Davis, USA

Best practices for large data set analysis (SNP calling, mRNA-seq, ChIP-Seq): for example, quality control, data trimming and filtering, and parameter selection.

Impact of Study Designs on the Relative Efficiency and Error Rates of Next Generation Sequencing Studies

Speaker 2 - Vered Caspi, Ben-Gurion University of the Negev, Israel

Integrating data and meta-analysis of publicly available expression array data, looking for shared affected pathways (or any other biological themes) among different experimental conditions

After a 5 minute break, there will be 25 minutes of discussion.

Topic 2. Technical - Managing large data sets in core facilities

This topic will discuss issues confronting core facilities related to data collection and handling, and what data should be stored.

Moderator – Simon Andrews, Babraham Institute, Cambridge, UK

Speaker 1 - Mario Caccamo, The Genome Analysis Centre, Norwich, UK

Developing a software solution to set up a data handling pipeline.

Speaker 2 - Hemant Kelkar, University of North Carolina, USA

Best practices in handling data being generated in a core sequencing facility.

After a 5 minute break, there will be 25 minutes of discussion.

Workshop 3: Integrative Genomic Analysis of TCGA Data

Date: Monday July 12, 2010

Start Time: 10:45 a.m. - 12:40 p.m.

Room: 311

Speakers:

10:45 Data session: Derek Chiang (UNC) and Gad Getz (Broad)

11:15 Ovarian and Endometrial: Doug Levine (MSKCC)

11:45 Lung: Matthew Meyerson (Broad)

12:15 Pancreas and Kidney: Paul Spellman (LBL)

Outline:

We propose to hold a workshop to address difficulties in the field of integrative cancer genome analysis with regard to large-scale systematic data sets. We will frame this conversation around data from The Cancer Genome Atlas (TCGA) and provide assistance to the computational biologists attending ISMB in their efforts to use TCGA data. The workshop will consist of four 20-minute sessions, each with five minutes for questions and answers. One session will cover TCGA data types and access, and the remaining three will cover separate disease areas, with a focus on what is known and what key questions remain to be answered from the data from both biological and clinical perspectives. We plan for the disease sessions to cover Ovarian Serous Carcinoma, Endometrial Carcinoma, Pancreatic Adenocarcinoma, Renal Carcinoma, and Lung Carcinoma. The ultimate purpose in holding this workshop is to recruit outside investigators into the analysis of TCGA data and to facilitate these investigations by highlighting what is presently known and, more importantly, what is not known and remains to be understood.

Workshop 4: Evidence Codes: Assay Tracking, ECO and OBI

Date: Monday July 12, 2010

Start Time: 2:30 p.m. - 4:25 p.m.

Room: 311

Moderator(s): Judith Blake, Michelle Giglio, Suzanna Lewis

Overview and Introduction:

Judith Blake, The Jackson Laboratory, Bar Harbor, ME, USA

Current Status, Evidence Code Ontology, including its intersection with OBI [30 min]

Michelle Giglio, University of Maryland, MD, USA

Moderated Discussion about Evidence Codes, ECO and OBI [60 min]

Suzanna Lewis, Lawrence Berkeley Laboratory, University of California, Berkeley, CA, USA

Summation and Future Directions [20 min]

Program Overview:

This workshop will engage the community of developers and users of evidence codes: that is to say, the terms assigned as a method to alert biological data users as to the type of evidence that forms the basis for an assertion about a given gene product. Assertions about a gene product can be of many varieties including the cellular location of the gene product, its molecular function, its secondary structure or its participation in cellular or multicellular processes. Concurrently, the methods used to develop the assertions can also be of many varieties.

The GO Consortium brought into widespread favor the use of evidence codes such as ‘Inferred from Direct Assay’ or ‘IDA’ to state the type of experimental assay that produced the evidence for the assertion that a given gene product engaged in a specified molecular functionality. Although the GO community uses a small number of evidence codes, other curation groups have worked to extend the evidence code list into an ‘Evidence Code Ontology’, or ‘ECO’, that has grown to be quite extensive.

We will provide updates on the status of the ECO, and we will seek input on the future of this resource. Important topics include the relationship of the ECO with the Ontology for Biomedical Investigations (OBI) effort and the resolution of areas of intersection between these two projects. OBI is developing an integrated ontology for the description of biological and clinical investigations. The ECO and OBI groups have been engaged in informal discussions about the boundaries and overlaps of these two efforts and how the two groups can create more synergy. In addition, we will open discussion of emerging efforts to incorporate evidence codes as a component of annotation evaluation and quality control. The Workshop forum at ISMB provides an opportunity for the many interested participants to engage in community assessment and planning for ECO. Both ECO and OBI are members of the Open Biomedical Ontologies community.

Workshop 5: Where and How to Get Published

Date: Tuesday July 13, 2010

Start Time: 10:45 a.m. - 12:40 p.m.

Room: 311

Moderator(s): Barb Bryant, Deputy Editor, PLoS Computational Biology and ISCB Publications Committee member.

The Workshop, aimed primarily at ISMB attendees who are in the early stages of their publishing career, will provide counsel and advice from a variety of speakers as well as best practices when preparing and submitting work to some of the most popular journals in our field.

Part I is devoted to general guidelines for writing a good paper and presenting one’s work, using the very popular “Ten Simple Rules” article on the same topic as the launching point (www.ploscompbiol.org/doi/pcbi.0010057). The session will be a mix of practical instruction and interactive discussion led by Philip E. Bourne, Editor in Chief & Founding Editor, and Steven E. Brenner, Founding Editor from PLoS Computational Biology.

Break

Part II will be a panel discussion involving current journal editors, accomplished authors, and first-time submitters, who will provide their perspectives on what journals want and expect. Other topics will include the authorship experience, why (and when) you should publish, how to select the right journal for your work, and the changing post-publication measures of impact. Confirmed speakers include:

Current journal editors: Gary Benson from Nucleic Acids Research, Andrej Sali from Structure, and Alfonso Valencia and Alex Bateman from Bioinformatics

Experienced and accomplished authors: Mark Gerstein and Chris Sander

First-time/prospective authors: two members of the ISCB Student Council

As in Part I, Part II will include brief (10-minute) presentations by the speakers from each category (editors, experienced authors, and new/prospective authors) followed by a 25-minute interactive session involving speakers and attendees as much as possible.

How to write a good scientific research article and select the best journal for your work is not traditionally taught to graduate students and postdoctoral fellows. This Workshop brings together experts who have been through the experience on many occasions as authors and who have responsibility for selecting papers for the major journals in our field. At the conclusion of the Workshop, the attendees should have a clear idea of the elements of a good paper, regardless of the journal, and an understanding of the scope and submission requirements of key journals and what they are looking for. The content and discussion in Part II will also appeal to more seasoned scientists seeking to publish their work.