Lunch and Learn Workshop

ISMB introduces Lunch and Learn Workshops. These luncheon events are hosted by select conference sponsors. Each workshop is 75 minutes long and includes a hosted lunch. Participants must pre-register with the workshop hosts. Instructions on how to participate are included below.

 

The Appistry Pipeline Challenge: Rewarding Researchers for Translating NGS Data into Clinical Action

Date: Sunday, July 13, 12:45 p.m. - 2:00 p.m.
Room: 306

 

Presenter: Brett McCann, Director of Services, Appistry, Inc.
Many of the easiest problems in NGS analysis have already been solved, which leaves those problems that may require creative use of bioinformatics tools and high-performance computing expertise to solve. The solution may be usable in an individual lab, but for clinical use, pipelines must be production-grade and capable of being used repetitively and reliably on different types of infrastructure. In this session, learn about the Appistry Pipeline Challenge, a competition running from July 7–August 15 that will reward and support one winning proposal for a creative pipeline that will make a difference in clinical research and precision medicine. The winner will receive a complete NGS analysis package valued at $70,000 including bioinformatics tools for variant calling and somatic mutation analysis, software and hardware for developing and executing pipelines at scale, and a year’s worth of support to help researchers turn their ideas into functional, production-grade pipelines. Attend this session to learn more about the competition, the prize package, and how to enter your project.

Special Talks - ISMB 2014



ST01: Nobel Prize Celebration: Arieh Warshel's Legacy - Presented by Lynn Kamerlin

Presenter: Lynn Kamerlin

Room: 302

Date/Time: Sunday, July 13, 11:30 a.m. - 11:55 a.m.

Session Chair: Bonnie Berger

 

The advent of the first enzyme structures in the 1960s, coupled with the increasing computer power of the time, marked a turning point for computational enzymology. Specifically, starting in 1970, a number of different QM+MM and QM/MM approaches were introduced by Warshel and coworkers to facilitate the description of reactions in enzymes. These approaches, together with molecular dynamics simulations of biological reactions (which also started with Warshel’s work) and the development of classical force fields, mark the emergence of multiscale models for chemical reactivity that allowed us to begin translating structural information directly into an energetic picture and so better understand enzyme function. In my view, the most effective direction for addressing this problem has been Warshel’s 1980s “empirical valence bond” (EVB) approach. Despite its seeming theoretical simplicity, the empirical valence bond approach remains one of the most powerful tools for understanding chemical reactivity in biological systems even today. This talk will explore the theoretical basis and historical background of this approach and illustrate its application to a number of the most challenging problems in computational enzymology. Additionally, the unimaginable gains in computational power of recent decades have allowed ever more complex systems to be addressed. Therefore, this talk will conclude by discussing the power of the EVB approach to address 21st-century challenges such as enzyme design, understanding protein evolution, and chemical reactivity in very large biomolecular systems, such as GTP hydrolysis on the ribosome.
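For readers unfamiliar with the idea, the following is a minimal sketch of a two-state valence bond model, not material from the talk itself. It assumes the simplest textbook form, in which the ground-state energy is the lower eigenvalue of a 2x2 Hamiltonian built from two diabatic state energies and an off-diagonal coupling. The function name, parameter names, and numbers are hypothetical and chosen only for illustration.

```python
import math

def evb_ground_state_energy(e1, e2, h12):
    """Lower eigenvalue of the symmetric 2x2 matrix [[e1, h12], [h12, e2]].

    e1, e2 : energies of the two diabatic (reactant-like, product-like) states
    h12    : off-diagonal coupling between them
    All values are in the same (arbitrary) energy unit.
    """
    mean = 0.5 * (e1 + e2)
    half_gap = 0.5 * (e1 - e2)
    return mean - math.sqrt(half_gap ** 2 + h12 ** 2)

# Illustrative numbers only: two degenerate states mixed by a coupling of 10
# are stabilized by exactly that coupling.
print(evb_ground_state_energy(0.0, 0.0, 10.0))
```

With zero coupling the function simply returns the lower of the two state energies, which is a convenient sanity check.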

 

ST02: Nobel Prize Celebration: A personal perspective on Martin Karplus's lasting influence on my career in science

Presenter: Roland Dunbrack

Room: 302

Date/Time: Sunday, July 13, 12:00 p.m. - 12:25 p.m.

Session Chair: Bonnie Berger

 

In September of 1981, when I was a freshman at Harvard College, my first class on my first day was Martin’s Chemistry 10 course. Over the next four months, Martin proceeded to teach the outline of his influential textbook Atoms and Molecules to freshmen in what was the upper-level introductory freshman chemistry course (the other was Chemistry 5, taught by Leonard Nash, from whom Martin had taken his own introductory chemistry course in 1947). Martin started with the quantum mechanical model of a particle in a box, then one-electron atoms, two-electron atoms, many-electron atoms, the hydrogen molecule, other diatomics, triatomics, and the H+H2 reaction. In the last lecture of the course, he showed a movie of one of the first molecular dynamics simulations of a protein structure, BPTI. I was astonished that this was possible, and I was hooked by the prospect of being able to understand so much of biology with theoretical and empirical chemistry and physics. I was able to work for him on quantum mechanical treatments of polyenes as an undergraduate. After graduating from Harvard in 1985 and finally learning some biochemistry at Cambridge, I returned to Harvard for a PhD in biophysics split between Martin and Jack Strominger in the Biochemistry Department. My work in grad school and ever since has been on the statistical end of things, but the motivation, as was often the case in Martin’s work, was to solve a biological problem: initially, the structure prediction of the many variants of HLA Class I proteins, whose first structure was solved in 1987 by Don Wiley and Jack Strominger. I proposed and developed the backbone-dependent rotamer library as a statistical way of solving the side-chain conformation prediction problem for HLA proteins and proteins in general. It remains a central component of many if not most structure prediction and protein design programs.
From Martin I learned how important it is to interpret and understand the statistical results in terms of the underlying physical forces. Just as important, I learned how to write up our work in sufficient detail that it can be replicated. I can still hear his voice in my head when I am writing papers, asking me to fill in some important detail to make everything crystal clear and reproducible. For better or worse, I still tend to write long papers because of this. In this talk, I will review our work on statistical functions of the Ramachandran map variables – density estimates, classification functions, and finally regression functions on the Ramachandran variables. Our recent regressions of bond angles of the main chain and side chains in very high-resolution structures (better than 1.0 Å) have identified aspects of current potentials that accurately reflect the high-resolution structures and areas for further improvement – indicating that some physical properties of proteins are not yet accurately modeled by current empirical force fields used in CHARMM and other programs.
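As background for the Ramachandran map variables mentioned above: phi and psi are backbone dihedral angles, with phi defined by the atoms C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The sketch below is not the speaker's code. It shows one common way to compute a signed dihedral angle from four 3D points, and the function name and test coordinates are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    For phi, pass C(i-1), N(i), CA(i), C(i);
    for psi, pass N(i), CA(i), C(i), N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)  # unit vector along the central bond
    # Components of the two outer bonds perpendicular to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Made-up coordinates: the outer bonds are perpendicular when viewed
# along the central bond, so the magnitude of the angle is 90 degrees.
angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(abs(angle), 6))
```

Density estimates or regressions of the kind described in the abstract would take many such (phi, psi) pairs as input; that statistical layer is outside the scope of this sketch.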

Industry Posters - ISMB 2014

IP01 - Accurate Structural Variant Detection and Utilization in Comprehensive Clinical Interpretation
Scientific Area: Genetic Variation Analysis

Presenting author: Ming Li, Personalis, United States


Additional authors:
Stephen Chervitz, Personalis, United States
Daniel Newburger, Personalis, United States
Sarah Garcia, Personalis, United States
Gemma Chandratillake, Personalis, United States
Michael Clark, Personalis, United States
Nan Leng, Personalis, United States
Jason Harris, Personalis, United States
Mark Pratt, Personalis, United States
Michael Snyder, Personalis, United States
John West, Personalis, United States
Richard Chen, Personalis, United States

Presentation Overview:

 

Genomic structural variants (SVs), including inversions, translocations, deletions, and duplications, play an important role in understanding genetic disease. Accurately detecting and characterizing SVs in genomic sequence data is a challenging task. Here we present an approach that integrates orthogonal algorithms with targeted local reassembly to improve SV detection performance. We also determine the genomic context, zygosity, and exact breakpoints of the SVs when possible. Identified SVs are annotated and ranked based on biomedical relevance and predicted likelihood of causing disease using public and proprietary databases.

The performance of our SV detection approach was assessed by analyzing deletions from both simulated and experimental genome sequencing data. With simulated data at approximately 46X coverage, the sensitivity and false discovery rate (FDR) were 96.3% and 1.4%, respectively, compared to averages of 55.6% and 27.6% for the SV detection methods used independently. With experimental sequencing data for a trio, a gold standard SV set was constructed and vetted by pedigree consistency. The average sensitivity for SV detection on this data was 96.8% and the FDR was 1.4%, consistent with the results from simulation.
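The two metrics quoted above can be computed from a call set and a truth set as sketched below. This is an illustration only, not the authors' evaluation code: it assumes calls and truth entries can be compared by exact identifier, whereas real SV benchmarking typically needs breakpoint tolerance when matching. The function name and the toy identifiers are hypothetical.

```python
def evaluate_calls(called, truth):
    """Return (sensitivity, fdr) for a set of calls against a truth set.

    sensitivity = TP / (TP + FN)   fraction of true events that were called
    fdr         = FP / (TP + FP)   fraction of calls that are false
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # true positives: called and real
    fp = len(called - truth)   # false positives: called but not real
    fn = len(truth - called)   # false negatives: real but missed
    sensitivity = tp / (tp + fn) if truth else float("nan")
    fdr = fp / (tp + fp) if called else float("nan")
    return sensitivity, fdr

# Toy identifiers, not data from the poster.
truth = {"del1", "del2", "del3", "del4"}
called = {"del1", "del2", "del3", "del9"}
print(evaluate_calls(called, truth))
```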

We demonstrated the utility of our SV calls for medical interpretation by using our method to identify, annotate and prioritize SVs in samples known to harbor pathogenic SVs. Utilizing our knowledge-based ranking system for disease variant discovery, we demonstrate our ability to integrate SVs with SNVs and indels to correctly detect a known, causative compound heterozygous mutation in the ATM gene.


 

IP02 - Variant detection in tumor samples through PCR-based enrichment and Next-Generation Sequencing
Scientific Area: Bioinformatics of Disease and Treatment

Presenting author: Sivakumar Gowrisankar, Novartis Institutes for BioMedical Research, United States


Additional authors:
Zachary Zwirko, Novartis Institutes for BioMedical Research, United States
Vera Ruda, Novartis Institutes for BioMedical Research, United States
Yanqun Wang, Novartis Institutes for BioMedical Research, United States
Oleg Iartchouk, Novartis Institutes for BioMedical Research, United States

Presentation Overview:

 

High-throughput genetic profiling of tumor tissues, especially those that are formalin fixed and paraffin embedded (FFPE), is highly limited by the sensitivity and specificity of assays. This is due to a wide array of issues, including low DNA starting material, DNA degradation, and tumor heterogeneity, to name a few. Several methods have been proposed to profile mutations within tumor samples, such as targeted hybrid capture and PCR-based amplicon enrichment. Hybridization-based approaches have the drawback of requiring more input starting material and complicated workflows. Most PCR-based approaches are known to suffer from high false positive rates due to the inability to remove PCR duplicates. On the other hand, whole-genome and exome sequencing are still prohibitively expensive to employ in large-scale studies to characterize tumor samples.

Here we present a tumor profiling approach based on PCR-based amplification of selected genes followed by next-generation sequencing. We first randomly barcode PCR products by adaptor ligation, followed by PCR amplification and subsequent sequencing. This approach has the distinct advantages of requiring less DNA starting material, a simple workflow, and the ability to distinguish PCR duplicates. In addition, the uniquely barcoded reads can be used to reduce false positives. The high correlation of read distribution between paired tumor-normal or tumor-resistant tumor samples lends itself to reliable copy number variant (CNV) detection. In this poster we provide results on sensitivity, specificity, and CNV detection on 24 paired and pooled control samples to demonstrate the utility of this approach.
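The abstract does not describe how barcoded reads are processed, so the following is only a sketch of one plausible way random barcodes can separate PCR duplicates from independent molecules. It assumes each read is reduced to a hypothetical tuple of (amplicon ID, barcode, observed base at a site of interest), groups reads sharing an amplicon and barcode into a family, and takes the majority base per family.

```python
from collections import Counter, defaultdict

def collapse_by_barcode(reads):
    """Collapse PCR duplicates into one consensus base per barcode family.

    reads: iterable of (amplicon_id, barcode, base) tuples (hypothetical format).
    Returns a dict mapping (amplicon_id, barcode) to the majority base.
    """
    families = defaultdict(Counter)
    for amplicon, barcode, base in reads:
        families[(amplicon, barcode)][base] += 1
    # A base seen in only a minority of a family's reads is treated as a
    # likely PCR or sequencing error and is outvoted.
    return {key: counts.most_common(1)[0][0] for key, counts in families.items()}

# Toy reads: five reads, but only two original molecules.
reads = [
    ("A1", "AAC", "T"), ("A1", "AAC", "T"), ("A1", "AAC", "C"),
    ("A1", "GGT", "C"), ("A1", "GGT", "C"),
]
consensus = collapse_by_barcode(reads)
print(consensus)
print(Counter(consensus.values()))  # molecule-level support per base
```

Counting support per molecule rather than per read is what keeps a single amplified error from looking like a well-supported variant.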




IP03 - A publication model that aligns with the key Open Source Software principles

Presenting author: Michael Markie, F1000Research, United Kingdom
Presentation Overview:

 

In recent years, software development has had a significant impact on scientific research and continues to play a major role in facilitating advances, in the life sciences in particular. Building code using open repositories such as GitHub allows it to be continually improved both during the development phase and after the software has been more widely disseminated. However, the long-term availability of code is important for reproducibility and for enabling future scientific research, which may require further modification of existing code. Documentation of code for scholarly purposes usually takes the form of a publication in a peer-reviewed article. This allows the developer to provide context around their code for both fellow programmers and non-computational users. A published paper not only contributes to the developer’s formal academic output but also helps foster vibrant collaborative communities that nurture and spread new ideas and reinforce the quality of the code that is produced.

Releasing information in incremental steps is nothing new to software developers, who regularly release updates and patches that add new functionality to existing programmes. The launch of a new bioinformatics tool is often accompanied by a paper describing the software for new users. However, the paper describing the tool will be out-of-date as soon as a new software update is released but the changes are often not significant enough to warrant a whole new paper, and thus the most recent developments go undocumented for a sustained period of time. Trying to publish such dynamic information in traditional ‘static’ journals is much like fitting a square peg in a round hole.

The F1000Research (http://f1000research.com/) publishing model is much more in sync with the way software is developed. Each software tool published can be updated at any time as a new version (clearly linked to the original and previous versions of the article), allowing any new code, tweaks and features to be documented with relative ease. Furthermore, F1000Research ensures that all the code and related data are freely available from the paper. A usable copy of the code as it was at the time of publication remains available, with the code being forked into an archival F1000Research space within the same repository used by the authors. A copy of the code as at the time of publication is also assigned a persistent identifier to eliminate any ambiguity about the code that is described in the article. Additionally, F1000Research ensures the paper includes a link to the author’s own working repository, so that readers can easily navigate to the latest version of the source code. By taking these measures, users are able to establish the provenance of the code and reuse it easily, hence supporting the reproducibility of the software, which ultimately contributes to making the software more robust. F1000Research also uses open peer review, providing an additional layer of validation for published software articles. Experts from the scientific community are invited to constructively critique the software and lay the foundations for any improvements. Having these reviews, together with any user comments, open to everyone helps to mirror the collaborative approach encouraged by open source initiatives and embraces the open source community's ethos.

By aligning with the requirements of publishing software, F1000Research has started to encourage computational science software developers to create an F1000Research Article Collection to augment their open source software projects. In February 2014, we launched the BioJS Collection, which comprises individual software components, each of which is like a standard Lego piece for building web applications that visualise biological data.

With this poster, we will discuss the novel requirements associated specifically with the needs of articles associated with open source software development, and discuss new publishing opportunities that better reflect and support those needs for the benefit of both software developers and scientific researchers as a whole.

IP04 - Analysis of 8,000 cancer exomes from the Oncomine® Knowledge Base to identify NFE2L2 pathway as a novel therapeutic opportunity in multiple cancer types.
Scientific Area: Bioinformatics of Disease and Treatment

Presenting author: Nickolay Khazanov, Thermo Fisher Scientific, United States


Additional authors:
Sean Eddy, Thermo Fisher Scientific, United States
Marry Ellen, Thermo Fisher Scientific, United States
Jia Li, Thermo Fisher Scientific, United States
Mark Tomilo, Thermo Fisher Scientific, United States
Dinesh Cyanam, Thermo Fisher Scientific, United States
Armand Bankhead, Thermo Fisher Scientific, United States
Sarah Anstead, Thermo Fisher Scientific, United States
Nikki Bonnevich, Thermo Fisher Scientific, United States
Becky Steck, Thermo Fisher Scientific, United States
Peter Wyngaard, Thermo Fisher Scientific, United States
Seth Sadis, Thermo Fisher Scientific, United States
Emma Bowden, Thermo Fisher Scientific, United States
Bryan Johnson, Thermo Fisher Scientific, United States
Dan Rhodes, Thermo Fisher Scientific, United States

Presentation Overview:

 

To reduce late-stage drug attrition in oncology, it is critical to identify appropriate drug targets and pre-clinical models. NGS analysis of cancer exomes provides a comprehensive assessment of alterations; however, discerning rare driver events from abundant passenger aberrations remains a challenge. To maximize the value of NGS, it is imperative to delineate the driver alterations and annotate them for clinical relevance.

Here we present our framework for mining the multi-dimensional NGS data in the Oncomine® Knowledge Base for candidate driver lesions across dozens of cancer types and candidate drug targets. An integrative framework was designed to compute associations among driver mutations, fusions and copy alterations to define the driver aberration landscape of common cancers, and then to correlate the drivers with clinical metadata. Genes were ranked through associations with patient survival and potential clinical actionability.

We verified the majority of known driver genes across samples from major cancer types, and nominated novel, infrequently altered potential drivers. We found strong evidence implicating NFE2L2 as an oncogene. Recurrent NFE2L2 mutations were found in samples from multiple cancer types and were associated with poor outcome in head and neck squamous cell carcinoma. We also investigated KEAP1, a repressor of NFE2L2 activity. Mutations in KEAP1 tended to localize within the NFE2L2 binding domains and did not co-occur with recurrent NFE2L2 mutations. Genes up-regulated in NFE2L2 or KEAP1 mutant samples were significantly associated with genes up-regulated in chemotherapy-resistant cell lines. Using cell line exome data, we were also able to identify cell lines representative of samples from clinical populations containing the significant mutations.


 


Birds of a Feather (BoF)  - ISMB 2014

Sunday, July 13

BoF: Open Source Communities with Impact, Leader: Manuel Corpas (Room 302)

BoF: BioFabric BoF: (nodes == lines) -> !hairballs, Leader: Bill Longabaugh (Room 313)

 

Tuesday, July 15

BoF: Bioinformatics Curriculum Guidelines, Leader: Lonnie Welch (Room 302)
BoF: Career Development for Women in Science, Leader: Lucia Peixoto (Room 304)
BoF: Critical Assessment of Function Annotation followup, Leader: Iddo Friedberg (Room 306)

 

Topic: Open Source Communities with Impact

Leader: Manuel Corpas
Affiliation: The Genome Analysis Centre
Date: Sunday July 13, 2014 12:45 p.m. - 1:45 p.m.
Room: 302

Description:
Many bioinformatics initiatives rely heavily on distributed communities of scientists and developers. What makes these communities successful? How can we harness their energy to develop scientific impact? While a compelling vision for the project is critical, effective open source communities must be able to cope with the diverse needs and demands of their members. Understanding the dynamics of remote collaborative interactions between community members is key to their success. In this BoF we will dissect the social engineering factors influencing the impact of biologically-inspired open source communities. Specifically, we will focus on (i) the benefits/motivations for bioinformaticians participating in open collaborative projects and (ii) the features of higher- and lower-impact communities in our field.

 


Topic: BioFabric BoF: (nodes == lines) -> !hairballs

Leader: Bill Longabaugh
Affiliation: Institute for Systems Biology
Date: Sunday July 13, 2014 12:45 p.m. - 1:45 p.m.
Room: 313

Description:
BioFabric (www.BioFabric.org) is a new network visualization tool that represents nodes as lines instead of as points, which creates highly organized, unambiguous, and scalable node-link diagrams. This BoF will provide an opportunity for users, potential users, and just the nodes-as-lines-curious to explore and discuss how BioFabric can help you to visualize your network data.
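To make the nodes-as-lines idea concrete, here is a small sketch that is not BioFabric's code or API. It assumes a layout in which each node gets its own row (a horizontal line), each edge gets its own column (a vertical segment joining the rows of its two endpoints), and a node's line spans the columns of its incident edges. The function name and the simple input ordering are hypothetical; the actual tool chooses its own node and edge ordering.

```python
def fabric_layout(nodes, edges):
    """Compute a toy nodes-as-lines layout.

    nodes: list of node names; list order gives the row of each node.
    edges: list of (node_a, node_b) pairs; list order gives each edge's column.
    Returns (row, span, segments):
      row      maps node -> row index of its horizontal line
      span     maps node -> (first_col, last_col) covered by that line
      segments lists (col, top_row, bottom_row) for each vertical edge segment
    """
    row = {name: i for i, name in enumerate(nodes)}
    span = {}
    segments = []
    for col, (a, b) in enumerate(edges):
        top, bottom = sorted((row[a], row[b]))
        segments.append((col, top, bottom))
        for name in (a, b):
            first, last = span.get(name, (col, col))
            span[name] = (min(first, col), max(last, col))
    return row, span, segments

# A triangle: three nodes, three edges, no two edges ever overlap.
row, span, segments = fabric_layout(
    ["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]
)
print(segments)
print(span)
```

Because every edge occupies its own column, edges cannot cross or hide one another, which is one way to read the "no hairballs" claim in the BoF title.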

 

 


 

Topic: Bioinformatics Curriculum Guidelines  (An Open Forum of the Curriculum Task Force of the ISCB Education Committee)

Leader: Lonnie Welch
Affiliation: Ohio University
Date: Tuesday July 15, 2014 12:45 p.m. - 1:45 p.m.
Room: 302

Description:
The Curriculum Task Force of the ISCB Education Committee will hold an open forum to discuss its recent report “Bioinformatics Curriculum Guidelines: Toward a Definition of Core Competencies” (see PLOS Computational Biology, March 2014).  The discussion will focus on implementation and refinement of the guidelines.



 

Topic: Career Development for Women in Science

Leader: Lucia Peixoto
Affiliation: University of Pennsylvania
Date: Tuesday July 15, 2014 12:45 p.m. - 1:45 p.m.
Room: 304

Description:
For many decades, an increasing number of women have obtained science doctoral degrees; however, women continue to be significantly underrepresented in almost all leadership positions. While the degree of underrepresentation varies among disciplines, women's advancement to senior professorial ranks and leadership roles is an issue in all fields. Our computational biology research community is young and growing, and thus has an opportunity to set new standards in unbiased leadership, promotion, and recognition of accomplishments. To this end, we will discuss the current state of implicit bias in hiring and promotion practices in science and outline strategies to increase career success, job satisfaction and work-life balance regardless of gender.
 

- Understanding Implicit Bias. Terry Gaasterland, UCSD

- Increasing your chances of success:

  * What makes a job candidate stand out. Perspectives from young group leaders and the people who hire them. Yana Bromberg, Rutgers.
  * What is an "Individualized Career Development Plan" (IDP) and why it is important to have one. Michael Robinson, CHOP.
  * Success factors beyond science. Jill Mesirov (BROAD), Fran Lewitter (MIT), Pankaj Agarwal (GSK)

- Open discussion

 


 
Topic: Critical Assessment of Function Annotation followup
Leader: Iddo Friedberg
Affiliation: The Genome Analysis Centre
Date: Tuesday July 15, 2014 12:45 p.m. - 1:45 p.m.
Room: 306

Description:
Fewer than 2% of protein sequences are annotated manually, and fewer than 1% by experiments. Even with the advent of the $1000 genome, the analysis typically costs over $50,000. The Critical Assessment of Function Annotation (CAFA) is an ongoing effort to assess and improve computational function prediction methods. CAFA 2014 was highly successful, engaging 50 groups from 20 countries. We are looking to engage more people in the next CAFA as predictors, assessors and judges. This is a great opportunity to join a large international effort and learn about cutting-edge technologies that are used in gene and genome annotation. Funding opportunities will also be discussed with a program officer from the National Science Foundation.

 


 

 


ISMB 2014 - High School Teacher Workshop

Starts: Friday July 11, 2014 9am to 3pm Eastern

Location: The Hynes Convention Center in Boston

Registration has closed. Please contact Nadine if you have any questions.

 

 


 

The International Society for Computational Biology (ISCB) is hosting a hands-on workshop this July. This workshop will be taught by Dr. Fran Lewitter, Founding Director of Bioinformatics and Research Computing at Whitehead Institute, and Dr. David Form, biology teacher at Nashoba Regional High School. Laptops will be provided for the workshop.

 

The workshop will include bioinformatics activities that can be used in your classroom to help students learn biological principles. Topics include BLAST and other resources available at NCBI (the National Center for Biotechnology Information).

 

  • Laptops will be provided for the workshop.
  • There is no charge to teachers for this workshop.