ISMB/ECCB 2017 – Tutorial Program – Friday, July 21

ISMB/ECCB 2017 features half-day tutorial sessions on Friday, July 21, 2017, one day prior to the start of the conference scientific program.

Tutorial attendees should register using the on-line registration system. Tutorial participants must be registered for the ISMB/ECCB conference to attend a tutorial. Attendees will receive a Tutorial Entry Pass (ticket) at the time they register on site. Lunch is included in the registration fee for attendees registering for two tutorials. Those attending only one tutorial have the option to purchase a lunch ticket during on-line registration.

Tutorial AM1: Single cell transcriptomics

Room: North Hall
Date: Friday, July 21, 10:00 am - 1:30 pm

 

Presenters:
Anagha Joshi, Division of Developmental Biology, Roslin Institute, University of Edinburgh, United Kingdom
Jeanette Baran-Gale, MRC Institute of Genetics & Molecular Medicine, University of Edinburgh, United Kingdom

After nearly a decade in existence, short-read bulk RNA-sequencing has decidedly gone mainstream, but new technologies keep evolving to reveal ever more intricate aspects of the transcriptional landscape of a cell. Single cell sequencing makes it possible to trace cellular differentiation in minute detail, to study cell-to-cell heterogeneity, or to identify rare cell types. Because the technique is recent and still evolving, its data processing and analysis protocols are far from standardized. This half-day tutorial session will present recent advances in the development and application of new computational tools, resources and methods to analyze single cell RNA sequencing data, highlighting the strengths and weaknesses of these techniques. In particular, we will provide a hands-on activity to analyze single cell data generated by the Smart-seq2 and 10x platforms.

Schedule Overview
Timing Presenter Topic Area/Activity Description
10:00 am – 10:30 am   Introduction to single cell technologies
10:30 am – 11:30 am   Hands-on session – Smart-seq2 data analysis
11:30 am – 11:45 am Break
11:45 am – 1:00 pm   Hands-on session – 10x data analysis
1:00 pm – 1:30 pm   Discussion and conclusion

Participant Overview:
Beginner or Intermediate

The target audience is researchers who have recently started working with single cell data or plan to do so in the near future (Beginner or Intermediate), as well as anyone who is working with large scale genome-wide data and wants to know more about the opportunities and challenges presented by these new data (Broad Interest).

Class Size: 30

Presenter Bios:
Anagha Joshi, Roslin Institute, University of Edinburgh, United Kingdom
Jeanette Baran-Gale, MRC Institute of Genetics & Molecular Medicine, University of Edinburgh, United Kingdom Jeanette Baran-Gale is a postdoctoral research fellow in the lab of Chris Ponting at the MRC Institute of Genetics & Molecular Medicine, University of Edinburgh. Her current research focuses on investigating the mechanisms underlying promiscuous gene expression in thymic epithelial cells using single cell RNAseq. Her past research includes high-throughput analysis of both coding and non-coding RNAs in several disease models including the estrogen response in breast cancer.



Tutorial AM2: Ontologies in Computational Biology 

Room: Terrace 1
Date: Friday, July 21, 10:00 am - 1:30 pm

 

Presenters:
Dr. Michel Dumontier, Maastricht University, Netherlands
Dr. Robert Hoehndorf, King Abdullah University of Science and Technology, Saudi Arabia

Tutorial Overview:

Ontologies have long provided a core foundation in the organization of biomedical entities, their attributes, and their relationships. With over 500 biomedical ontologies currently available, a number of exciting new opportunities are emerging in using ontologies for large scale data sharing and data analysis. This tutorial will help you understand what ontologies are and how they are being used in computational biology and bioinformatics.

Schedule Overview
Timing Presenter Topic Area/Activity Description
10:00-10:45   Introduction to ontologies
10:45-11:30   Ontologies and biological data: annotation and text mining
11:30 am – 11:45 am Break
11:45 am - 12:30 pm   Ontology-based data analysis: gene set enrichment and semantic similarity
12:30 pm - 1:00 pm   Understanding ontologies and axioms through automated reasoning
1:00 pm - 1:30 pm   Ontologies and big data
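
As a rough illustration of the gene set enrichment topic listed in the schedule, the sketch below computes a one-sided hypergeometric p-value for the overlap between a study gene list and the genes annotated to one ontology term. It is a minimal example under stated assumptions, not the method taught in the tutorial: the function names and the toy gene identifiers (g0 to g9) are hypothetical, and a real analysis would also correct for multiple testing across terms.

```python
from math import comb


def hypergeom_tail(population, annotated, study, overlap):
    """Return P(X >= overlap) for a hypergeometric draw of `study` genes
    from `population` genes, of which `annotated` carry the term."""
    upper = min(annotated, study)
    total = comb(population, study)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, study - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def term_enrichment(study_genes, term_genes, background_genes):
    """Enrichment p-value of one term in a study set (hypothetical helper).
    Genes outside the background are ignored."""
    study = study_genes & background_genes
    term = term_genes & background_genes
    overlap = len(study & term)
    return hypergeom_tail(len(background_genes), len(term), len(study), overlap)


# Toy data: 10 background genes, 3 annotated to the term, 2 in the study set.
background = {f"g{i}" for i in range(10)}
term = {"g0", "g1", "g2"}
study = {"g0", "g1"}
p_value = term_enrichment(study, term, background)
print(p_value)
```

The tail sum makes the test one-sided (over-representation only), which is the usual choice when asking whether a term is enriched in a gene list.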

Participant Overview
The tutorial will be of interest to any researcher who will use or produce large structured datasets in computational biology. The tutorial will be at an introductory level, but will also describe current research directions and challenges that will be of broad interest to researchers in computational biology.



Tutorial AM3: 3D Genome Data Processing, Analysis, and Visualization Tutorial 

Room: Meeting Hall V
Date: Friday, July 21, 10:00 am - 1:30 pm

 

Presenters:
Nezar Abdennur, PhD student, MIT, United States
Nils Gehlenborg, Harvard Medical School, United States
Peter Kerpedjiev, Harvard Medical School, United States
Soo Lee, Harvard Medical School, United States
Jian Ma, Carnegie Mellon University, United States

Tutorial Overview:

Motivation: Due in large part to the explanatory power of chromosome organization in gene regulation, its association with disease and disorder, and the unanswered questions regarding the mechanisms behind its maintenance and function, the 3D structure and function of the genome are increasingly becoming a target of scientific scrutiny. With efforts such as the 4D Nucleome Project and ENCODE 4 already beginning to generate large amounts of data, the ability to analyze and visualize these data will be a valuable asset to any computational biologist tasked with interpretation of experimental results.

Goals and Objectives: After the workshop, participants should be able to obtain, process, analyze, and visualize 3D genome data on their own, as well as understand some of the logic, motivation and pitfalls associated with common operations such as Hi-C matrix balancing and multi-resolution visualization. Specifically, our objectives include:
• To introduce the theoretical concepts related to 3D genome data analysis
• To familiarize participants with the data types, analysis pipeline, and common tools for analysis and visualization of 3D genome data
• To provide hands-on experience in data analysis by walking through some common use cases of existing tools for data analysis and visualization.
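
To give a flavor of the matrix balancing operation mentioned in the goals above, here is a minimal pure-Python sketch of iterative correction on a small symmetric contact matrix. It is an illustration under stated assumptions rather than the tutorial's own code: the function name `balance`, its parameters, and the toy matrix are hypothetical, and production tools apply additional filtering (for example, masking low-coverage bins) that is omitted here.

```python
def balance(matrix, iterations=200, tol=1e-10):
    """Iteratively rescale a symmetric contact matrix so that all row
    sums become equal. Returns (balanced_matrix, bias_vector), where
    matrix[i][j] == balanced[i][j] * bias[i] * bias[j]."""
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    bias = [1.0] * n
    for _ in range(iterations):
        sums = [sum(row) for row in m]
        nonzero = [s for s in sums if s > 0]
        mean = sum(nonzero) / len(nonzero)
        # Per-bin correction factor; empty bins are left untouched.
        factors = [s / mean if s > 0 else 1.0 for s in sums]
        if max(abs(f - 1.0) for f in factors) < tol:
            break
        for i in range(n):
            for j in range(n):
                m[i][j] /= factors[i] * factors[j]
        bias = [b * f for b, f in zip(bias, factors)]
    return m, bias


# Toy symmetric contact matrix with uneven coverage per bin.
raw = [
    [10.0, 5.0, 1.0],
    [5.0, 8.0, 4.0],
    [1.0, 4.0, 12.0],
]
balanced, bias = balance(raw)
print([round(sum(row), 6) for row in balanced])
```

Dividing each entry by the product of its row and column factors keeps the matrix symmetric, so a single bias value per bin is enough to recover the raw counts from the balanced ones.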

Schedule Overview
Timing Presenter Topic Area/Activity Description
10:00 - 10:15   Introduction and Overview
10:15 - 11:30   Hi-C Analysis Intro to Hi-C
3D genomic features revealed by Hi-C analysis
  • 30 minutes:
    • Cis vs trans and scaling (contact probability vs genomic distance)
    • Compartments, TADs, loops
    • Practical: perform basic analysis using Jupyter notebook and Python as time permits
11:30 am – 11:45 am Coffee Break
11:45 - 12:45   Visualization Existing tools for contact matrix exploration
  • 20 minutes
    • 3D genome browser
    • WashU epigenome browser
    • Juicebox
    • HiGlass
Using HiGlass (http://higlass.io) to display contact maps
  • 20 minutes:
    • Overview of common operations such as adding tracks, removing tracks, adding views, removing views, and linking views by zoom and location
    • Practical: Explore interesting loci and create interactive versions of static figures from notable papers
  • 20 minutes: Installing HiGlass
    • Overview of the HiGlass architecture and description of the infrastructure used to run it
    • Practical: Create a local HiGlass instance; Convert a contact map to multi-resolution format and import it; Convert a bigWig file to hitile format and import it; Open both files in the client and navigate to an interesting location
12:45 - 1:30   Data Analysis for Nuclear Compartmentalization
  • Introduction
  • DamID analysis
  • Repli-seq analysis
  • Data from emerging technologies

Participant Overview:
The subject matter and practical exercises presented in this tutorial will be accessible to a broad audience. Prior experience with next-generation sequencing and the data it produces will be helpful for understanding the subsequent processing steps used to derive contact maps as well as some of the artifacts that can arise during data processing. The material will be most useful to computational biologists and biologists working on genomics-related topics.



Tutorial PM4: Network Analysis in Cytoscape: Advanced Topics 

Room: Meeting Hall V
Date: Friday, July 21, 2:30 pm - 6:00 pm

Presenters:
Alexander Pico, Gladstone Institutes, San Francisco, United States
John “Scooter” Morris, University of California, San Francisco, United States
Barry Demchak, University of California, San Diego, United States
Adam Treister, Gladstone Institutes, San Francisco, United States

Intended audience

The Advanced Topics tutorial is intended for an audience that has prior experience with at least one of the following:
• Cytoscape software
• Data integration and analytical methods
• Network biology concepts
• Bioinformatics analysis pipelines

Tutorial Overview:

By the end of the tutorial, you should be able to:
• Know when and how to use Cytoscape in your research area
• Identify and discriminate relevant sources of interactions, networks and datasets
• Command programmatic control over Cytoscape
• Integrate Cytoscape into your bioinformatics pipelines
• Publish, share and export networks
• Generalize network analysis methods to multiple problem domains

Schedule Overview
Timing Presenter Topic Area/Activity Description
2:30 pm - 2:50 pm   Introductory (20 min)
  • Quick introductions: presenters & audience
  • General network biology overview
  • Cytoscape intro (consortium/history)
    • Roadmap: theme of integration/protocol
    • 3.5/3.6 features
  • Introduce tutorial protocol
    • Exploring TCGA expression and mutation data on disease networks
2:50 pm - 3:30 pm – Intermediate (40 min)  
  • Getting relevant networks
    • Types of networks, sources, and relevant apps
    • How to choose a network source: STRING, GeneMANIA, NDEx, WikiPathways, Pathway Commons, etc.
  • Network visualization overview
    • Style mappers & layouts
    • Apps: enhancedGraphics, etc
  • Network analysis overview
    • clusterMaker, BiNGO/ClueGO, etc
3:30 pm - 4:00 pm – Advanced (30 min)  
  • Driving Cytoscape from R
    • Overview of Cytoscape Automation
    • Setup RStudio and install packages
    • Launch Cytoscape and connect from R
  • Getting Disease Networks
    • Query STRING database from R via CyREST
4:00 pm - 4:15 pm - Coffee break
4:15 pm - 6:00 pm – Advanced (105 min)  
  • Interacting with Cytoscape
    • CyREST and Commands
    • RCy3 package
  • Visualizing data on networks
    • Loading multiple data types into Cytoscape
    • Setting visual styles
  • Subnetwork selection
    • Data-driven and diffusion-based subnetworks
  • Saving, sharing and publishing
    • Session files, images and web export
  • Additional topics and resources
    • Python examples
    • Advanced Cytoscape command scripting (loops and args)
    • CyBrowser (new!)

Participant Overview:
The Advanced Topics tutorial is intended for an audience that has prior experience with at least one of the following:
• Cytoscape software
• Data integration and analytical methods
• Network biology concepts
• Bioinformatics analysis pipelines
• Please bring your laptop to this session

Presenter Bios:
Alexander Pico, Gladstone Institutes, San Francisco, United States Alexander Pico is the Executive Director of the National Resource for Network Biology, the Vice President of the Cytoscape Consortium, and Associate Director of Bioinformatics at Gladstone Institutes. He has been a contributing member to Cytoscape since 2006 and has led numerous Cytoscape and Network Biology workshops and mentoring programs over the past 10 years.
John “Scooter” Morris, University of California, San Francisco, United States John “Scooter” Morris is the Executive Director of the Resource for Biocomputing, Visualization, and Informatics at UCSF, the “Roving Engineer” for Cytoscape, and an Adjunct Assistant Professor of Pharmaceutical Chemistry at UCSF. He has given numerous presentations on using and extending Cytoscape and is a Cytoscape core developer as well as the developer of over a dozen Cytoscape apps, including chemViz, structureViz, clusterMaker, and cddApp.
Barry Demchak, University of California, San Diego, United States Barry Demchak is the Chief Architect of Cytoscape, Secretary/Treasurer of the Cytoscape Consortium and Project Manager in the Ideker lab at UCSD. He has been a contributing member to Cytoscape development since 2012 and has led numerous Cytoscape and Network Biology workshops and mentored projects over the past 5 years.
Adam Treister, Bioinformatics Core, Gladstone Institutes, San Francisco, United States Adam Treister is a Senior Software Engineer with core, app and automation development experience. At the Bioinformatics Core at Gladstone Institutes, he performs software design and implementation for Network Biology applications, primarily around Cytoscape in its myriad forms.


Tutorial PM5: Prediction of Regulatory Networks from Expression and Chromatin Data 

 

Room: Terrace 1
Date: Friday, July 21, 2:30 pm – 6:00 pm

Presenters
Ivan G. Costa, RWTH Aachen University, Germany
Marcel Schulz, Saarland University & Max Planck Institute for Informatics, Germany
Matthias Heinig, Helmholtz Center Munich, Germany

One of the main molecular mechanisms controlling the temporal and spatial expression of genes is transcriptional regulation. In this process, transcription factors (TFs) bind to the promoter and enhancers in the vicinity of a gene to recruit (or block) the transcriptional machinery and start gene expression. Inference of gene regulatory networks, i.e. the factors controlling the expression of a particular gene, is a key challenge when studying development and disease progression. The availability of different experimental assays (histone ChIP-seq, DNase1-seq, ATAC-seq, NOMe-seq, etc.) that map in vivo chromatin dynamics, together with gene expression assays (RNA-seq), has triggered the development of novel computational modelling approaches for accurate prediction of TF binding and activity by integrating these diverse epigenomic datasets. In practice, however, researchers face the problems that come with handling diverse assays, understanding the tools involved, and building specific workflows tailored to the data they have.

This tutorial is targeted at an audience of bioinformaticians with previous experience in gene expression and next generation sequencing analysis. This intermediate-level tutorial will give you knowledge of state-of-the-art tools for inference of gene regulatory networks from chromatin and expression data. First, we will review tools to conduct the following analyses: 1) predict regulatory regions from different epigenetic datasets, e.g., using differential peak callers (histoneHMM - Heinig et al., 2015) or footprint methods (HINT - Gusmao et al., 2014), 2) determine cell-specific TF binding in these regions (e.g. TEPIC - Schmidt et al. 2016), and 3) build regulatory networks to study a cell type of interest (e.g. Schmidt et al. 2016, Durek et al. 2016). After an introductory presentation, we will guide participants through a hands-on practical; we therefore require that all participants bring their own laptop. Software that needs to be installed before the tutorial, as well as the data used in the tutorial, will be made available on the course website, where more details will also be announced.

Course Website https://github.com/SchulzLab/EpigenomicsTutorial-ISMB2017

Schedule Overview
Timing Presenter Topic Area/Activity Description
2:30 pm - 2:45 pm Ivan G. Costa Introduction / gene regulation / transcription / chromatin
2:45 pm - 3:00 pm Matthias Heinig Introduction ChIP-seq peak calling
3:00 pm - 3:50 pm Matthias Heinig Practical peak calling
4:00 pm - 4:15 pm Break
4:15 pm - 4:30 pm Ivan G. Costa Introduction Footprints
4:30 pm - 4:45 pm Marcel Schulz Introduction Regulatory networks
4:45 pm - 6:00 pm Ivan G. Costa and Marcel Schulz Practical Regulatory Networks

Participant Overview:
Intermediate level

Presenter Bios:
Ivan G. Costa, RWTH Aachen University, Germany Ivan G. Costa is a group leader at the RWTH Aachen Medical Faculty. His research focuses on computational methods for the identification of epigenetic and regulatory mechanisms driving cell differentiation and diseases. Among others, his team has developed methods for detection of cell specific binding sites from DNase-seq (HINT http://www.regulatory-genomics.org/hint/introduction/), the ChIP-Seq differential peak caller THOR and a computational framework for analysis of regulatory genomics data RGT.
Matthias Heinig, Helmholtz Center Munich, Germany Matthias Heinig is a group leader at the Institute of Computational Biology at the Helmholtz Zentrum Munich. His aim is the development and application of computational and statistical tools for the identification of molecular regulatory networks underlying common diseases and the genetic and epigenetic mechanisms controlling these networks from population level DNA and multi-omics data sets. A special focus is the molecular characterization of metabolic and cardiovascular diseases, in particular diabetes and arrhythmias like atrial or ventricular fibrillation.
Marcel Schulz, Saarland University & Max Planck Institute for Informatics, Germany
Marcel Schulz is an independent group leader at Saarland University and the Max Planck Institute for Informatics at the Saarland Informatics Campus in Saarbruecken, Germany. He leads the group for High-throughput Genomics and Systems Biology and is interested in statistical genomics and gene regulation. He is a member of the integrative analysis working group of the International Human Epigenome Consortium (IHEC) and a co-editor of the ISCB community journal "Regulatory and Systems Genomics" at F1000 Research.


Tutorial PM6: Making Galaxy Work for You

Room: North Hall
Date: Friday, July 21, 2:30 pm – 6:00 pm

Presenters:
Martin Cech - Department of Biochemistry and Molecular Biology, Penn State University, University Park, United States
John Chilton - Department of Biochemistry and Molecular Biology, Penn State University, University Park, United States
Dr Björn Grüning - Bioinformatics Group, Albert-Ludwigs-Universität Freiburg, Germany

Tutorial Overview:

Galaxy is an open source web-based platform for data analysis. The goal of this tutorial is to provide a practical, hands-on guide to adapting the Galaxy platform to the specific needs of individuals attending the ISMB.

Galaxy is widely deployed and packaging your bioinformatic tools and pipelines for Galaxy is an effective way to expand the audience for your work. Galaxy deployments range in size from single user instances to large public servers serving tens of thousands of researchers. Instances provide a uniform and easy-to-use interface to sophisticated computational resources without requiring users to learn command line interfaces or Linux systems administration skills.

Participants will learn to
• Create Galaxy compatible tool and workflow definitions that are publicly accessible and that make it easy for any instance administrator to add your work to their server.
• Deploy Galaxy and scale it up to target production-ready resources such as a Postgres database, NGINX webserver, and distributed resource managers such as SLURM or PBS.

Schedule Overview
Timing Presenter Topic Area/Activity Description
2:30 pm - 2:55 pm   Introduction to Galaxy - From Data Exploration to Workflow Building
2:50 pm - 4:00 pm   Integrating Tools into Galaxy
  • Anatomy of Galaxy Tools & Introduction to [Planemo]
  • Galaxy [Tool Shed]
  • Wrapping a tool [hands-on]
  • Tool Development Q & A
4:00 pm - 4:15 pm Break
4:15 pm - 5:30 pm   Deploying and Scaling Galaxy
  • Deployment and Platform Options
  • Get Basic Server Up and Running [hands-on]
  • Automated Deployment with Ansible [demo]
  • Leveraging Compute Clusters [demo]
5:30 pm - 6:00 pm   Q & A, troubleshooting

Tutorial home: https://github.com/galaxyproject/galaxytutorial-ismb17

Participant Overview:
This tutorial is aimed at a broad audience. Some knowledge of the command-line and Unix will be assumed. Parts of the tutorial are hands-on so participants are encouraged to bring their laptops. No previous knowledge of Galaxy or particular topics in bioinformatics is required. Participants should have a desire to learn more about Galaxy, or a desire to learn how to wrap their tools and pipelines in Galaxy to facilitate others using these, or just want specific advice on setting up an instance of Galaxy and scaling it up.

Presenter Bios:
Martin Cech - Department of Biochemistry and Molecular Biology, Penn State University, United States Martin Čech is an alumnus of Masaryk University, a fan of open source projects and communities, a software enthusiast, and a Galaxy developer in the Anton Nekrutenko Lab at Penn State University since 2013. He enjoys training people in various Galaxy topics, untangling project problems, enhancing UX, and being part of the Galaxy community overall.
John Chilton - Department of Biochemistry and Molecular Biology, Penn State University, United States John Chilton has a Master's degree in computer science and has been a professional software developer working in the field of bioinformatics infrastructure for 10 years. John is a member of the Galaxy team, the creator of several open source projects including Planemo and Pulsar, and one of the co-founders of the Common Workflow Language.
Dr Björn Grüning - Bioinformatics Group, Albert-Ludwigs-Universität Freiburg, Germany Dr Björn Grüning is with the Bioinformatics Group at Albert-Ludwigs-Universität Freiburg, in Freiburg, Germany, where he heads the Freiburg Galaxy Project. His publication list includes several papers that feature Galaxy prominently, including the recent “Enhancing pre-defined workflows with ad hoc analytics using Galaxy, Docker and Jupyter” (Grüning et al., 2016). He is a prominent contributor to, and a driving force in, the Galaxy community. In the past year alone, he helped organize the Bioconda Contribution Fest, Swiss-German Galaxy Days, the Galaxy Training Materials Contribution Fest, the Galaxy DevOps Workshop, and the Conda Dependencies Codefest, and presented and taught at GCC2016. His research interests include data visualisation, computational chemistry, and drug discovery.