Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

BIOINFO-CORE COSI

Schedule subject to change
All times listed are in UTC
Friday, July 30th
11:00-11:10
An alien in a hospital data center: solitary management of a bioinformatics platform
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Nicole Scherer

Presentation Overview:

The bioinformatics platform of the Brazilian National Cancer Institute (INCA) was created in 2013 with the acquisition of a small HPC cluster. It resides inside the institute's data center but is not managed by the IT team, although they provide us with infrastructure support. Features of our scientific computing cluster include the Linux operating system, independent user management, differential network traffic, open source software, in-house pipelines, bulky datasets, and mostly inexperienced users. The bioinformatician, who had no formal training in systems administration, was left with the challenge, responsibility and freedom of designing from scratch a management plan for the brand new equipment. I will talk about semi-automatic project and user registration, user agreements, quota enforcement, shared bioinformatics tools and databases, virtual environments, and CLI and web-based remote access.

11:10-11:20
Accelerating the Velocity of Team Data Science with Research Project Management
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Gregg TeHennepe

Presentation Overview:

Like all complex endeavors involving teams of people working towards shared goals, bioinformatics research requires planning and organization if significant new discoveries are to be made. The scientific method lays out the principles of approaching hypotheses and searching for answers to the questions they raise; however, it does not provide a more detailed roadmap of the organizational processes needed for success. Research in business environments such as biotech and pharmaceutical companies benefits from the business motivations of the company as well as the culture and practices of the business world. Research in more academic environments such as universities and non-profit laboratories is frequently siloed around the interests of individual scientists and smaller collaborations, and is often unaware of the working practices of the business world. These environments typically emphasize the intellectual freedom of the academic space and prefer more informal approaches to managing work. Such environments stand to benefit significantly from the relatively simple approaches of Research Project Management, which have become essential to the success of diverse teams that otherwise struggle with communication, prioritization, execution, and delivery of complex, data-intensive projects. This talk focuses on applying the methods and tools of agile project management to bioinformatics research, to provide the benefits of project management found in other fields while retaining those elements critical to the mission and process of academic research. The talk will cover our development and application of this method, sharing insights and lessons learned from multiple projects over a four-year period.

11:20-11:30
The Bioinformatics Research Support Network at Agriculture and Agri-Food Canada
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Fatima Mitterboeck, Agriculture and Agri-Food Canada, Canada

Presentation Overview:

In September 2018 a new bioinformatics research support network initiative was launched by the federal department of Agriculture and Agri-Food Canada. The purpose of this network is to provide bioinformatics and big data support to agricultural research programs in research and development centres across Canada in a diverse range of areas. The network launched with the appointment of 5 bioinformaticians and has grown to over 20 members in 13 support units across Canada. The pillars of the network are collaboration, education and training, and infrastructure. This support network operates at local, national, and international levels. Regional units within our organization collaborate and coordinate with each other at a national level and with other Government of Canada and non-government institutions. At the international level, we aim to connect with fellow support unit managers to share insights gained. This talk will discuss the progress of this network as well as future directions and challenges of the program.

11:30-11:40
Building up a bioinformatics community at the Dutch institute for ecology
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Fleur Gawehns, NIOO-KNAW/KWS Vegetables, the Netherlands

Presentation Overview:

NIOO-KNAW is the Dutch Institute of Ecology, which combines microbial, terrestrial, aquatic and animal ecology research. These fields are very data intensive, not least because of their high-throughput sequencing projects. To support researchers in performing their bioinformatics analyses, NIOO-KNAW hosts one centralized bioinformatics unit (BU), which works closely with the in-house ICT and library departments. The BU organizes the bioinformatics infrastructure, data management, courses and one-on-one supervision in such a way that researchers can perform their analyses independently. Pipelines are developed in collaborations inside and outside of the institute. An example is epiGBS2, a Snakemake-based workflow for the determination of cytosine methylation without a preexisting reference genome. Here, I will present the lessons we learned during the epiGBS2 development and the strengths and pitfalls of the BU set-up.

11:40-11:50
Embracing Advances in Machine Learning & Imaging for Biomedical Research
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Krishna Karuturi, The Jackson Laboratory, United States

Presentation Overview:

Machine Learning (ML) and Imaging, in combination with massive heterogeneous omics data, have been increasingly adopted in biomedical research to predict biological features and to identify biological patterns. Applications include identifying genomic features such as enhancers and TADs, spatial transcriptomic analysis, cell-to-cell communication analysis, mouse phenotyping, tumor sample phenotyping, and translational research that combines massive omics and imaging data. Besides the availability of massive heterogeneous datasets, this increasing adoption of ML & Imaging is driven by tremendous advances in tools, computing technologies & platforms, and engineering. At Computational Sciences at The Jackson Laboratory, we are embracing these advances to address important, data-intensive, complex biomedical problems of interest to our faculty collaborators by harnessing science, engineering, and technology in close partnership with our IT, supported by our community-integration and team data science approaches. In this talk, I will present the comprehensive approach we have taken, the recent advances we have made, and the planning that is shaping up to further our capabilities with efficiency and impact.

11:50-12:00
RiboSeeker: An End-to-End Package for Ribosome Profiling Data Analysis
Format: Pre-recorded with live Q&A

Moderator(s): Madelaine Gogol

  • Ning Zhang

Presentation Overview:

Ribosome profiling is a technology for determining translation activity by sequencing ribosome protected mRNA fragments. We developed RiboSeeker, an end-to-end package for ribosome profiling data analysis. RiboSeeker consists of two main components: data processing and downstream analysis at single nucleotide resolution. The first component was written in Snakemake, a workflow management language, for scalable and reproducible data processing. Taking raw FASTQ reads, the workflow performs adapter sequence trimming, RNA contamination removal, and alignment to a reference genome. Next, starting from an alignment file, a variety of downstream analysis and visualization functionalities were implemented in R, such as read length distribution, genomic feature distribution of aligned reads, and metagene plots. Based on the metagene plots, users can define the desired read length and P-site offset, and further measure the translation of open reading frames (ORFs) by calculating translation efficiency and discover novel ORFs by computing ORFscore. Altogether, RiboSeeker covers the essential data processing and analysis steps for ribosome profiling experiments.
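The two ORF-level measures mentioned above can be illustrated with a minimal Python sketch. This is not RiboSeeker's code (its downstream analysis is written in R). It assumes P-site positions have already been assigned from the chosen read length and offset, and it uses the ORFscore formulation usually attributed to Bazzini et al. (2014) together with a simple library-size-normalised ratio for translation efficiency. All function names, the pseudocount, and the example numbers are hypothetical.

```python
import math


def frame_counts(psite_positions, orf_start):
    """Count P-sites falling in each of the three reading frames of an ORF.

    Positions are assumed to be on the same coordinate system as orf_start
    (hypothetical input; strand handling is omitted in this sketch).
    """
    counts = [0, 0, 0]
    for pos in psite_positions:
        counts[(pos - orf_start) % 3] += 1
    return counts


def orf_score(counts):
    """ORFscore: log2 of a chi-squared-like statistic against a uniform
    distribution over the three frames, made negative when frame 1 is not
    the dominant frame (assumed formulation, see lead-in)."""
    f1, f2, f3 = counts
    total = f1 + f2 + f3
    if total == 0:
        return 0.0
    expected = total / 3
    chi = sum((f - expected) ** 2 / expected for f in counts)
    score = math.log2(chi + 1)
    return -score if (f1 < f2 or f1 < f3) else score


def translation_efficiency(ribo_count, rna_count, ribo_total, rna_total,
                           pseudocount=1.0):
    """Ratio of normalised ribosome footprint counts to normalised mRNA
    counts for one ORF. The pseudocount avoids division by zero."""
    ribo_cpm = (ribo_count + pseudocount) / ribo_total * 1e6
    rna_cpm = (rna_count + pseudocount) / rna_total * 1e6
    return ribo_cpm / rna_cpm


if __name__ == "__main__":
    counts = frame_counts([100, 103, 106, 101], orf_start=100)
    print(counts, orf_score(counts))
    print(translation_efficiency(200, 100, 2_000_000, 1_000_000))
```

An ORF whose footprints fall mostly in frame 1 gets a positive score, while enrichment in frame 2 or 3 gives a negative one. This is why the P-site offset chosen from the metagene plots matters for the result.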

12:40-13:20
Reproducible, adaptable, transparent, and composable data analyses with Snakemake
Format: Pre-recorded with live Q&A

Moderator(s): Rodrigo Polo

  • Johannes Köster, University of Duisburg-Essen, Germany

Presentation Overview:

With an average of more than six new citations per week in 2020, Snakemake is one of the most widely used frameworks for reproducible data analysis. Snakemake has recently gained several new features that further increase the reproducibility, adaptability, transparency, and composability of data analyses: a new approach for modularization and deployment, an overhauled scheduling algorithm, a caching mechanism that spans projects and users, Jupyter notebook integration, and easily configurable graph partitioning. These features will be presented in this talk.

13:20-13:30
Breakout room setup
Format: Live-stream

Moderator(s): Rodrigo Polo

13:30-14:00
Breakout room discussion 1
Format: Live-stream

Moderator(s): Rodrigo Polo

14:20-14:30
Breakout room report 1
Format: Live-stream

Moderator(s): Rodrigo Polo

14:30-15:00
Breakout room discussion 2
Format: Live-stream

Moderator(s): Rodrigo Polo

15:00-15:20
Breakout room report 2
Format: Live-stream

Moderator(s): Rodrigo Polo



International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176
