Organized by the ISCB Education Committee:
Dr. Annette McGrath is a principal research scientist at CSIRO Data61, Australia. She has been actively involved in developing and delivering national bioinformatics training programs in Australia. She is an executive of GOBLET and ABACBS and a member of the ISCB Education Committee.
Dr. Michelle D. Brazas is the Senior Program Manager for Adaptive Oncology at the Ontario Institute for Cancer Research (OICR). She was previously the lead for the Canadian Bioinformatics Workshops (bioinformatics.ca) and Manager of Bioinformatics Education at OICR. She is also an executive of GOBLET and a member of the ISCB Board of Directors and the ISCB Education Committee.
In recent years, there has been a growing focus on making published research data easier to discover and reuse for subsequent analyses by other researchers. This is not limited to the life sciences. Open data sharing is a core principle of many public research funding bodies worldwide. Discoverable, accessible research data are essential if others are to perform downstream analyses and integrate data. Research data can then generate value for the research community far beyond the original authors' lab and focus. International efforts culminated in the publication of the FAIR Data Principles in 2016. FAIR stands for “Findable, Accessible, Interoperable and Reusable”. These principles act as guidelines for best practice in data stewardship for those who wish to enhance the discoverability and reusability of their research data. They have been recognised worldwide by organisations such as FORCE11, NIH, ELIXIR and the European Commission as a useful framework for maximising data sharing and reuse.
With the steep drop in the cost of generating data, life scientists are producing ever-increasing amounts of data via next-generation sequencing and other activities. Training life scientists in the analysis of these datasets, particularly sequencing data, is a core activity for a large proportion of the bioinformatics training and education community. As a community, we spend a great deal of time teaching people how to analyze data using specific tools and best-practice workflows for these types of data. However, it is commonplace for researchers to take the resulting gene sets or conclusions forward to further experiments while giving little further thought to the raw data from which they gained these insights.
Against the backdrop of reusable data and good data stewardship practices, are our training programs keeping pace with the changing landscape? Are bioinformatics trainers aware of these international initiatives in data use and reuse? As trainers, are we equipping our students to make the most of their data by understanding both its value to themselves and its potential value to others? What steps are we taking to help trainees better manage and value their data?
Through a series of presentations that showcase current practices and identify future needs in enabling the reuse of research data within the life science community, this workshop aims to highlight how we as a community can most effectively bring best practices in data management to trainees and students in educational environments.
This workshop will consist of three presentations on topics ranging from the basics of the FAIR principles and how we can apply them to bioinformatics training programs, through examples of their application, to how we can further the FAIR principles by teaching workshops on them.
Madelaine Gogol, Stowers Institute, United States
Hemant Kelkar, UNC-Chapel Hill, United States
Alastair Kerr, University of Edinburgh, Scotland
Brent Richter, Partners HealthCare of Massachusetts General and Brigham and Women’s Hospitals, United States
Alberto Riva, University of Florida, United States
The bioinformatics core workshop is a workshop by practitioners and managers of Core Facilities for all members of core facilities, including scientists, engineers, analysts, operations and management staff. In this 15th year of bringing the Core community together at ISMB, we will explore in depth three topics relevant to bioinformatics core facilities. Lightning talks will broadly introduce each area, followed by small-group breakout discussions whose insights are brought back to the full audience for further discussion and knowledge sharing.
We have partitioned this two-hour workshop into four sections: three sections of lightning talks (Parts A, B and C) that introduce the topic areas, and a longer, in-depth fourth section (Part D) that explores those topic areas further in breakout sessions and full-audience discussions.
Part A: Strategies for Hiring, Recruiting, and Interviewing new bioinformaticians
Methods to find, interview and hire highly successful staff and bioinformaticians for a core facility. Speakers will share their experiences and challenges, including finding and hiring people, interview techniques and questions, and best practices for recruiting candidates.
Part B: Containerization, Clouds, and Workflows
Topics to be covered include cloud infrastructure recommendations and limitations, key datasets of value hosted in the cloud, containerization technology that works, and workflow tool development and results.
Part C: When good experiments go bad: Negotiating experiment quality failures
A non-exhaustive survey of methods and successes in detecting failures and exploring guidelines for terminating bad projects.
Part D: Small group discussion
During this longer session, audience members will divide into groups based on their own interests. Groups will come up with their main takeaway points and bring them back to the main audience for knowledge sharing and further discussion. Topics may include all previous presentation areas as well as other areas of interest to those running or working within a bioinformatics core facility, such as single-cell analysis or long-read analysis.