Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide


Education COSI

Track Chairs

Annette McGrath
Venkata Satagopam
Patricia Palagi


Schedule subject to change
Tuesday, July 14th
10:40 AM-11:20 AM
COSI Education Keynote Talk: Empowering, usable, and comprehensive bioinformatics training
Format: Live-stream

  • Bérénice Batut, University of Freiburg, Germany

Presentation Overview:

With the explosion of biological data, the primary challenge is neither how to store the data nor which computational resources to use to process them, but researchers' general lack of understanding of how to manipulate and analyse these data. This problem could be solved with comprehensive and up-to-date bioinformatics training. How can we build such an infrastructure? Over the last few years, I have had the opportunity to lead and contribute to several training programs and communities in bioinformatics. In this talk, I will share some lessons learned and ideas for building a usable, comprehensive, and empowering bioinformatics training infrastructure. I will talk about community-driven development of free, FAIR and reusable material, online and hybrid training, and mentoring, and illustrate these concepts with concrete examples from my work.

11:20 AM-11:40 AM
Online learning from EMBL-EBI with the new and improved Train online
Format: Live-stream

  • Cath Brooksbank, EMBL-EBI, United Kingdom
  • Anna Swan, EMBL-EBI, United Kingdom
  • Melissa Burke, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), United Kingdom
  • Ajay Mishra, EMBL-EBI, United Kingdom
  • Joseph Rossetto, EMBL-EBI, United Kingdom
  • Nikiforos Karamanis, EMBL-EBI, United Kingdom
  • Prakash Singh Gaur, EMBL-EBI, United Kingdom
  • Adam Broadbent, EMBL-EBI, United Kingdom
  • Peter Walter, EMBL-EBI, United Kingdom
  • Carla Oliveira, EMBL-EBI, United Kingdom
  • Sarah L. Morgan, EMBL-EBI, United Kingdom

Presentation Overview:

EMBL-EBI’s online learning platform, Train online, has a new look. Since its development in 2011, it has grown into a resource accessed by nearly 600,000 people per year globally. Containing over 80 courses, it provides training on EMBL-EBI data resources and tools, and on basic concepts in bioinformatics and data analysis.

Our aim in re-development was to provide increased interactivity and improved performance for our learners, along with an updated style. This enables online learning from EMBL-EBI to be more engaging and user-friendly, and ultimately to have greater learner impact.

Prior to redesign, we worked with trainees and authors to identify challenges to taking and writing a course. A number of platforms were considered and we ultimately decided on WordPress. This provides a clean, modern look for trainees and simple course development for trainers. Increased interactivity has been added using H5P, enabling quick creation of games, quizzes and other interactive content to support and assess learning.

Feedback on the first updated course has been extremely positive, indicating improved engagement and retention of trainees. It was also pivotal to our design revisions before finalising the new look.

Further development of Train online is planned, focusing on personalisation and collaboration in online learning.

12:00 PM-12:20 PM
HTrainDB: H3Africa Training Database
Format: Live-stream

  • Nicola Mulder, University of Cape Town, South Africa
  • Zahra Mungloo-Dilmohamud, University of Mauritius, Mauritius
  • Rolanda Julius, University of Cape Town, South Africa
  • Confidence Mothiba, University of Cape Town, South Africa
  • Jean-Michel Serufuri, Infoscope, Canada
  • Suresh Maslamoney, University of Cape Town, South Africa
  • Shakuntala Baichoo, University of Mauritius, Mauritius
  • Victoria Nambaware, University of Cape Town, South Africa
  • Michelle Skelton, University of Cape Town, South Africa

Presentation Overview:

One of the main aims of the H3Africa Consortium is to improve research capacity in genomics, bioinformatics and health in Africa. The Consortium organizes a number of training initiatives, in the form of workshops, seminars and related activities, with a view to strengthening this research capacity and training world-class trainers. Given the large number of training activities and participants, it is imperative to maintain a centralized record of all pertinent information. The HTrainDB database was created to monitor the training efforts and evaluate the effectiveness of this research capacity-building initiative. HTrainDB has several functionalities, including capturing several aspects of the consortium membership, registration for Consortium meetings, advertising and recording training activities, and creating meeting polls, survey questionnaires and webforms for various purposes. It also lists the publications of all members and highlights their career timelines. It provides a centralized system for keeping records related to the consortium membership, projects and activities. To this end, it can capture aggregate data and showcase capacity building in genomics research in Africa. HTrainDB has more than 600 members and can serve as an example for other research projects or consortia with large memberships of how to manage and archive research-related administrative data.

12:20 PM-12:40 PM
Staff Education to Accelerate the Cloud Adoption
Format: Pre-recorded with live Q&A

  • David Yuan, European Bioinformatics Institute, United Kingdom
  • Tony Wildish, European Bioinformatics Institute, United Kingdom
  • Chandra Deep Tiwari, EMBL-EBI, United Kingdom

Presentation Overview:

The European Bioinformatics Institute (EMBL-EBI) is part of EMBL. We have a large number of staff developing databases and tools to host, analyse and share data and analytic results openly in the life sciences. Training staff to adopt new technologies such as cloud computing is very challenging. We have defined a long-term strategy for adopting cloud infrastructure, so that scientists in Europe and around the world have better access to the diverse data stored, verified and visualised by EBI and can accelerate their research.


As a result, many research and service teams have launched projects in the cloud using their new skills. This team has been invited to deliver cloud workshops for EU projects, in person for EOSC-Life and remotely for BioExcel, with additional projects in the near future. The Cloud Roundtable forum has now been extended to include participants from the Wellcome Sanger Institute to foster closer cooperation in using cloud technologies, and a new joint project has been proposed. With this systematic staff education, we have taken our first step towards leading EBI and EU projects in adopting cloud technologies.

2:00 PM-2:20 PM
Format: Pre-recorded with live Q&A

  • Verena Ras, H3ABioNet; University of Cape Town, South Africa
  • Gerrit Botha, H3ABioNet; University of Cape Town, South Africa
  • Sumir Panji, H3ABioNet; University of Cape Town, South Africa
  • Shaun Aron, H3ABioNet; University of Witwatersrand, South Africa
  • Nicola Mulder, University of Cape Town, South Africa

Presentation Overview:

With more microbiome studies being conducted by African-based research groups, there is an increasing demand for knowledge in the design and analysis of microbiome studies and data, but the delivery of high-quality bioinformatics courses is often hampered by factors such as a lack of computational infrastructure and local expertise, among others. To address this need, H3ABioNet developed an intermediate 16S rRNA analysis course alongside experienced microbiome researchers, who identified key topics ranging from designing microbiome studies to more practical topics such as introductory high-performance computing, microbiome analysis pipelines and downstream analyses conducted in R.

Tools used in the course were packaged in Singularity containers to remove the overhead of installing individual tools, versions and libraries, and a separate container was created for downstream analysis of results using RStudio. All hosting classrooms pulled, ran and tested the containers, software and analyses on their various clusters before the start of the course. The pilot ran successfully in 2019 across 23 sites registered in 11 African countries, with more than 200 participants formally enrolled. It provides a model for delivering topic-specific bioinformatics courses across Africa that overcomes barriers such as unequal infrastructure, geographical distance, and limited access to expertise and educational materials.
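
As an illustration of the pre-course check described above (pulling a container and running a quick test on each cluster), the following is a minimal sketch and not the course's actual procedure. The image URI, file name and test command are hypothetical placeholders; only the `singularity pull` and `singularity exec` subcommands are assumed to be available on the host.

```python
import subprocess


def container_check_commands(image_uri, image_file, test_command):
    """Build the command lines for pulling an image and running a smoke test in it."""
    # singularity pull <output file> <URI>
    pull_cmd = ["singularity", "pull", image_file, image_uri]
    # singularity exec <container> <command...>
    exec_cmd = ["singularity", "exec", image_file, *test_command]
    return [pull_cmd, exec_cmd]


def run_checks(commands, dry_run=True):
    """Print the commands (dry run) or execute them, stopping at the first failure."""
    for cmd in commands:
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


# Hypothetical image location and test command, for illustration only.
commands = container_check_commands(
    "library://example/course/microbiome-16s:latest",
    "microbiome-16s.sif",
    ["Rscript", "--version"],
)
run_checks(commands)  # dry run by default; pass dry_run=False on a host with Singularity
```

Keeping the command construction separate from execution lets each hosting site review exactly what will run on its cluster before anything is executed.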

2:20 PM-2:40 PM
Format: Pre-recorded with live Q&A

  • Patricia Carvajal-Lopez, CABANA Project/EMBL-EBI/UABC, Mexico
  • Cath Brooksbank, EMBL-EBI, United Kingdom

Presentation Overview:

Bioinformatics education is essential for supporting R&D in Latin America (LatAm); consequently, it is imperative to understand its status and to support challenged audiences. The number of bioinformatics programs in LatAm is low compared with developed countries; ~16% of non-biotechnological undergraduate programs in life sciences offer bioinformatics. We present a project aimed at analysing the current status of bioinformatics education in LatAm and supporting training for undergraduate programs in life sciences. It is a secondment project supported by CABANA (www.cabana.online) that comprises 1) undergraduate training modules in Spanish and 2) status analysis and trainer support. Modules: Using the ISCB Competency Framework, LatAm trainers assigned high-level competency requirements to undergraduates in life sciences for basic and applied biology and for scientific computing skills. Accordingly, face-to-face and online training modules are being created; they will be peer-reviewed by CABANA partners and evaluated in test groups. Status and trainer support: A status analysis of bioinformatics programs is being performed jointly with LatAm researchers. The status analysis of undergraduate programs in life sciences samples one country, Mexico. A train-the-lecturer program is being created for support and knowledge exchange and to promote the creation of a community of practice for bioinformatics trainers and educators.

2:40 PM-3:00 PM
Applying best practices to enhance bioinformatics training in Switzerland
Format: Live-stream

  • Wandrille Duchemin, SIB Swiss Institute of Bioinformatics, Switzerland
  • Isabelle Dupanloup, SIB Swiss Institute of Bioinformatics, Switzerland
  • Diana Marek, SIB Swiss Institute of Bioinformatics, Switzerland
  • Grégoire Rossier, SIB Swiss Institute of Bioinformatics, Switzerland
  • Margaux Roulet, SIB Swiss Institute of Bioinformatics, Switzerland
  • Frédéric Schütz, SIB Swiss Institute of Bioinformatics, Switzerland
  • Monique Zahn, SIB Swiss Institute of Bioinformatics, Switzerland
  • Patricia M. Palagi, SIB Swiss Institute of Bioinformatics, Switzerland

Presentation Overview:

SIB Swiss Institute of Bioinformatics training courses are created by following a cycle of best practices in training, which have been defined by the trainers’ communities in ISCB, GOBLET, ELIXIR and the SIB Training group. Trainers from the SIB Training group are encouraged to attend Train the trainer courses. As a consequence, SIB courses are clearly defined with specific learning objectives, target audiences, prerequisites, and active learning activities. Course pages are described with Bioschemas specifications and metadata, which provide valuable information to enable trainees to assess whether the courses meet their needs and background knowledge, a step towards FAIR training. Quality and impact metrics, together with training needs, are continuously collected, providing indications as to whether courses and the annual program need to be adapted. The integration of these best practices, together with the extensive expertise in bioinformatics from the SIB scientists, has resulted in courses that are consistently well evaluated and appreciated by PhD students, postdocs, and their PIs.
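
To make the idea of machine-readable course pages concrete, here is a minimal sketch of course metadata expressed as schema.org-style JSON-LD. It is not SIB's actual markup: the property selection (`teaches`, `coursePrerequisites`, `audience`) is an assumption about which schema.org `Course` terms would carry learning objectives, prerequisites and target audience, and all values are hypothetical. The Bioschemas Course profile should be consulted for the required and recommended properties.

```python
import json


def course_jsonld(name, description, learning_objectives, prerequisites, audience):
    """Return a JSON-LD dictionary describing a course with schema.org terms."""
    return {
        "@context": "https://schema.org",
        "@type": "Course",
        "name": name,
        "description": description,
        # Assumed mapping: learning objectives -> teaches
        "teaches": list(learning_objectives),
        # Assumed mapping: prerequisites -> coursePrerequisites
        "coursePrerequisites": list(prerequisites),
        # Assumed mapping: target audience -> audience
        "audience": {"@type": "Audience", "audienceType": audience},
    }


# Hypothetical course, for illustration only.
course = course_jsonld(
    name="Introduction to sequence analysis",
    description="A two-day hands-on course.",
    learning_objectives=["Run a quality check on sequencing reads"],
    prerequisites=["Basic command-line skills"],
    audience="PhD students and postdocs",
)
print(json.dumps(course, indent=2))
```

Embedding such a block in a course page gives trainees and training registries the same objectives, prerequisites and audience information in a form that software can index.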

3:20 PM-3:40 PM
Data Science Training for Experimental Biology Graduate Students
Format: Pre-recorded with live Q&A

  • Christina Akirtava, Carnegie Mellon University, United States
  • Laura Ochs Pottermeyer, Carnegie Mellon University, United States
  • Emily Daniels Weiss, Carnegie Mellon University, United States
  • Russell Schwartz, Carnegie Mellon University, United States

Presentation Overview:

The life sciences education community has long recognized the need to provide the more rigorous quantitative and computational training essential to current research practice. Here we describe an effort to meet this need for a cohort of experimental life sciences graduate students assumed to have no prior biostatistics or bioinformatics training. We sought to provide a practical and accessible introduction via a course organized around analyzing real primary experimental data in a series of modules, each covering a distinct biological domain and data type. The course in part uses lecture material on biological problems and data sources as well as cross-cutting topics in bioinformatics and biostatistics. These are put into practice through hands-on analysis of primary data in R, both in class and in homework. Assessment suggests that the course produces modest gains in reasoning correctly about problems in biological data analysis and experimental design, particularly by increasing students' ability to draw on quantitative knowledge. Future work will aim to extend students' ability to tackle more complex coding tasks, incorporate new modules, and better develop course materials for export.

3:40 PM-4:00 PM
Embedding skills for a new profession by teaching programming in an immersive and authentic environment
Format: Pre-recorded with live Q&A

  • Frances Hooley, University of Manchester, United Kingdom
  • Peter Causey-Freeman, University of Manchester, United Kingdom

Presentation Overview:

Clinical Bioinformatics combines computer science with genomics in clinical practice. Trained clinical bioinformaticians are in short supply, necessitating creative and flexibly delivered education to fill the skills gap. Our Introduction to Programming unit launched in 2019 as part of a PG-Cert teaching the fundamentals of Clinical Bioinformatics to a diverse cohort of distance learners.

The unit simulated real-world experiences by building a situated learning environment that used agile project methods and authentic problem-solving activities to emulate clinical programming best practice. Clear instructions, signposting and support ensured comfort with course materials delivered using Jupyter Notebooks via GitHub. This use of industry-standard platforms and downloadable content also encouraged post-course lifelong learning.

The unit followed a social constructivist model geared to helping students learn individually and as a team. Synchronous online support from experienced facilitators helped encourage group-based peer-to-peer support, which afforded more time for educators to support struggling students.

By prioritising pedagogy over technology, the learning design resulted in incremental coding activities supporting a variety of learners. Methods such as Sprints provided real-world problem-based learning using real user-stories. The students directly contributed to the clinical bioinformatics toolkit by developing resources for personal practice or for co-development of the VariantValidator software, used in clinical practice worldwide.

4:00 PM-4:20 PM
An Introduction to Modern Computational Biology through Microbiome Research for High School Students
Format: Pre-recorded with live Q&A

  • Joshua Kangas, Carnegie Mellon University - Computational Biology Department, United States
  • Phillip Compeau, Carnegie Mellon University - Computational Biology Department, United States

Presentation Overview:

The Pre-College Program in Computational Biology (http://www.cbd.cmu.edu/education/pre-college-program-in-computational-biology/) is a three-week annual summer educational program in computational biology designed for high school students that launched in July 2019. The curriculum was oriented around the problem of understanding the microbiomes present in Pittsburgh’s three rivers. Students in this program collected water samples from multiple locations in the three rivers, performed wet-lab experiments to capture data from their samples, and then wrote algorithms to analyze the data that they generated, comparing the results against those from software used by current scientists. Students implemented algorithms for sequence alignment, genome assembly, gene prediction and annotation, image analysis, and machine learning-based microbiome analysis. The program demonstrated the vital interplay between experimental and computational biology to an audience of students who would likely not have had exposure to either subject. We reflect on our experience in the first year of this program and briefly discuss the results of our students' work, which led to two research manuscripts.

4:20 PM-4:40 PM
Introducing genome assembly to the general public through interactive word games
Format: Live-stream

  • Mihai Pop, University of Maryland, College Park, United States
  • Jacquelyn S Meisel, University of Maryland, College Park, United States
  • Victoria Cepeda Espinoza, University of Maryland, United States
  • Kiran Javkar, University of Maryland, United States
  • Dylan Taylor, University of Maryland, United States

Presentation Overview:

Reconstructing genomes from DNA sequencing reads - genome assembly - is a fundamental task in genomics that is the foundation for many downstream analyses. Genome assembly also reveals the power provided by the combination of biology and computer science - innovations in assembly algorithms were critical to the genomic revolution. To introduce these concepts to the general public and to illustrate computational thinking paradigms related to assembly algorithms, we developed a simple word game similar to magnetic poetry kits.

The game involves reconstructing repetitive phrases from fragments printed on magnets. The size of the fragments and complexity of the phrases can be varied to adjust the level of difficulty. Using a metal whiteboard as a backing for the game also creates the opportunity for introducing graph-based solutions to the genome assembly problem, while collaborative team teaching within a classroom setting also enables a discussion of parallel algorithms.

In our presentation, we will describe lesson plans built around this game and highlight our experiences in deploying them at Maryland Day (an open house event organized at the University of Maryland each spring) and within a summer camp aimed at introducing K-12 students to computer science.
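
The fragment-reconstruction game described above can be mimicked in a few lines of code. The sketch below is an assumption for illustration, not the presenters' lesson material: it uses a simple greedy strategy that repeatedly merges the two fragments with the longest word-level overlap, and the example phrase is made up.

```python
from itertools import permutations


def overlap(a, b):
    """Number of words in the longest suffix of a that is also a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(fragments):
    """Greedily merge word fragments by largest overlap until none overlap."""
    # Split into word tuples and drop exact duplicates, keeping order.
    frags = list(dict.fromkeys(tuple(f.split()) for f in fragments))
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, j in permutations(range(len(frags)), 2):
            k = overlap(frags[i], frags[j])
            if k > best_k:
                best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps: leave the pieces as separate "contigs"
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return [" ".join(f) for f in frags]


# Made-up magnets, for illustration only.
magnets = [
    "the quick brown",
    "brown fox jumps",
    "jumps over the lazy dog",
    "quick brown fox",
]
print(greedy_assemble(magnets))
```

With phrases that contain repeats, the greedy choice can join the wrong pair of fragments, which is one way to motivate the graph-based formulations of assembly mentioned in the abstract.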

5:00 PM-5:20 PM
Guidelines for curriculum and course development in higher education and training
Format: Live-stream

  • Rochelle E. Tractenberg, Georgetown University and the Collaborative for Research on Outcomes and -Metrics, United States
  • Jessica M Lindvall, National Bioinformatics Infrastructure Sweden; Department of Biochemistry and Biophysics, Stockholm University, Sweden
  • Terri Attwood, The University of Manchester, United Kingdom
  • Allegra Via, CNR, The National Research Council of Italy, Italy

Presentation Overview:

Background: Curriculum and instructional development should follow a formal process. Although the focus in formal curriculum theory is on long-term programs of study, the process is also applicable to shorter-form Learning Experiences (LEs) (single courses, lessons, or training sessions). Successful curricula and instruction support learners as they develop from entry-level performance to the minimum qualification for completing a program or course, articulated in terms of Learning Outcomes (LOs). These considerations have been encapsulated in an iterative model of curriculum and instructional design, with guidelines for its use.

Output and conclusion: The starting point is the articulation of target LOs: everything follows from these, including the selection of LEs and content, the development of assessments, and the evaluation of the resulting curriculum/instruction. The iterative process can be used in curriculum and instructional development, and provides a set of practical guidelines for curriculum and course preparation. The essential features of effective curriculum and instruction (i.e., curriculum and instruction that achieve their stated LOs for the majority of learners) are presented here, to offer practical guidance and support for devising and evaluating both short- and long-form teaching.

5:20 PM-6:00 PM
COSI Education Keynote Talk: Online Data Science Education and its effect on my classroom teaching
Format: Live-stream

  • Rafael Irizarry, Professor and Chair of the Department of Data Science at Dana-Farber Cancer Institute and Professor of Applied Statistics at Harvard, United States

Presentation Overview:

Educational institutions across the world are responding to the unprecedented demand for training in statistics and data science by creating new courses, curricula and degrees in applied statistics and data science. We have participated in two data science courses taught at Harvard and in the creation of an online course on data analysis for the life sciences and a Data Science series composed of 9 courses. In this presentation, I will first try to define data science and explain what aspects of it I teach. Then I will discuss our approach to developing a MOOC based almost exclusively on real-world examples and how our lectures revolved around dozens of exercises that required R programming to answer.