Education COSI

Schedule subject to change
All times listed are in CDT
Tuesday, July 12th
10:30-11:10
Keynote Presentation: Thoughts and experiences of organising and delivering bioinformatics training in low- and middle-income countries
Room: GJ
Format: Live-stream

Moderator(s): Sarah Morgan

  • Benjamin Moore, EMBL-EBI, United Kingdom


Presentation Overview:

There are abundant opportunities for accessing bioinformatics training in high-income countries (HICs), where facilities are well equipped, participants and trainers incur minimal travel and financial costs, and a range of general advice is available for developing short bioinformatics training courses. However, because resources and bioinformatics expertise are generally limited in low- and middle-income countries (LMICs), regionally targeted bioinformatics training there often requires more extensive local and external support, organization, and travel.

Recently, there has been a growth of training capacity strengthening initiatives in LMICs, such as the Pan African Bioinformatics Network for Human Heredity and Health in Africa (H3ABioNet) Initiative, the Capacity Building for Bioinformatics in Latin America (CABANA) Project, the Asia Pacific BioInformatics Network (APBioNet), and the Wellcome Connecting Science Courses and Conferences program. One of the important strands of these initiatives is a drive to organize and deliver valuable bioinformatics training in LMICs, but this presents a unique set of challenges.

Through delivering training and collaborating with several capacity strengthening initiatives over a number of years, I have accrued a range of experience in organizing and delivering bioinformatics workshops in LMICs. In this talk, I will share the key thoughts and experiences that have shaped my approach to organizing courses in LMICs, taking into consideration the unique challenges and opportunities of low-resource settings.

11:10-11:30
Proceedings Presentation: An approachable, flexible, and practical machine learning workshop for biologists
Room: GJ
Format: Live from venue

Moderator(s): Sarah Morgan

  • Fangzhou Mu, University of Wisconsin-Madison, United States
  • Rosemary Russ, University of Wisconsin-Madison, United States
  • Milica Cvetkovic, University of Wisconsin-Madison, United States
  • Debora Treu, University of Wisconsin-Madison, United States
  • Anthony Gitter, University of Wisconsin-Madison, United States
  • Christopher Magnano, University of Wisconsin-Madison, United States


Presentation Overview:

The increasing prevalence and importance of machine learning in biological research has created a need for machine learning training resources tailored towards biological researchers.
However, existing resources are often inaccessible, infeasible, or inappropriate for biologists because they require significant computational and mathematical knowledge, demand an unrealistic time-investment, or teach skills primarily for computational researchers.
We created the Machine Learning for Biologists (ML4Bio) workshop, a short, intensive workshop that empowers biological researchers to comprehend machine learning applications and pursue machine learning collaborations in their own research.
The ML4Bio workshop focuses on classification and was designed around 3 principles: (a) focusing on preparedness over fluency or expertise, (b) necessitating minimal coding and mathematical background, and (c) requiring low time investment.
It incorporates active learning methods and custom open source software that allows participants to explore machine learning workflows.
After multiple sessions to improve workshop design, we performed a study on 3 workshop sessions.
Despite some confusion around identifying subtle methodological flaws in machine learning workflows, participants generally reported that the workshop met their goals, provided them with valuable skills and knowledge, and greatly increased their beliefs that they could engage in research that uses machine learning.
ML4Bio is an educational tool for biological researchers, and its creation and evaluation provide valuable insight into tailoring educational resources for active researchers in different domains.

11:30-11:50
The EMBL-EBI Competency Hub: a tool to support training design and professional development
Room: GJ
Format: Live from venue

Moderator(s): Sarah Morgan

  • Marta Lloret-Llinares, EMBL-EBI, United Kingdom
  • Adam Broadbent, EMBL-EBI, United Kingdom
  • Cath Brooksbank, EMBL-EBI, United Kingdom
  • Nikiforos Karamanis, EMBL-EBI, United Kingdom
  • Vera Matser, EMBL-EBI, United Kingdom
  • Joseph Rossetto, EMBL-EBI, United Kingdom
  • Mahfouz Shehu, EMBL-EBI, United Kingdom
  • Prakash Singh Gaur, EMBL-EBI, United Kingdom


Presentation Overview:

Professions within the life sciences domain are evolving rapidly with the adoption of new methods and technologies, which requires researchers and other professionals to incorporate new skills to stay at the forefront of the latest developments in their discipline.

At EMBL-EBI we have built a tool to support continuous professional development by facilitating the identification of training needs and access to relevant training resources. The Competency Hub hosts competency frameworks for different groups of professionals, including the ISCB framework for students and professionals in computational biology. It enables the definition of career profiles, which help identify the abilities required for a specific role, e.g. bioinformatician, and therefore inform career choices and professional development.

The tool allows the association of training resources with the competencies, so that users can find where to start learning. It includes learning pathways, curated sets of resources to address a specific challenge within a field or community, e.g. how to access high performance computing resources to run simulations.

The Competency Hub has been developed in close consultation with competency experts and by gathering input from target users through an iterative user experience design approach to make sure that it meets their needs.

11:50-12:10
GL4U: Bioinformatics training for students and educators using space omics data
Room: GJ
Format: Live-stream

Moderator(s): Sarah Morgan

  • Amanda M. Saravia-Butler, KBR, NASA Ames Research Center, Moffett Field, CA 94035, United States
  • Lauren M. Sanders, Blue Marble Space Institute of Science, Seattle, WA 98104, United States
  • Sigrid S. Reinsch, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, United States
  • Steven Boring, Department of Computer Science, San Jose State University, San Jose, CA 95129, United States
  • Saba Hussain, USRA, NASA Ames Research Center, Moffett Field, CA 94035, United States
  • Samrawit G. Gebre, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, United States
  • Arman Seuylemezian, JPL Planetary Protection Center of Excellence, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
  • Lisa Guan, JPL Planetary Protection Center of Excellence, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
  • Alvin L. Smith, JPL Planetary Protection Center of Excellence, 4800 Oak Grove Drive, Pasadena, CA 91109, United States
  • Parag Vaishampayan, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, United States
  • Philip Heller, Department of Computer Science, San Jose State University, San Jose, CA 95129, United States
  • Sylvain V. Costes, Space Biosciences Division, NASA Ames Research Center, Moffett Field, CA 94035, United States


Presentation Overview:

NASA’s GeneLab project provides researchers open access to space-relevant experiment multi-omics data that can be mined to understand the effects of spaceflight on biological systems. To maximize the number of scientists who understand and utilize GeneLab data and data processing pipelines, GeneLab has created GeneLab for Colleges and Universities (GL4U).

GL4U provides space biology-relevant training in bioinformatics to the next generation of scientists through direct and indirect approaches. The GeneLab team plans to host two annual data processing bootcamps, one for college-level students (direct) and one for college educators (indirect – training of trainers), in which participants learn to analyze GeneLab’s space-relevant omics data.

The GL4U direct training pilot program was conducted in June 2021. During the pilot, students participated in a week-long bootcamp consisting of space biology-specific lectures and hands-on instruction using Jupyter Notebooks to analyze RNA sequence data. This pilot demonstrated the capacity of GL4U for training young scientists and encouraging data re-use. During the educator pilot, scheduled for June 2022, educators will receive materials and training to enable them to run the bootcamp at their home institutions or alternatively to adapt the content to implement within existing courses, thereby extending the reach of this initiative.

12:10-12:30
Cultivating a data-driven computational culture within biomedical institutions by empowering graduate students with code-based data science skills
Room: GJ
Format: Live from venue

Moderator(s): Sarah Morgan

  • Cynthia Ronkowski, University of Southern California, United States
  • Kerui Peng, University of Southern California, United States
  • Dottie Yu, University of Southern California, United States
  • Ram Ayyala, University of Southern California, United States
  • Sergey Knyazev, University of California, Los Angeles, United States
  • Maryann Wu, University of Southern California, United States
  • Ian Haworth, University of Southern California, United States
  • Jennica Zaro, University of Southern California, United States
  • Serghei Mangul, University of Southern California, United States


Presentation Overview:

To address an emerging disparity between biomedical researchers’ lack of computational prowess and the rising importance of massive omics datasets in translational and clinical research, we created a 2-unit graduate-level course to teach reproducible data science to biomedical graduate students at the University of Southern California. In our course, we emphasized data analysis, data visualization, and open data science using code-based assignments written within a free online Jupyter notebook environment. This framework enabled us to combine the benefits of literate programming and cloud computing to promote scientific transparency in a manner accessible to anyone with an internet connection. Furthermore, we implemented simplified explanations of programming fundamentals to gradually ease students into using Python commands, empowering them with the foundational knowledge necessary to troubleshoot their own code. We measured the effectiveness of this strategy using an anonymous student survey conducted during the first and final lectures. Our findings show greater self-reported confidence, increased comprehension of the course material, and an agreement that our course was highly impactful. Based on these results, we conclude that our course framework and teaching approach represent an effective strategy for introducing the fundamentals of modern biomedical data science to students with minimal prior computer science background.

14:30-15:30
The Bioinformatics Education Summit 2022
Room: GJ
Format: Live-stream

Moderator(s): Michelle Brazas

  • Michelle Brazas, Ontario Institute for Cancer Research, Canada


Presentation Overview:

Members of the global bioinformatics education and training community hold an annual Education Summit to move forward a variety of education and training activities. This year’s virtual 3-day summit was hosted by the Asia Pacific Bioinformatics Network (APBioNet) and was attended by 90+ people from 35 countries. Presented here is a summary of each working group’s activities and outputs from the 2022 Education Summit.

16:00-16:20
Proceedings Presentation: Characterizing domain-specific open educational resources by linking ISCB Communities of Special Interest to Wikipedia
Room: GJ
Format: Live from venue

Moderator(s): Russell Schwartz

  • Alastair M. Kilpatrick, Centre for Regenerative Medicine, University of Edinburgh, United Kingdom
  • Farzana Rahman, School of Mathematics, Computer Science and Engineering, City, University of London, United Kingdom
  • Audra Anjum, Office of Instructional Innovation, Ohio University, United States
  • Sayane Shome, Department of Anesthesiology, Perioperative and Pain Medicine, Stanford School of Medicine, Stanford University, United States
  • K.M. Salim Andalib, Biotechnology and Genetic Engineering Discipline, Khulna University, Khulna, Bangladesh
  • Shrabonti Banik, Faculty of Veterinary, Animal and Biomedical Sciences, Sylhet Agricultural University, Sylhet, Bangladesh
  • Sanjana F. Chowdhury, Department of Genetic Engineering and Biotechnology, Shahjalal University of Science and Technology, Sylhet, Bangladesh
  • Peter Coombe, Wikipedia volunteer, United Kingdom
  • Yesid Cuesta Astroz, Colombian Institute of Tropical Medicine, CES University, Medellín, Colombia
  • J. Maxwell Douglas, Department of Molecular Oncology, BC Cancer Agency, Vancouver, BC, Canada
  • Pradeep Eranti, UMRS-1124, INSERM, Université de Paris, Paris, France
  • Aleyna D. Kıran, Department of Bioengineering, Ege University, Turkey
  • Sachendra Kumar, IISc Mathematics Initiative, Indian Institute of Science, Bengaluru, India
  • Hyeri Lim, Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University, Kyoto, Japan
  • Valentina Lorenzi, Wellcome Sanger Institute, Hinxton, Cambridge, UK; European Bioinformatics Institute (EMBL-EBI), Hinxton, United Kingdom
  • Tiago Lubiana, School of Pharmaceutical Sciences, University of São Paulo, São Paulo, Brazil
  • Sakib Mahmud, Biotechnology and Genetic Engineering Discipline, Khulna University, Khulna, Bangladesh
  • Rafael Puche, Genetics and Forensic Studies Unit (UEGF), Venezuelan Institute of Scientific Research (IVIC), Venezuela
  • Agnieszka Rybarczyk, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  • Syed Muktadir Al Sium, Institute of Epidemiology, Disease Control And Research, Dhaka, Bangladesh
  • David Twesigomwe, Sydney Brenner Institute for Molecular Bioscience (SBIMB), University of the Witwatersrand, Johannesburg, South Africa
  • Tomasz Zok, Institute of Computing Science, Poznan University of Technology, Poznan, Poland
  • Christine A. Orengo, Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, United Kingdom
  • Iddo Friedberg, Program in Bioinformatics and Computational Biology, Iowa State University, Ames, IA, United States
  • Janet F. Kelso, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
  • Lonnie Welch, School of Electrical Engineering and Computer Science, Ohio University, United States


Presentation Overview:

Motivation: Wikipedia is one of the most important channels for the public communication of science and is frequently accessed as an educational resource in computational biology. Joint efforts between the International Society for Computational Biology (ISCB) and the Computational Biology taskforce of WikiProject Molecular Biology (a group of expert Wikipedia editors) have considerably improved computational biology representation on Wikipedia in recent years. However, there is still an urgent need for further improvement in quality, especially when compared to related scientific fields such as genetics and medicine. Facilitating involvement of members from ISCB Communities of Special Interest (COSIs) would improve a vital open education resource in computational biology, additionally allowing COSIs to provide a quality educational resource highly specific to their subfield.

Results: We generate a list of around 1,500 English Wikipedia articles relating to computational biology and describe the development of a binary COSI-Article matrix, linking COSIs to relevant articles and thereby defining domain-specific open educational resources. Our analysis of the COSI-Article matrix data provides a quantitative assessment of computational biology representation on Wikipedia against other fields and at a COSI-specific level. Furthermore, we conduct similarity analysis and subsequent clustering of COSI-Article data to provide insight into potential relationships between COSIs. Finally, based on our analysis, we suggest courses of action to improve the quality of computational biology representation on Wikipedia, enhancing this educational resource for all parties.
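
For readers unfamiliar with the idea, the following is a minimal sketch of what a binary COSI-Article matrix and a pairwise similarity analysis could look like. The COSI names, article titles, and matrix entries are hypothetical placeholders, and the Jaccard index is an assumed similarity measure chosen for illustration; the abstract does not state which measure or clustering method the authors used.

```python
from itertools import combinations

# Hypothetical COSI labels and article titles, for illustration only.
cosis = ["COSI_A", "COSI_B", "COSI_C"]
articles = [
    "Sequence alignment",
    "Protein structure prediction",
    "Gene ontology",
    "Phylogenetic tree",
]

# Binary COSI-Article matrix: matrix[i][j] is 1 when article j is
# judged relevant to COSI i, and 0 otherwise (made-up values).
matrix = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 1, 1],
]


def jaccard(row_a, row_b):
    """Jaccard index of two binary rows: shared articles / articles in either."""
    both = sum(1 for a, b in zip(row_a, row_b) if a and b)
    either = sum(1 for a, b in zip(row_a, row_b) if a or b)
    return both / either if either else 0.0


# Pairwise similarity between COSIs, keyed by the pair of COSI names.
similarity = {
    (cosis[i], cosis[j]): jaccard(matrix[i], matrix[j])
    for i, j in combinations(range(len(cosis)), 2)
}

for pair, score in sorted(similarity.items()):
    print(pair, round(score, 3))
```

A pairwise similarity table of this kind could then be passed to a clustering routine to group COSIs that share many relevant articles.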

16:30-16:40
A blended approach to supporting learners through online bioinformatics training
Room: GJ
Format: Live from venue

Moderator(s): Russell Schwartz

  • Anna Swan, EMBL-EBI, United Kingdom
  • Ajay Mishra, EMBL-EBI, United Kingdom
  • Alexandra Holinski, EMBL-EBI, United Kingdom
  • Dayane Rodrigues Araujo, EMBL-EBI, United Kingdom
  • Sarah Morgan, EMBL-EBI, United Kingdom


Presentation Overview:

EMBL-EBI provides a range of freely accessible online training, including self-paced tutorials, recorded webinars and materials from live courses.

Feedback from users identified that, particularly for those new to bioinformatics, it can be challenging to identify which of the many online training options are most suitable for them. To assist learners, a collection of introductory online training from EMBL-EBI was developed. This self-paced collection includes online tutorials and videos introducing the topics of bioinformatics and data management, as well as EMBL-EBI resources.

To support learners in the completion of the collection, a blended approach to learning was developed. Asynchronously, the collection encourages learners to provide their own input, and learn from others, by giving their own answers to questions such as ‘what is bioinformatics?’. Two synchronous question and answer webinars were then delivered on the topics of ‘genes and gene expression’ and ‘proteins and structures’. These webinars gave learners the opportunity to put questions to a panel of EMBL-EBI resource experts.

To expand this blended approach, new collections have recently been released on the topics of chemical biology, biocuration, and finding and using publicly available data, with more collections currently in development, and future question and answer webinars planned.

16:40-17:20
Inclusive training in computational biology
Room: GJ
Format: Live from venue

Moderator(s): Russell Schwartz

  • Charla Lambert, Cold Spring Harbor Laboratory/SACNAS, United States


Presentation Overview:

To broaden participation in computational biology, both the pedagogy and environments we use to train future researchers must be made more accessible and inclusive. This talk will be a compendium of stories about inclusive computational biology training through the Cold Spring Harbor Laboratory Meetings & Courses Program and similar advanced, short-form training opportunities, as well as from members of SACNAS, the Society for Advancement of Chicanos/Hispanics & Native Americans in Science. The goal is to spark discussion among researchers who are also dedicated educators as to what they might do in their own teaching and training to help ensure computational biology is a broadly accessible and inclusive discipline.

17:20-18:00
Panel: Inclusiveness in Bioinformatics Education
Room: GJ
Format: Live from venue

Moderator(s): Russell Schwartz


Presentation Overview:

This session is a panel discussion between panelists and the Education COSI audience on the topic of...

Wednesday, July 13th
10:50-11:50
Keynote Presentation: Riding the Bicycle: Including all Scientists on a Path to Excellence
Room: Lecture Hall
Format: Live from venue

  • Jason Williams, Cold Spring Harbor Laboratory, United States


Presentation Overview:

Life science is rapidly increasing in interdisciplinarity, making career-spanning learning critical....