JASPAR-Suite: An open toolkit for accessing TF binding motifs
Confirmed Presenter: Aziz Khan, Computational Biology Department, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- Aziz Khan, Computational Biology Department, Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), Abu Dhabi, United Arab Emirates
- Anthony Mathelier, Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, Oslo 0318, Norway
Presentation Overview:
The JASPAR database (https://jaspar.elixir.no) is a widely used open-access resource of manually curated, non-redundant transcription factor (TF) binding profiles across multiple species, supporting the global community of gene regulation researchers. As the field of regulatory genomics grows increasingly data-driven, JASPAR plays a vital role in providing high-quality position frequency matrices (PFMs) for TFs, enabling insights into gene expression regulation, enhancer activity, and transcriptional networks. The database can be accessed through several user-friendly and programmatic interfaces, including a web interface for intuitive exploration, a RESTful API for cross-platform integration, a Bioconductor package for R users, and pyJASPAR, a flexible and Pythonic toolkit for both interactive and command-line access to TF motifs.
In this talk, we will demonstrate how JASPAR can be accessed through its RESTful API (https://jaspar.elixir.no/api/) from any programming environment, allowing seamless integration into bioinformatics workflows. We will also introduce pyJASPAR (https://github.com/asntech/pyjaspar), a lightweight Python package we developed to make JASPAR motif queries easy, scriptable, and reproducible, whether from a Jupyter notebook or a shell terminal. Together, these tools form the JASPAR Suite, designed to give the scientific community open, reproducible, and interoperable access to TF binding motifs. All code, data, and workflows are available under open licenses, supporting transparency and reproducibility in computational biology research.
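As a rough illustration of the programmatic access described above, the following Python sketch builds a request URL for a single matrix profile and converts a position frequency matrix into per-position probabilities. The `/v1/matrix/<id>/` path, the `format=json` parameter, and the `pfm` field of the response are assumptions about the API rather than details given in the abstract, and the counts in the toy example are made up. The API documentation remains the authority on the actual endpoints.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_ROOT = "https://jaspar.elixir.no/api/"  # base URL quoted in the abstract


def matrix_url(matrix_id, version="v1"):
    """Build the URL for one matrix profile (path layout is an assumption)."""
    query = urlencode({"format": "json"})
    return f"{API_ROOT}{version}/matrix/{matrix_id}/?{query}"


def fetch_pfm(matrix_id):
    """Download a profile and return its PFM (assumes a JSON 'pfm' field).

    Requires network access and a valid matrix ID.
    """
    with urlopen(matrix_url(matrix_id), timeout=30) as response:
        return json.load(response)["pfm"]


def pfm_to_ppm(pfm):
    """Normalize counts position by position into probabilities."""
    bases = sorted(pfm)
    length = len(pfm[bases[0]])
    ppm = {base: [] for base in bases}
    for i in range(length):
        total = sum(pfm[base][i] for base in bases)
        for base in bases:
            # Guard against empty columns to avoid division by zero
            ppm[base].append(pfm[base][i] / total if total else 0.0)
    return ppm


# Made-up counts for a 3-position motif, usable without network access
toy_pfm = {"A": [8, 0, 2], "C": [0, 10, 2], "G": [2, 0, 4], "T": [0, 0, 2]}
toy_ppm = pfm_to_ppm(toy_pfm)
```

Keeping the normalization separate from the download step means the same function can be applied to matrices obtained by any of the access routes mentioned in the abstract.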
VueGen: automating the generation of scientific reports
Confirmed Presenter: Sebastian Ayala-Ruano, Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Denmark
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- Sebastian Ayala-Ruano, Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Denmark
- Henry Webel, Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Denmark
- Alberto Santos Delgado, Novo Nordisk Foundation Center for Biosustainability, Denmark
Presentation Overview:
The analysis of omics data typically involves multiple bioinformatics tools and methods, each producing distinct output files. However, compiling these results into comprehensive reports often requires additional effort and technical skills. This creates a barrier for non-bioinformaticians, limiting their ability to produce reports from their findings. Moreover, the lack of streamlined reporting workflows impacts reproducibility and transparency, making it difficult to communicate results and track analytical processes.
Here, we present VueGen, an open-source tool that addresses the limitations of current reporting tools by automating report generation from bioinformatics outputs, allowing researchers with minimal coding experience to communicate their results effectively. With VueGen, users can produce a report by specifying the report format and a directory containing output files, such as plots, tables, networks, Markdown text, and HTML components. Supported formats include documents (PDF, HTML, DOCX, ODT), presentations (PPTX, Reveal.js), Jupyter notebooks, and Streamlit web applications. To showcase VueGen’s functionality, we present two case studies and provide detailed documentation to help users generate customized reports.
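The directory-driven approach can be pictured with a small Python sketch that walks an output folder and groups files into report sections by subfolder and file type. This is not VueGen's implementation or API: the function name, the extension-to-component mapping, and the section layout are all hypothetical choices made for illustration.

```python
from pathlib import Path

# Hypothetical mapping from file extension to report component type
COMPONENT_TYPES = {
    ".png": "plot", ".jpg": "plot", ".svg": "plot",
    ".csv": "table", ".tsv": "table",
    ".md": "markdown", ".html": "html",
}


def outline_report(directory):
    """Return {section: [(component_type, file_name), ...]} for a results folder.

    Each immediate subfolder becomes a section; files directly in the
    top-level folder go into a section named after the folder itself.
    Files with unrecognized extensions are skipped.
    """
    root = Path(directory)
    outline = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        kind = COMPONENT_TYPES.get(path.suffix.lower())
        if kind is None:
            continue
        relative = path.relative_to(root)
        section = relative.parts[0] if len(relative.parts) > 1 else root.name
        outline.setdefault(section, []).append((kind, path.name))
    return outline
```

An outline of this kind could then be handed to a renderer for whichever output format is requested, so that the folder structure alone determines the structure of the report.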
VueGen was designed with accessibility and community contribution in mind, offering multiple implementation options for users with varying technical expertise. It is available as a Python package, a portable Docker image, and an nf-core module, leveraging established open-source ecosystems to facilitate integration and reproducibility. In addition, a cross-platform desktop application for macOS and Windows provides a user-friendly interface for users less familiar with command-line tools. The source code is freely available at https://github.com/Multiomics-Analytics-Group/vuegen, and documentation is provided at https://vuegen.readthedocs.io/.
The world’s biomedical knowledge in less than a gram: introducing the PGP incubator
Room: 03A
Format: In person
Moderator(s): Karsten Hokamp
Authors List:
- Peter Amstutz, Curii Corporation, United States
- Sarah Zaranek, Curii Corporation, United States
- Alexander Sasha Wait Zaranek, Curii Corporation, United States
- Zoe Ma, Curii Corporation, United States
Presentation Overview:
In this talk, we describe a new project, the Personal Genome Project incubator (PGPincubator). The PGPincubator is an effort to create a distribution of open data, tools, workflows, AI models, and learning materials that support validation, benchmarking, and education in bioinformatics and biomedicine for precision health and (pre-clinical) biomedical AI. In addition, the incubator is a distributed network of physical computing infrastructure used to test components included in the distribution, for example by validating genomics workflows or benchmarking AI models.
To help hatch this network, PGPincubator is running a private network of “h-grams.” An h-gram consists of one to four microSD cards (three or four weigh about a gram!), each flashed with an operating system image that can be booted on compatible commodity PC hardware. The operating system (Ubuntu) is preconfigured to act as a server suitable for home, office, or lab use and is accessed from other devices through a browser. Each h-gram is pre-loaded with hundreds of gigabytes of openly licensed infrastructure software, bioinformatics tools, genomic datasets, AI models, and learning resources.
The PGPincubator data and software distribution pre-loaded on the h-gram will be updated on a 6-month release schedule, inspired by Linux distribution releases. With both software and datasets distributed in versioned releases, it becomes far easier for researchers to identify precisely which software and data were used in their work, for others to reproduce that work, and for students to study it, while ensuring that validation and benchmarking are performed fairly against a common baseline.