Combined MS and MS/MS deconvolution of SWATH DIA data with the DIA-NMF software for comprehensive annotation in metabolomics
Confirmed Presenter: Diana Karaki, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
Format: In person
Moderator(s): Wout Bittremieux
Authors List:
- Diana Karaki, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
- Annelaure Damont, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
- Antoine Souloumiac, CEA-List, Université Paris-Saclay, 91120 Palaiseau, France
- Francois Fenaille, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
- Etienne Thevenot, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
- Sylvain Dechaumet, Département Médicaments et Technologies pour la Santé (DMTS), MetaboHUB, Université Paris-Saclay, CEA, INRAE, France
Presentation Overview:
Data-independent acquisition (DIA), particularly Sequential Window Acquisition of All Theoretical Mass Spectra (SWATH-MS), is gaining momentum in untargeted metabolomics due to its ability to fragment all detected ions within large consecutive isolation windows in a single run. The main challenge lies in processing the resulting hybrid fragmentation data and extracting pure MS/MS spectra based on the similarity of retention time profiles from precursors and their fragment ions. We recently demonstrated the value of a Non-Negative Matrix Factorization (NMF) approach for DIA deconvolution, compared to existing peak modeling methods such as MS-DIAL and DecoMetDIA.
Here, we extended our strategy in two ways. First, we deconvolved MS and MS/MS DIA data simultaneously. This is not only more rigorous, since fragment ions of distinct ion species from the same molecule often share retention time profiles, but also more efficient, as pure MS1 spectra are now provided. Second, we redesigned the deconvolution strategy to extract all pure components from each retention time window in a single step, reducing redundancy and decreasing computation time. Post-processing quality filters were also included to discard weak or redundant components by analyzing their contribution to MS1 signals.
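The idea of factorizing MS and MS/MS signals together can be illustrated with a minimal sketch. It is not the DIA-NMF implementation: it assumes a generic NMF with multiplicative updates, applied to a small synthetic matrix whose rows are scans and whose columns are MS1 and MS2 ion traces placed side by side. All function names, the synthetic profiles and the intensities are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def frobenius_error(x, w, h):
    """Frobenius norm of the residual X - WH."""
    wh = matmul(w, h)
    return math.sqrt(sum((xv - av) ** 2
                         for xr, ar in zip(x, wh) for xv, av in zip(xr, ar)))

def nmf(x, rank, n_iter=300, seed=0, eps=1e-9):
    """Factorize X (scans x ions) into W (scans x rank) and H (rank x ions).

    Columns of W play the role of elution profiles, rows of H the role of
    pure spectra covering both the MS1 and the MS2 columns of X.
    """
    rng = random.Random(seed)
    w = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(len(x))]
    h = [[rng.uniform(0.1, 1.0) for _ in range(len(x[0]))] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, x), matmul(matmul(wt, w), h)
        h = [[hv * n / (d + eps) for hv, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(h, num, den)]
        # W <- W * (X H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(x, ht), matmul(w, matmul(h, ht))
        w = [[wv * n / (d + eps) for wv, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(w, num, den)]
    return w, h

def gaussian_profile(n_scans, apex, width):
    """Synthetic elution profile (hypothetical peak shape)."""
    return [math.exp(-0.5 * ((t - apex) / width) ** 2) for t in range(n_scans)]

# Two partially co-eluting compounds over 12 scans.
profiles = [gaussian_profile(12, 4, 1.2), gaussian_profile(12, 7, 1.2)]
# Columns: [MS1 ion A, MS1 ion B, MS2 fragment 1, 2, 3, 4]; fragment 2 is shared.
spectra = [[100, 0, 60, 30, 0, 0],
           [0, 80, 0, 20, 50, 10]]
x = matmul(transpose(profiles), spectra)

w, h = nmf(x, rank=2)
```

Because the MS1 and MS2 columns share the same elution profiles in `w`, each row of `h` groups a precursor trace with its fragments. Component order and scaling are arbitrary in NMF, so rows of `h` would need normalization before comparison with a library spectrum.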
In SWATH-DIA mode, we applied the DIA-NMF software to human plasma samples spiked with 47 chemical compounds at eight known concentrations (0–10 ng/mL). DIA-NMF identified more spiked compounds than MS-DIAL and DecoMetDIA at all concentrations. It also achieved higher reverse dot-product scores, indicating a better grouping of relevant fragments. These results highlight the value of the DIA-NMF method and software for integrated metabolomics workflows.
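As an illustration of why a reverse dot-product reflects fragment grouping, the sketch below scores only the peaks present in the reference spectrum, so extra peaks in the deconvolved spectrum are not penalized while missing reference fragments are. It assumes unweighted intensities and a fixed m/z tolerance; the scoring details used by the tools named above may differ, and the function name and peak lists are hypothetical.

```python
import math

def reverse_dot_product(query, reference, tol=0.01):
    """Cosine similarity restricted to the peaks of the reference spectrum.

    query, reference: lists of (mz, intensity) pairs.
    tol: absolute m/z tolerance for matching peaks (assumed value).
    """
    q_int, r_int = [], []
    for r_mz, r_i in reference:
        # Most intense query peak within tolerance, or 0 if the fragment is missing.
        matches = [i for mz, i in query if abs(mz - r_mz) <= tol]
        q_int.append(max(matches, default=0.0))
        r_int.append(r_i)
    num = sum(q * r for q, r in zip(q_int, r_int))
    den = math.sqrt(sum(q * q for q in q_int)) * math.sqrt(sum(r * r for r in r_int))
    return num / den if den else 0.0

reference = [(100.0, 50.0), (150.0, 100.0)]
# All reference fragments recovered, plus one contaminating peak at m/z 120.
contaminated = [(100.0, 50.0), (120.0, 80.0), (150.0, 100.0)]
# One reference fragment lost during deconvolution.
incomplete = [(100.0, 50.0)]

score_contaminated = reverse_dot_product(contaminated, reference)
score_incomplete = reverse_dot_product(incomplete, reference)
```

In this toy case the contaminated spectrum still scores close to 1, whereas the incomplete one scores about 0.45, which is the behavior that makes the reverse score sensitive to whether the relevant fragments were grouped together.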
Towards mzTab-M 2.1 - Evolving the HUPO-PSI standard format for reporting of small molecule mass spectrometry results
Confirmed Presenter: Nils Hoffmann, IBG-5, Forschungszentrum Jülich, Jülich, Germany
Format: In person
Moderator(s): Wout Bittremieux
Authors List:
- Nils Hoffmann, IBG-5, Forschungszentrum Jülich, Jülich, Germany
- Bo Burla, Singapore Lipidomics Incubator, Life Sciences Institute, National University of Singapore, Singapore
- Yasin El Abiead, Skaggs School of Pharmacy and Pharmaceutical Sciences, UCSD, San Diego, United States
- Janik Kokot, Institute of Human Genetics, Medical University of Innsbruck, Innsbruck, Austria
- Philippine Louail, Institute for Biomedicine, Eurac Research, Bolzano, Italy
- Steffen Neumann, Leibniz Institute of Plant Biochemistry, Halle, Germany
- Kozo Nishida, RIKEN Center for Biosystems Dynamics Research, Kobe, Japan
- Thomas Payne, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, United Kingdom
- Johannes Rainer, Institute for Biomedicine, Eurac Research, Bolzano, Italy
- Juan Antonio Vizcaíno, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, United Kingdom
- Ozgur Yurekten, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Hinxton, Cambridge, United Kingdom
Presentation Overview:
Mass spectrometry (MS) is central to modern large-scale metabolomics, but a lack of data format standardization for intermediate and final MS data analysis results still limits data sharing, database deposition, and reanalysis. To address this, the Human Proteome Organization’s Proteomics Standards Initiative (HUPO-PSI) and the Metabolomics Standards Initiative (MSI) originally developed mzTab-M 2.0.0 (published in 2019) as an open standard for reporting MS-based metabolomics data.
mzTab-M uses a simple, tab-separated text format designed for both human readability and computational processing, based on a JSON schema and complemented by controlled-vocabulary-defined metadata. The format is detailed in a specification document, while a reference implementation and validator ensure data quality and consistency.
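To make the tab-separated layout concrete, here is a minimal reading sketch. It is not the reference implementation or validator: it assumes only that each line starts with a section prefix (MTD for metadata, SMH/SML for the small molecule summary header and rows, SFH/SMF for features, SEH/SME for evidence, COM for comments) and that multi-valued cells use `|` as separator. The function name, the example fragment and its identifiers are hypothetical, and the fragment is not a complete valid file.

```python
from collections import defaultdict

# Header prefix -> prefix of the data rows it describes.
HEADER_TO_ROW = {"SMH": "SML", "SFH": "SMF", "SEH": "SME"}

def parse_mztab_m(text):
    """Split mzTab-M style text into a metadata dict and per-section row dicts."""
    metadata = {}
    headers = {}
    tables = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        prefix, *fields = line.split("\t")
        if prefix == "COM":  # comment line, ignored
            continue
        if prefix == "MTD":
            metadata[fields[0]] = fields[1] if len(fields) > 1 else ""
        elif prefix in HEADER_TO_ROW:
            headers[HEADER_TO_ROW[prefix]] = fields
        elif prefix in headers:
            # Pair each cell with the column name from the matching header line.
            tables[prefix].append(dict(zip(headers[prefix], fields)))
    return metadata, dict(tables)

# Illustrative fragment with placeholder identifiers.
EXAMPLE = "\n".join([
    "COM\tIllustrative fragment only",
    "MTD\tmzTab-version\t2.0.0-M",
    "MTD\tmzTab-ID\tEXAMPLE-1",
    "SMH\tSML_ID\tSMF_ID_REFS\tdatabase_identifier\tchemical_name\treliability",
    "SML\t1\t1\tDB:0001|DB:0002\tcandidate A|candidate B\t2",
])

metadata, tables = parse_mztab_m(EXAMPLE)
candidates = tables["SML"][0]["database_identifier"].split("|")
```

For real files, the reference implementation and validator should be preferred, since they also check the controlled-vocabulary terms and the cross-references between sections that this sketch ignores.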
The format comprehensively represents metabolomics results, including final quantification values and the identification evidence linking these values back to the raw MS features. Importantly, mzTab-M explicitly accommodates ambiguity in molecule identification, allowing researchers to clearly communicate levels of confidence. mzTab-M aims to be flexible by supporting CV-term controlled optional columns, thereby adapting to different experimental setups, applications and workflows.
Initial implementations of mzTab-M in software such as xcms, mzmine and OpenMS, and for submission to repositories such as MetaboLights and GNPS, have shown that the format, its documentation and its implementations require significant updates and extensions. mzTab-M 2.1.0 therefore aims to support these use cases as well as new MS technologies, to improve integration with other HUPO-PSI formats for sample metadata, QC results and cross-links to mass spectra in public databases, and to offer more efficient serialization and deserialization options.