Method Article
* These authors contributed equally to this work.
We constructed an untargeted metabolomics workflow that integrates XY-Meta and metaX. This protocol shows how to use XY-Meta to generate a decoy spectral library from an open-access reference spectral library, perform FDR control on the identified metabolomics spectra, and then quantify the metabolites with metaX.
Untargeted metabolomics techniques have been widely used in recent years. However, rapidly increasing throughput and sample numbers produce an enormous number of spectra, which makes quality control of mass spectrometry spectra challenging. To reduce false positives, false discovery rate (FDR) control is necessary. We recently developed XY-Meta, software for FDR control of untargeted metabolome identification based on a Target-Decoy strategy. Here, we demonstrate a complete analysis pipeline that integrates XY-Meta and metaX. This protocol shows how to use XY-Meta to generate a decoy database from an existing reference database and to perform FDR control with the Target-Decoy strategy for large-scale metabolome identification on an open-access dataset. Differential analysis and metabolite annotation are performed after running metaX for metabolite peak detection and quantitation. To help more researchers, we also developed a user-friendly cloud-based analysis platform for these analyses that requires no bioinformatics skills or programming.
Metabolites play important roles in biological processes. They often regulate processes such as energy transfer, hormone regulation, neurotransmitter regulation, cellular communication, and protein post-translational modification1,2,3,4. Untargeted metabolomics provides a global view of numerous metabolites5,6. With advances in mass spectrometry and chromatography technologies, the throughput of metabolome MS/MS spectra has increased rapidly in recent years7,8,9,10,11. To identify metabolites from these huge datasets, various annotation software tools have been developed11, such as MZmine12, MS-FINDER13, CFM-ID14, MetFrag15, and SLAW16. However, these identifications often contain many false positives, for three reasons: (1) MS/MS spectra contain random noise, which may mislead peak matching. (2) Isomers and differences in fragmentation energy produce multiple spectral fingerprints for a compound and thus increase the size of the reference library. (3) The quality of reference libraries varies, and a proper standard for building a good reference spectral library is still needed. Therefore, systematic false discovery rate (FDR) control for untargeted metabolomics is essential for functional metabolome research7,8,9,17.
Both the empirical Bayes approach and the Target-Decoy strategy have been applied to the FDR control problem. Kerstin Scheubert et al. showed that the Target-Decoy strategy, applied to a decoy database generated by a fragmentation tree-based method, was the best method for FDR control9. Xusheng Wang et al. designed a decoy generation method based on the octet rule in chemistry and improved the precision of FDR estimation17. Generating the decoy database from a spectral library has been shown to perform better18. Here, we improved the spectral library-based method and developed software called XY-Meta19 that can further improve the precision of FDR estimation. It uses the existing reference spectral library to generate a decoy library for FDR control under the Target-Decoy scheme. XY-Meta provides its own spectral matching and cosine similarity algorithms and offers both conventional and iterative search modes. For FDR assessment, it supports Target-Decoy concatenated mode and separated mode. For greater flexibility, XY-Meta also accepts external decoy libraries.
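As a rough illustration of the cosine similarity scoring mentioned above, the following Python sketch scores a query spectrum against a reference spectrum. It is not XY-Meta's implementation: the function name, the greedy peak pairing, and the 0.01 Da tolerance are assumptions made only for illustration.

```python
import math


def cosine_similarity(query, reference, mz_tol=0.01):
    """Cosine score between two centroided spectra.

    Each spectrum is a list of (mz, intensity) pairs. Peaks are paired
    greedily by closest m/z within mz_tol (an assumed tolerance, in Da),
    and each reference peak is used at most once.
    """
    ref_sorted = sorted(reference)
    used = set()
    dot = 0.0
    for mz, intensity in sorted(query):
        best, best_diff = None, mz_tol
        for i, (ref_mz, _) in enumerate(ref_sorted):
            diff = abs(ref_mz - mz)
            if i not in used and diff <= best_diff:
                best, best_diff = i, diff
        if best is not None:
            used.add(best)
            dot += intensity * ref_sorted[best][1]
    norm_q = math.sqrt(sum(i * i for _, i in query))
    norm_r = math.sqrt(sum(i * i for _, i in reference))
    if norm_q == 0 or norm_r == 0:
        return 0.0
    return dot / (norm_q * norm_r)
```

Under these assumptions, two identical spectra score close to 1 and two spectra with no peaks within the tolerance score 0. Real search engines usually add intensity weighting and precursor filtering, which are omitted here.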
Peak detection and quantification of metabolites are also important steps in untargeted metabolome analysis. Peak detection is the main method for metabolome identification. In general, the accuracy of metabolite peak detection is affected by multiple factors, such as mass spectrometry noise, low metabolite abundance, contaminants, and metabolite degradation products20. When the number of samples is too large, or when the liquid chromatography column is replaced during an untargeted metabolome experiment, remarkable batch effects may appear, which is a major challenge for metabolome quantitation21,22,23. Currently, software such as XCMS24, Workflow4Metabolomics25, iMet-Q26, and metaX19 can perform peak detection and quantitation for untargeted metabolomes, but we suggest that the metaX pipeline is more complete and easier to use. Here, we demonstrate identification and FDR control for the publicly available dataset msv000084112 using XY-Meta, and peak detection and quantification of metabolites using metaX. The workflow requires only two groups, each with at least two samples. MS/MS spectral data are required; the workflow is independent of mass spectrometer platform, ionization mode, charge mode, and sample type, and it supports both sample-based and peak-based normalization. Following this example, researchers can perform metabolomics identification and quantification in an easy-to-handle way. Using this pipeline requires R programming skills. To help researchers without programming knowledge, we also developed a cloud analysis platform for metabolomics analysis, which is demonstrated in Supplementary Material 5.
1. Prepare metabolomics datasets for analysis
NOTE: In this demonstration, we use metabolomics datasets without QC samples. Data for both case and control groups are needed. For demonstration, we use a public dataset from the GNPS database27.
2. Data format conversion
NOTE: If the dataset is raw data generated directly by the mass spectrometer, it is usually in .raw, .wiff, or .cdf format. These files should be converted to mzXML and mgf formats. Here, we use the msconvert tool in the ProteoWizard29 package for the format conversion.
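A minimal Python sketch of this conversion is given below. It only builds the two msconvert command lines (one for mzXML, one for mgf) and assumes that msconvert is on the PATH; the function name and the file names in the test are hypothetical, and the exact options used in the original protocol are not shown in this text, so only the basic `--mzXML`, `--mgf`, and `-o` options are used.

```python
def msconvert_commands(raw_file, out_dir):
    """Build msconvert command lines for mzXML and mgf output.

    raw_file: path to one vendor file (for example a .raw or .wiff file).
    out_dir:  directory where msconvert should write the converted files.
    Returns a list of argument lists suitable for subprocess.run.
    """
    raw, out = str(raw_file), str(out_dir)
    return [
        ["msconvert", raw, "--mzXML", "-o", out],  # input for peak detection
        ["msconvert", raw, "--mgf", "-o", out],    # input for spectral search
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ProteoWizard is installed. Building the commands separately from running them makes it easy to inspect or log them first.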
3. Prepare the reference spectral library for the metabolites
NOTE: XY-Meta supports reference spectral libraries in mgf format only.
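For readers unfamiliar with the mgf format, the following Python sketch reads spectra from mgf text. It assumes the usual layout (a `BEGIN IONS` line, `KEY=value` parameter lines, whitespace-separated m/z and intensity pairs, and an `END IONS` line). The function name is hypothetical, and the parameter keys present in a given library may differ.

```python
def read_mgf(lines):
    """Parse mgf-formatted lines into a list of spectra.

    Each spectrum is a dict with "params" (KEY -> value strings) and
    "peaks" (a list of (mz, intensity) float pairs).
    """
    spectra = []
    current = None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            current = {"params": {}, "peaks": []}
        elif line == "END IONS":
            if current is not None:
                spectra.append(current)
            current = None
        elif current is not None and line:
            if "=" in line:
                # Parameter line such as PEPMASS=181.0707
                key, value = line.split("=", 1)
                current["params"][key.upper()] = value
            else:
                # Peak line: m/z followed by intensity
                fields = line.split()
                if len(fields) >= 2:
                    current["peaks"].append((float(fields[0]), float(fields[1])))
    return spectra
```

A file can be parsed with `read_mgf(open(path))`. Checking the number of spectra and a few parameter values this way is a quick sanity test before a library is passed to a search tool.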
4. Metabolites identification and FDR control
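The general idea of Target-Decoy FDR control in concatenated mode can be sketched in Python as follows. This is a generic illustration, not XY-Meta's code: the function name, the tuple layout, and the default threshold are assumptions. Matches against the target and decoy libraries are ranked together by score, and the FDR at each rank is estimated as the number of decoy matches divided by the number of target matches.

```python
def filter_by_fdr(matches, max_fdr=0.05):
    """Return the target matches accepted at the given estimated FDR.

    matches: iterable of (identifier, score, is_decoy) tuples from a search
    against a concatenated target + decoy library; higher score is better.
    """
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of top-ranked matches to keep
    for rank, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among all matches ranked at or above this one.
        fdr = decoys / targets if targets else 1.0
        if fdr <= max_fdr:
            cutoff = rank
    # Decoy matches are only used for estimation and are not reported.
    return [m for m in ranked[:cutoff] if not m[2]]
```

The sketch keeps the largest set of top-ranked matches whose estimated FDR stays within the threshold. In separated mode, targets and decoys would instead be searched and ranked independently, which is not shown here.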
5. Differential analysis
NOTE: metaX is an open-source R package. Please install it according to the guide at https://github.com/wenbostar/metaX. This analysis requires 8 GB of RAM.
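To make the idea of a two-group differential analysis concrete, the Python sketch below computes a log2 fold change and Welch's t statistic for one metabolite feature. It is a generic illustration rather than metaX's procedure; the function names are hypothetical, and the p-value calculation and multiple-testing correction are deliberately left out.

```python
import math
from statistics import mean, variance


def log2_fold_change(case, control):
    """log2 ratio of the mean intensities (case over control)."""
    return math.log2(mean(case) / mean(control))


def welch_t(case, control):
    """Welch's t statistic for two groups with possibly unequal variances.

    Each group needs at least two samples, because the sample variance is
    undefined for a single value.
    """
    se = math.sqrt(variance(case) / len(case) + variance(control) / len(control))
    return (mean(case) - mean(control)) / se
```

For example, case intensities of 7.0 and 9.0 against control intensities of 3.0 and 5.0 give a log2 fold change of 1.0. The need for a variance estimate in each group is one reason a two-group design needs at least two samples per group.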
6. Integration of qualitative and quantitative results
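One common way to combine identification results with quantified peaks is to match them by precursor m/z and retention time. The Python sketch below illustrates that idea only. The record fields, the function name, and the tolerances (10 ppm, 30 s) are assumptions, and the matching rule used in the original protocol is not shown in this text.

```python
def match_features(identifications, features, ppm=10.0, rt_tol=30.0):
    """Attach each identification to the closest quantified feature.

    identifications: list of dicts with "name", "mz", and "rt" (seconds).
    features:        list of dicts with "id", "mz", "rt", and "intensity".
    A feature matches when its m/z error is within ppm and its retention
    time is within rt_tol; the smallest m/z error wins.
    """
    merged = []
    for ident in identifications:
        best, best_err = None, None
        for feat in features:
            ppm_err = abs(feat["mz"] - ident["mz"]) / ident["mz"] * 1e6
            if ppm_err <= ppm and abs(feat["rt"] - ident["rt"]) <= rt_tol:
                if best_err is None or ppm_err < best_err:
                    best, best_err = feat, ppm_err
        if best is not None:
            merged.append({
                "name": ident["name"],
                "feature_id": best["id"],
                "intensity": best["intensity"],
            })
    return merged
```

Identifications without any feature inside both tolerances are dropped in this sketch; a real pipeline might instead report them separately.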
The raw data of msv000084112 were converted with msconvert.exe to generate mgf files (Supplementary Material S6).
XY-Meta generated the file GNPS-NIST14-MATCHES_Decoy.mgf in the /database folder. This is the decoy library generated from the original reference spectral library GNPS-NIST14-MATCHES.mgf, and it can be reused. To reuse this decoy library, set decoy_pattern to 1 in the parameter.default file, and set decoyinput to the absolute path of t...
FDR control in untargeted metabolite identification has been a great challenge. Here, we demonstrated a complete pipeline for large-scale untargeted metabolomics analysis (qualitative and quantitative) with FDR control. This effectively reduces the false positives that are very common in MS analysis.
Preparing an appropriate reference spectral library for the study is critical. A successful and sensitive MS/MS identification requires not only proper matching algorithms, but also proper reference sp...
No conflicts of interest.
This work was supported by the National Key Research and Development Program (2018YFC0910200/2017YFA0505001) and the Guangdong Key R&D Program (2019B020226001).
Name | Company | Catalog Number | Comments |
GNPS | open source | n/a | https://gnps.ucsd.edu/ProteoSAFe/static/gnps-splash.jsp |
XY-Meta | open source | n/a | https://github.com/DLI-ShenZhen/XY-Meta |
metaX | open source | n/a | https://github.com/wenbostar/metaX |
ProteoWizard | Free Download | 3.0.22116.18c918b-x86_64 | https://proteowizard.sourceforge.io/download.html |
CHI.Client | Free Download | ndp48-x86-x64-allos-enu | http://www.chi-biotech.com/technology.html?ty=ypt |
Copyright © 2025 MyJoVE Corporation. All rights reserved.