Metabolomics data analysis is a multi-step process that leverages many specialized software tools. This data analysis can be divided into three major steps: data processing and quality control, statistical analysis, and biological data interpretation. The tools described in this protocol are designed to enable the last of these steps.
In the last decade, metabolomics has emerged as an omic science due to advances in analytical technologies such as gas chromatography-mass spectrometry and liquid chromatography-mass spectrometry. These techniques allow the simultaneous measurement of hundreds to thousands of small-molecule metabolites, creating complex multidimensional data sets. Metabolomics data analysis presents several challenges for pathway-based enrichment approaches.
First, a significant number of metabolites cannot be mapped onto metabolic pathways. Second, pathway coverage of secondary and lipid metabolism is not sufficient. Therefore, alternative approaches are needed for the biological interpretation of the data.
Data-driven network analysis techniques can help overcome challenges associated with knowledge-based pathway enrichment analysis for metabolomics data. For example, correlation networks can help derive relationships among both known and unknown metabolites, and can thus facilitate annotations of the unknowns.
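To make the idea of a correlation network concrete, here is a minimal sketch in Python. It is an illustration only, not the software used in this protocol: it assumes plain Pearson correlation, an arbitrary cutoff of 0.8, and made-up intensity values. The names `pearson`, `correlation_network`, and `unknown_001` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of intensities.

    Assumes neither list is constant (a constant list has zero variance
    and would cause a division by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_network(intensities, threshold=0.8):
    """Return edges (metabolite_a, metabolite_b, r) with |r| >= threshold.

    `intensities` maps a metabolite name to its values across samples.
    The threshold is an illustrative assumption, not a recommended value.
    """
    edges = []
    for a, b in combinations(sorted(intensities), 2):
        r = pearson(intensities[a], intensities[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Toy data (fabricated for illustration): five samples, two known
# metabolites and one unannotated feature.
intensities = {
    "glucose": [1.0, 2.0, 3.0, 4.0, 5.0],
    "lactate": [5.0, 1.0, 4.0, 2.0, 3.0],
    "unknown_001": [2.1, 3.9, 6.2, 8.0, 9.9],
}

edges = correlation_network(intensities)
for a, b, r in edges:
    print(f"{a} -- {b}: r = {r:.3f}")
```

In this toy example the only edge that survives the cutoff links the unannotated feature to glucose. In a real data set, an association of this kind between an unknown and a known metabolite is the sort of relationship that can suggest where to look when annotating the unknown.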