Summary

This study investigated the relationship between non-alcoholic fatty liver disease (NAFLD) and myocardial infarction (MI) through co-expressed genes, identifying Thrombospondin 1 (THBS1) as a biomarker. Immuno-infiltration analysis revealed CD8+ T cells and neutrophils as key factors, with THBS1 showing potential as a diagnostic tool for NAFLD and MI.

Abstract

Non-alcoholic fatty liver disease (NAFLD) and myocardial infarction (MI) are two major health burdens with significant prevalence and mortality. This study aimed to explore co-expressed genes to understand the relationship between NAFLD and MI and to identify potential crucial biomarkers of NAFLD-related MI using bioinformatics and machine learning. Functional enrichment analysis was conducted, a protein-protein interaction (PPI) network of the co-expressed genes was constructed, and support vector machine-recursive feature elimination (SVM-RFE) and least absolute shrinkage and selection operator (LASSO) techniques were employed to identify one differentially expressed gene (DEG), Thrombospondin 1 (THBS1). THBS1 demonstrated strong performance in distinguishing NAFLD patients (AUC = 0.981) and MI patients (AUC = 0.900). Immuno-infiltration analysis revealed significantly lower CD8+ T cell levels and higher neutrophil levels in patients with NAFLD and MI. CD8+ T cells and neutrophils were effective in distinguishing NAFLD/MI from healthy controls. Correlation analysis showed that THBS1 was positively correlated with CCR (chemokine receptor), MHC class I (major histocompatibility complex class I), neutrophils, parainflammation, and Tfh (follicular helper T cells), and negatively correlated with CD8+ T cells, cytolytic activity, and TIL (tumor-infiltrating lymphocytes) in NAFLD and MI patients. THBS1 emerged as a novel biomarker for diagnosing NAFLD/MI in comparison to healthy controls. The results indicate that CD8+ T cells and neutrophils could serve as inflammatory immune features for differentiating patients with NAFLD/MI from healthy individuals.

Introduction

Non-alcoholic fatty liver disease (NAFLD) is a major public health issue with a prevalence of 25%-30% [1]. It has been reported that the prevalence of NAFLD is high among patients with diabetes [2]. However, the significance of NAFLD in non-diabetic patients is not yet clear. Studies have suggested that NAFLD plays an independent role in the pathogenesis of atherosclerosis [3,4]. Additionally, a meta-analysis has shown that NAFLD is closely associated with coronary artery calcification, endothelial dysfunction, and atherosclerosis, and has emerged as an independent cardiovascular risk factor [5]. The connection between NAFLD and cardiovascular disease still requires further investigation.

Myocardial infarction (MI) is a catastrophic disease that threatens health and imposes a significant economic burden on patients and their families worldwide [6]. MI is also a major cause of death in patients with NAFLD. A clinical study published in the British Medical Journal has shown that the risk of myocardial infarction in NAFLD patients is 1.17 times that in non-NAFLD patients [7,8]. Some studies have identified molecular pathways, including inflammation, oxidative stress, and lipid metabolism, that contribute to NAFLD-related MI [9,10,11]. However, the underlying mechanisms linking NAFLD and MI remain unclear. It is crucial to identify novel biomarkers related to the prognosis of NAFLD and MI.

The increasing prevalence of NAFLD, which affects a broad segment of the population, underscores a significant public health issue, particularly given its association with diabetes. However, the impact of NAFLD on non-diabetic patients remains poorly understood. NAFLD has been implicated in the pathogenesis of atherosclerosis and is recognized as an independent cardiovascular risk factor, being closely linked to coronary artery calcification, endothelial dysfunction, and atherosclerosis. Despite these associations, the precise mechanisms bridging NAFLD and cardiovascular diseases such as myocardial infarction (MI) require further elucidation. MI is one of the leading causes of mortality worldwide and imposes a significant economic burden. The risk of MI in NAFLD patients is notably higher than in those without NAFLD, highlighting the need for a deeper understanding of the molecular pathways connecting these conditions. While inflammation, oxidative stress, and lipid metabolism have been suggested as contributing factors, the exact mechanisms remain unclear. There is an urgent need to identify novel biomarkers that can provide insights into the prognosis and management of NAFLD-related MI.

Consequently, in this study, RNA microarray datasets for NAFLD and MI were downloaded from the National Center for Biotechnology Information-Gene Expression Omnibus (NCBI-GEO, see Table of Materials) to identify and analyze the interaction of differentially expressed genes (DEGs) between NAFLD and MI. Enrichment analysis, protein-protein interaction (PPI) network diagram construction, support vector machine-recursive feature elimination (SVM-RFE), and least absolute shrinkage and selection operator (LASSO) algorithms were used to identify hub genes [12-19]. Immuno-infiltration analysis was performed to examine immune cells in NAFLD and MI patients. Ultimately, these methods were integrated to elucidate the relationship between NAFLD and MI. Figure 1 illustrates the design sequence followed in this study. By combining bioinformatics, machine learning, and immuno-infiltration analysis, this study aims to contribute to the development of a novel medical decision support platform.

The key contributions of this article are: (1) Identification of co-expressed genes: The study highlights the relationship between NAFLD and MI by identifying co-expressed genes, offering a deeper understanding of the molecular links between these two conditions. (2) Application of bioinformatics and machine learning: Using bioinformatics and machine learning techniques, including support vector machine-recursive feature elimination (SVM-RFE) [17] and least absolute shrinkage and selection operator (LASSO) [19], the study identifies THBS1 as a differentially expressed gene. THBS1 demonstrates high performance in distinguishing NAFLD and MI patients from healthy controls. (3) Immuno-infiltration analysis: The study conducts an immuno-infiltration analysis, revealing significantly lower levels of CD8+ T cells and higher levels of neutrophils in patients with NAFLD and MI. (4) Correlation analysis: The research demonstrates that THBS1 is positively correlated with several immune factors, including CCR (chemokine receptor), MHC class I (major histocompatibility complex class I), neutrophils, parainflammation, and Tfh (follicular helper T) cells. It is negatively correlated with CD8+ T cells, cytolytic activity, and tumor-infiltrating lymphocytes (TIL).

Protocol

The details of the databases, weblinks, and software/packages used are listed in the Table of Materials. The simulation parameters used are provided in Table 1.

1. Obtaining RNA microarray datasets

  1. Download the myocardial infarction (MI) dataset, GSE66360, from the NCBI-GEO database.
  2. Download the non-alcoholic fatty liver disease (NAFLD) dataset, GSE89632, from the NCBI-GEO database.
  3. Ensure that GSE66360 includes 49 MI samples and 50 healthy controls, and that GSE89632 includes 39 NAFLD samples and 24 healthy controls.
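
The group sizes in step 1.3 can be checked programmatically once the sample annotations are available. The following is a minimal sketch in Python rather than the R environment used in this protocol; the `mi_labels` dictionary and the group names are hypothetical stand-ins for annotations parsed from the GEO series, and only the expected counts come from the protocol.

```python
from collections import Counter

# Expected group sizes stated in step 1.3 of the protocol.
EXPECTED = {
    "GSE66360": {"MI": 49, "control": 50},
    "GSE89632": {"NAFLD": 39, "control": 24},
}

def check_group_sizes(series_id, sample_labels):
    """Return True if the per-group sample counts match the protocol.

    sample_labels maps a sample ID to its group name (hypothetical layout).
    """
    counts = Counter(sample_labels.values())
    return dict(counts) == EXPECTED[series_id]

# Hypothetical annotation table standing in for parsed GEO metadata.
mi_labels = {f"MI_{i}": "MI" for i in range(49)}
mi_labels.update({f"CTRL_{i}": "control" for i in range(50)})
print(check_group_sizes("GSE66360", mi_labels))  # True
```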

2. Identification of DEGs

  1. Data preprocessing and DEG identification
    1. Load the datasets into RStudio using appropriate R functions.
    2. Use the R package limma [12] to identify DEGs with the following thresholds: P < 0.05 and |log FC| > 1.5.
    3. Use the R packages pheatmap, ggplot2, and venn to generate heatmaps and Venn diagrams for visualizing DEGs.
  2. Create visualizations
    1. Use pheatmap to generate heatmaps of DEGs.
    2. Use ggplot2 to create Venn diagrams showing the overlap of DEGs between NAFLD and MI.
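
The thresholding and overlap logic of this section can be illustrated independently of limma. The sketch below is written in Python, not the R used in the protocol, and assumes a results table has already been exported as (gene, log FC, P value) tuples; the gene names and numbers are made up for illustration and are not results from the study.

```python
# Thresholds from step 2.1.2 of the protocol.
P_CUTOFF = 0.05
LOGFC_CUTOFF = 1.5

def select_degs(results):
    """Split a differential-expression table into up- and down-regulated genes.

    results: iterable of (gene, log_fc, p_value) tuples (hypothetical layout,
    e.g. exported from a limma results table).
    """
    up, down = set(), set()
    for gene, log_fc, p_value in results:
        if p_value < P_CUTOFF and abs(log_fc) > LOGFC_CUTOFF:
            (up if log_fc > 0 else down).add(gene)
    return up, down

# Made-up values for illustration only; not results from the study.
nafld = [("GENE_A", 2.1, 0.001), ("GENE_B", -1.8, 0.01), ("GENE_C", 0.4, 0.2)]
mi = [("GENE_A", 1.9, 0.003), ("GENE_D", 2.5, 0.04), ("GENE_B", -0.9, 0.01)]

nafld_up, nafld_down = select_degs(nafld)
mi_up, mi_down = select_degs(mi)

# Co-DEGs are the genes present in both DEG lists (the Venn diagram overlap).
co_degs = (nafld_up | nafld_down) & (mi_up | mi_down)
print(sorted(co_degs))  # ['GENE_A']
```

Whether the overlap should also require the same direction of regulation in both diseases is a design choice that the protocol does not specify; the sketch ignores direction when intersecting.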

3. Enrichment analysis

  1. Perform GO, KEGG, and DO enrichment analysis
    1. Load DEGs into the R package clusterProfiler [13].
    2. Conduct Gene Ontology (GO) enrichment analysis to categorize DEGs into molecular functions, biological processes, and cellular components.
    3. Perform Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis to map DEGs onto biochemical pathways.
    4. Use Disease Ontology (DO) enrichment analysis to link DEGs to specific clinical states.
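
Over-representation analysis of this kind is commonly based on an upper-tail hypergeometric test: given a gene universe, a term annotated to some of those genes, and a query list of DEGs, it asks how likely the observed overlap would be by chance. The Python sketch below illustrates that calculation only; it is not the clusterProfiler implementation, the function name and arguments are hypothetical, and the multiple-testing adjustment applied in practice is omitted.

```python
from math import comb

def hypergeom_enrichment_p(n_universe, n_term, n_query, n_overlap):
    """Upper-tail hypergeometric P value for term over-representation.

    Probability of seeing at least n_overlap term genes when n_query genes
    are drawn from a universe of n_universe genes, n_term of which carry
    the annotation.
    """
    total = comb(n_universe, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 4 genes in the universe, 2 annotated, 2 queried, both annotated.
print(hypergeom_enrichment_p(4, 2, 2, 2))  # one chance in six
```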

4. PPI analysis by constructing PPI networks

  1. Use the STRING online platform [14] to build protein-protein interaction (PPI) network diagrams of co-DEGs with a minimum confidence score of 0.4.
  2. Visualize the PPI networks using Cytoscape [15].
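
The effect of the 0.4 confidence cut-off can be seen on a small edge list. This Python sketch assumes the interactions have been exported as (protein A, protein B, score) tuples with scores on a 0-1 scale; the protein names, scores, and function names are hypothetical. Ranking nodes by degree is shown as one simple way to spot highly connected candidates, not as the method used in the study.

```python
from collections import defaultdict

MIN_SCORE = 0.4  # confidence threshold from step 4.1

def build_network(edges, min_score=MIN_SCORE):
    """Keep edges scoring at least min_score and return node -> neighbour set."""
    network = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            network[a].add(b)
            network[b].add(a)
    return network

def rank_by_degree(network):
    """Sort nodes by number of interaction partners, highest first."""
    return sorted(network, key=lambda node: (-len(network[node]), node))

# Hypothetical edge list (protein A, protein B, combined score in [0, 1]).
edges = [
    ("P1", "P2", 0.92), ("P1", "P3", 0.55), ("P1", "P4", 0.41),
    ("P2", "P3", 0.38), ("P4", "P5", 0.70),
]
net = build_network(edges)
print(rank_by_degree(net))  # ['P1', 'P4', 'P2', 'P3', 'P5']
```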

5. Candidate hub DEGs screening by applying machine learning algorithms

  1. Use SVM-RFE (Support Vector Machine-Recursive Feature Elimination) and LASSO (Least Absolute Shrinkage and Selection Operator) regression algorithms in RStudio to select the most relevant DEGs [16-18].
  2. Ensure that SVM-RFE ranks and removes features iteratively, while LASSO regression [19] applies regularization to identify a sparse set of DEGs.
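
To show how LASSO regularization produces a sparse set of features, the sketch below implements cyclic coordinate descent with soft-thresholding in plain Python. It is an illustration under stated assumptions, not the R workflow of this protocol: it uses a squared-error loss on centred data for brevity (a diagnostic model would typically use a logistic loss), the toy matrix is made up, and the SVM-RFE step is not shown.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + lam*||b||_1 by cyclic coordinate descent.

    X: list of rows (samples x features), assumed centred; y: centred response.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta, with beta = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes beta_j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_beta = soft_threshold(rho, lam) / col_sq[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_beta
    return beta

# Toy data: feature 0 drives the response, feature 1 is unrelated to it.
X = [[-1.5, 0.5], [-0.5, -0.5], [0.5, -0.5], [1.5, 0.5]]
y = [-3.0, -1.0, 1.0, 3.0]
beta = lasso_coordinate_descent(X, y, lam=0.1)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(selected)  # [0]
```

Increasing `lam` shrinks more coefficients to exactly zero; with `lam=3.0` on this toy data, no feature is retained.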

6. ROC curve construction for assessing diagnostic performance

  1. Use RStudio to conduct ROC (Receiver Operating Characteristic) curve analysis [20].
  2. Calculate the area under the curve (AUC) for crucial co-DEGs.
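
The AUC has a direct interpretation: it is the probability that a randomly chosen patient has a higher marker value than a randomly chosen control, with ties counted as one half. The Python sketch below computes it that way for a single marker; the expression values are hypothetical and the function is an illustration, not the R routine used in the protocol.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive scores higher than a
    random negative, counting ties as one half.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical expression values of one candidate gene.
patients = [7.9, 8.4, 6.8, 9.1]
controls = [6.1, 7.0, 6.5, 5.8]
print(auc(patients, controls))  # 0.9375
```

An AUC of 0.5 corresponds to no discrimination, and values below 0.5 indicate that the marker is lower, not higher, in patients.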

7. Immuno-infiltration analysis to explore immune infiltration

  1. Perform single-sample Gene Set Enrichment Analysis (ssGSEA) using the R packages GSVA and GSEABase to analyze immune infiltration in NAFLD and MI [21].
  2. Compare the relative abundance of immune cell types between NAFLD and MI samples and healthy controls.
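
The idea behind a single-sample enrichment score can be sketched as follows: genes are ranked by expression within one sample, and the score accumulates the gap between the rank-weighted running fraction of marker genes and the running fraction of all other genes. The Python code below is a simplified illustration in that spirit, not the GSVA implementation; the weighting exponent of 0.25, the absence of any normalization, and the gene names are assumptions made for the example.

```python
def ssgsea_like_score(expression, gene_set, alpha=0.25):
    """Rank-based enrichment score for one sample (simplified ssGSEA-style).

    expression: dict gene -> expression value for a single sample.
    gene_set: set of marker genes for one immune cell type.
    """
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    # Rank n for the most highly expressed gene, 1 for the least.
    rank = {gene: n - i for i, gene in enumerate(ordered)}
    in_set = [g for g in ordered if g in gene_set]
    if not in_set or len(in_set) == n:
        raise ValueError("gene set must overlap the sample without covering it")
    weight_total = sum(rank[g] ** alpha for g in in_set)
    miss_step = 1.0 / (n - len(in_set))
    hit_cdf = miss_cdf = score = 0.0
    for gene in ordered:
        if gene in gene_set:
            hit_cdf += rank[gene] ** alpha / weight_total
        else:
            miss_cdf += miss_step
        score += hit_cdf - miss_cdf  # accumulate the gap between the two curves
    return score

# Hypothetical sample: marker genes at the top score higher than at the bottom.
sample = {"G1": 9.0, "G2": 8.0, "G3": 3.0, "G4": 2.0, "G5": 1.0, "G6": 0.5}
high = ssgsea_like_score(sample, {"G1", "G2"})
low = ssgsea_like_score(sample, {"G5", "G6"})
print(high > low)  # True
```

Scores computed per sample and per cell type in this way can then be compared between disease and control groups, as in step 7.2.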

8. Statistical analysis

  1. Perform all statistical analyses in RStudio.
  2. Apply Pearson correlation analysis to examine gene co-expression patterns.
  3. Consider P < 0.05 statistically significant.
    NOTE: Ensure that all software and R packages are properly installed and updated to their latest versions before starting the protocol.
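
The Pearson correlation in step 8.2 is the covariance of two variables divided by the product of their standard deviations. The Python sketch below computes the coefficient only (no P value) on hypothetical per-sample values; the variable names are illustrative and do not come from the study data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Hypothetical per-sample values: gene expression vs. immune-cell scores.
gene_expr = [5.1, 6.3, 7.0, 8.2, 9.4]
neutrophil_score = [0.20, 0.31, 0.35, 0.47, 0.58]
cd8_score = [0.61, 0.52, 0.50, 0.38, 0.30]
print(pearson_r(gene_expr, neutrophil_score) > 0)  # True
print(pearson_r(gene_expr, cd8_score) < 0)  # True
```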

Results

The key findings of the proposed study are presented here, encompassing various analyses conducted to elucidate the molecular mechanisms underlying NAFLD and MI.

Identification of DEGs
In the GSE89632 dataset, 76 up-regulated and 20 down-regulated genes were identified as NAFLD-DEGs (Figure 2B,D), while the GSE66360 dataset revealed 118 up-regulated and 8 down-regulated genes as MI-DEGs (Figure 2C

Discussion

The method described in this study has significant implications for research into the molecular mechanisms underlying NAFLD and MI. By identifying key biomarkers such as THBS1, the proposed protocol offers potential targets for both diagnostic and therapeutic interventions. This approach can be extended to other complex diseases involving multiple pathways and immune responses, facilitating the discovery of novel biomarkers and therapeutic targets. Moreover, the integration of bioinformatics and machine learning techniqu...

Disclosures

None.

Acknowledgements

This study was supported by the National Natural Science Foundation of China (No. 62271511, U21A200949), Yucai Foundation of General Hospital of the Southern Theatre Command (2022NZC011), the Guangzhou Science and Technology Program Project (2023A03J0170), the National Clinical Research Center for Geriatrics (NCRCG-PLAGH-2023006) and Guangdong Basic and Applied Basic Research Foundation (No.2020A1515010288, No.2021A1515220101).

Materials

Name | Company | Catalog Number | Comments
Cytoscape | Cytoscape Consortium | Version 3.6.1 | Used for visualizing protein-protein interaction (PPI) networks
MI dataset GSE66360 (https://www.ncbi.nlm.nih.gov/geo/) | NCBI-GEO database | - | To collect RNA microarray datasets for analysis
R package clusterProfiler | Bioconductor | - | Used for GO, KEGG, and DO enrichment analyses
R package ggplot2 | CRAN | - | Used for creating Venn diagrams and other visualizations
R package GSEABase | Bioconductor | - | Used in conjunction with GSVA for gene set enrichment analysis
R package GSVA | Bioconductor | - | Used for single-sample gene set enrichment analysis (ssGSEA)
R package limma | Bioconductor | - | Used for identifying differentially expressed genes (DEGs)
R package pheatmap | CRAN | - | Used for generating heatmaps
R package venn | CRAN | - | Used for creating Venn diagrams
RNA microarray datasets (GSE66360, GSE89632) | NCBI-GEO | - | Publicly available RNA microarray datasets used for analysis
RStudio | RStudio, PBC | Version 1.4.1717 | Integrated development environment for R
STRING database | STRING (www.string-db.org/) | - | Online tool for constructing PPI networks

References

  1. de Alwis, N. M. W., Day, C. P. Non-alcoholic fatty liver disease: The mist gradually clears. J Hepatol. 48 (Suppl), S104-S112 (2008).
  2. Adams, L. A., Anstee, Q. M., Tilg, H., Targher, G. Non-alcoholic fatty liver disease and its relationship with cardiovascular disease and other extrahepatic diseases. Gut. 66 (6), 1138-1153 (2017).
  3. Bhatia, L. S., Curzen, N. P., Calder, P. C., Byrne, C. D. Non-alcoholic fatty liver disease: A new and important cardiovascular risk factor. Eur Heart J. 33 (9), 1190-1200 (2012).
  4. Boddi, M., et al. Non-alcoholic fatty liver in non-diabetic patients with acute coronary syndromes. Eur J Clin Invest. 43 (4), 429-438 (2013).
  5. Oni, E. T., et al. A systematic review: Burden and severity of subclinical cardiovascular disease among those with non-alcoholic fatty liver; should we care. Atherosclerosis. 230 (2), 258-267 (2013).
  6. Reed, G. W., Rossi, J. E., Cannon, C. P. Acute myocardial infarction. Lancet. 389 (10070), 197-210 (2017).
  7. Stahl, E. P., et al. Non-alcoholic fatty liver disease and the heart: JACC state-of-the-art review. J Am Coll Cardiol. 73 (8), 948-963 (2019).
  8. Alexander, M., et al. Non-alcoholic fatty liver disease and risk of incident acute myocardial infarction and stroke: findings from matched cohort study of 18 million European adults. BMJ. 367, l5367 (2019).
  9. Cobbina, E., Akhlaghi, F. Non-alcoholic fatty liver disease (NAFLD)-pathogenesis, classification, and effect on drug metabolizing enzymes and transporters. Drug Metab Rev. 49 (2), 197-211 (2017).
  10. Pawlak, M., Lefebvre, P., Staels, B. Molecular mechanism of PPARα action and its impact on lipid metabolism, inflammation and fibrosis in non-alcoholic fatty liver disease. J Hepatol. 62 (3), 720-733 (2015).
  11. Katsiki, N., Mikhailidis, D. P., Mantzoros, C. S. Non-alcoholic fatty liver disease and dyslipidemia: An update. Metabolism. 65 (8), 1109-1123 (2016).
  12. Ritchie, M. E., et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43 (7), e47 (2015).
  13. Yu, G., Wang, L. G., Han, Y., He, Q. Y. clusterProfiler: An R package for comparing biological themes among gene clusters. Omics. 16 (5), 284-287 (2012).
  14. Szklarczyk, D., et al. The STRING database in 2021: Customizable protein-protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 49 (D1), D605-D612 (2021).
  15. Sriroopreddy, R., Sajeed, R., Raghuraman, P., Sudandiradoss, C. Differentially expressed gene (DEG) based protein-protein interaction (PPI) network identifies a spectrum of gene interactome, transcriptome and correlated miRNA in nondisjunction Down syndrome. Int J Biol Macromol. 122, 1080-1089 (2019).
  16. Engebretsen, S., Bohlin, J. Statistical predictions with glmnet. Clin Epigenetics. 11, 123 (2019).
  17. Ma, H., Ding, F., Wang, Y. A novel multi-innovation gradient support vector machine regression method. ISA Trans. 130, 343-359 (2022).
  18. Han, Y., Huang, L., Zhou, F. A dynamic recursive feature elimination framework (dRFE) to further refine a set of OMIC biomarkers. Bioinformatics. 37 (12), 2183-2189 (2021).
  19. Colombani, C., et al. Application of Bayesian least absolute shrinkage and selection operator (LASSO) and BayesCπ methods for genomic selection in French Holstein and Montbéliarde breeds. J Dairy Sci. 96 (2), 575-591 (2013).
  20. Gamper, M., et al. Gene expression profile of bladder tissue of patients with ulcerative interstitial cystitis. BMC Genomics. 10 (1), 1-17 (2009).
  21. Hänzelmann, S., Castelo, R., Guinney, J. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 14, 1-15 (2013).
  22. Valbusa, F., et al. Non-alcoholic fatty liver disease and increased risk of all-cause mortality in elderly patients admitted for acute heart failure. Int J Cardiol. 265, 162-168 (2018).
  23. Zhang, Q., et al. Identification of hub biomarkers of myocardial infarction by single-cell sequencing, bioinformatics, and machine learning. Front Cardiovasc Med. 9, 939972 (2022).
  24. Zhang, Q., et al. 7-Hydroxyflavone alleviates myocardial ischemia/reperfusion injury in rats by regulating inflammation. Molecules. 27 (17), 5371 (2022).
  25. Zhang, Q., Guo, Y., Zhang, D. Network pharmacology integrated with molecular docking elucidates the mechanism of wuwei yuganzi san for the treatment of coronary heart disease. Nat Prod Commun. 17 (1), 1934578X221093907 (2022).
  26. Baenziger, N. L., Brodie, G., Majerus, P. W. A thrombin-sensitive protein of human platelet membranes. Proc Natl Acad Sci USA. 68 (2), 240-243 (1971).
  27. Zhang, N., Aiyasiding, X., Li, W. J., Liao, H. H., Tang, Q. Z. Neutrophil degranulation and myocardial infarction. Cell Commun Signal. 20 (1), 50 (2022).
  28. Lawler, J. Thrombospondin-1 as an endogenous inhibitor of angiogenesis and tumor growth. J Cell Mol Med. 6 (1), 1-12 (2002).
  29. Kaur, S., et al. Functions of thrombospondin-1 in the tumor microenvironment. Int J Mol Sci. 22 (9), 4570 (2021).
  30. Asch, A. S., et al. Isolation of the thrombospondin membrane receptor. J Clin Invest. 79 (4), 1054-1061 (1987).
  31. Armstrong, L. C., Bornstein, P. Thrombospondins 1 and 2 function as inhibitors of angiogenesis. Matrix Biol. 22 (1), 63-71 (2003).
  32. Roberts, D. D., Isenberg, J. S. CD47 and thrombospondin-1 regulation of mitochondria, metabolism, and diabetes. Am J Physiol Cell Physiol. 321 (1), C201-C213 (2021).
  33. Emre, A., et al. Impact of non-alcoholic fatty liver disease on myocardial perfusion in non-diabetic patients undergoing primary percutaneous coronary intervention for ST-segment elevation myocardial infarction. Am J Cardiol. 116 (11), 1810-1814 (2015).
  34. Mouton, A. J., et al. Fibroblast polarization over the myocardial infarction time continuum shifts roles from inflammation to angiogenesis. Basic Res Cardiol. 114 (1), 1-16 (2019).
  35. Liu, Y., et al. Atherosclerotic conditions promote the packaging of functional microRNA-92a-3p into endothelial microvesicles. Circ Res. 124 (4), 575-587 (2019).
  36. Bai, J., et al. Thrombospondin 1 improves hepatic steatosis in diet-induced insulin-resistant mice and is associated with hepatic fat content in humans. EBioMedicine. 57 (1), 102-111 (2020).
  37. Wabitsch, S., et al. Metformin treatment rescues CD8+ T-cell response to immune checkpoint inhibitor therapy in mice with NAFLD. J Hepatol. 77 (4), 748-760 (2022).
  38. Dolejsi, T., et al. Adult T-cells impair neonatal cardiac regeneration. Eur Heart J. 43 (27), 2698-2709 (2022).
  39. Qin, Z., et al. Systemic immune-inflammation index is associated with increased urinary albumin excretion: a population-based study. Front Immunol. 13, 863640 (2022).

Copyright © 2025 MyJoVE Corporation. All rights reserved