Cancer Research

Three Differential Expression Analysis Methods for RNA Sequencing: limma, EdgeR, DESeq2

Published: September 18th, 2021

DOI: 10.3791/62528

1Department of Obstetrics and Gynecology, Renmin Hospital of Wuhan University, 2Department of Pathology, Shanghai Skin Disease Hospital, Tongji University School of Medicine
* These authors contributed equally

This article provides a detailed protocol for three differential expression analysis methods for RNA sequencing: limma, edgeR, and DESeq2.

RNA sequencing (RNA-seq) is one of the most widely used technologies in transcriptomics because it can reveal the relationship between genetic alterations and complex biological processes, and it has great value in the diagnosis, prognosis, and treatment of tumors. Differential analysis of RNA-seq data is crucial for identifying aberrant transcription, and limma, edgeR, and DESeq2 are efficient tools for this analysis. However, RNA-seq differential analysis requires some skill with the R language and the ability to choose an appropriate method, both of which are lacking in the curriculum of medical education.

Herein, we provide a detailed protocol to identify differentially expressed genes (DEGs) between cholangiocarcinoma (CHOL) and normal tissues using limma, DESeq2, and edgeR, and we present the results in volcano plots and Venn diagrams. The three protocols are similar but differ in some steps of the analysis. For example, limma uses a linear model for statistical testing, whereas edgeR and DESeq2 use the negative binomial distribution. Additionally, normalized RNA-seq count data are necessary for edgeR and limma but not for DESeq2.
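The modelling difference mentioned above can be illustrated with a minimal base-R sketch on simulated data. It is not taken from the protocol and does not call limma, edgeR, or DESeq2; the means, the dispersion value, and all variable names are assumptions chosen only for illustration. The first part shows the overdispersion (variance larger than the mean) that motivates a negative binomial model for counts, and the second part fits an ordinary linear model to log-transformed counts, which is the general idea behind the linear-model approach.

```r
# Simulated data only; all parameter values are hypothetical.
set.seed(1)

# Negative binomial counts: variance = mu + dispersion * mu^2
mu <- 100
dispersion <- 0.5
counts <- rnbinom(10000, mu = mu, size = 1 / dispersion)

nb_mean <- mean(counts)
nb_var <- var(counts)
expected_var <- mu + dispersion * mu^2  # theoretical variance under this model

# A Poisson model would imply variance equal to the mean; here it is far larger.
cat("mean:", nb_mean, " variance:", nb_var, " theoretical:", expected_var, "\n")

# Linear-model view: regress log2 counts on group membership.
# This is plain lm(), used as a stand-in for the idea, not limma itself.
group <- factor(rep(c("normal", "tumor"), each = 50))
mu_by_group <- ifelse(group == "tumor", 400, 100)
y <- rnbinom(length(group), mu = mu_by_group, size = 1 / dispersion)

fit <- lm(log2(y + 0.5) ~ group)
log2_fc <- unname(coef(fit)["grouptumor"])  # estimated log2 fold change, tumor vs normal
cat("estimated log2 fold change:", log2_fc, "\n")
```

In this sketch the coefficient for the tumor group plays the role of a log fold change; the dedicated packages add gene-wise dispersion estimation and moderation across genes, which a single-gene toy example cannot show.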

Here, we provide a detailed protocol for three differential analysis methods: limma, edgeR, and DESeq2. The results of the three methods partly overlap. Each method has its own advantages, and the choice of method depends on the data.

RNA sequencing (RNA-seq) is one of the most widely used technologies in transcriptomics, with many advantages (e.g., high data reproducibility), and has dramatically increased our understanding of the functions and dynamics of complex biological processes [1,2]. Identification of aberrant transcripts under different biological contexts, also known as differentially expressed genes (DEGs), is a key step in RNA-seq analysis. RNA-seq makes it possible to gain a deep understanding of pathogenesis-related molecular mechanisms and biological functions. Therefore, differential analysis has been regarded as valuab....

NOTE: Open RStudio and load the R file "DEGs.R", which can be obtained from Supplementary files/Scripts.

1. Downloading and pre-processing of data

  1. Download the high-throughput sequencing (HTSeq) count data of cholangiocarcinoma (CHOL) from The Cancer Genome Atlas (TCGA). This step can be achieved with the following R code.
    1. Click Run to install R packages.
    2. Click Run to load R packages.
      if(!requ.......
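The install-and-load steps above follow a common "install only if missing" pattern. The sketch below is one hedged way to write that pattern in base R; it is not the code from "DEGs.R", which is truncated here. The helper name `missing_packages` and the package list are assumptions for illustration, and the installation commands are left as comments so that the snippet runs without network access.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed package list for the three methods discussed in this protocol.
needed <- c("limma", "edgeR", "DESeq2")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Not installed: ", paste(to_install, collapse = ", "))
  # These are Bioconductor packages, so one option is:
  # install.packages("BiocManager")
  # BiocManager::install(to_install)
} else {
  # Attach the packages once they are all available.
  invisible(lapply(needed, library, character.only = TRUE))
}
```

Checking with `requireNamespace()` before installing avoids reinstalling packages on every run of the script.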

There are various approaches to visualizing the results of differential expression analysis, among which the volcano plot and the Venn diagram are particularly common. limma identified 3323 DEGs between the CHOL and normal tissues with |logFC| ≥ 2 and adj.P.Val < 0.05 as thresholds, of which 1880 were down-regulated in CHOL tissues and 1443 were up-regulated (Figure 1a). Meanwhile, edgeR identified 1578 down-regulated DEGs and 3121 up-regulated DEGs (Figure 1b
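The thresholding described above (|logFC| ≥ 2 and adj.P.Val < 0.05) and the overlap shown in a Venn diagram can be sketched in base R as follows. The data frame, gene names, and gene sets are made up for illustration, and the helper `classify_degs` is hypothetical; the column names `logFC` and `adj.P.Val` follow the output of limma's `topTable()`, and which direction counts as "up" depends on how the contrast is defined.

```r
# Hypothetical helper: label each gene as down, up, or not significant.
classify_degs <- function(res, lfc = 2, padj = 0.05) {
  status <- rep("not significant", nrow(res))
  sig <- !is.na(res$adj.P.Val) & res$adj.P.Val < padj
  status[sig & res$logFC >= lfc] <- "up"
  status[sig & res$logFC <= -lfc] <- "down"
  factor(status, levels = c("down", "not significant", "up"))
}

# Toy results table (values are invented, not from the CHOL analysis).
toy <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  logFC = c(3.1, -2.4, 0.5, 2.8),
  adj.P.Val = c(0.001, 0.01, 0.2, 0.3)
)
toy$status <- classify_degs(toy)
print(table(toy$status))

# Overlap between methods: the centre of a three-way Venn diagram
# is the intersection of the three DEG lists (toy lists here).
limma_degs <- c("g1", "g2", "g5")
edger_degs <- c("g1", "g2", "g6")
deseq2_degs <- c("g1", "g6", "g7")
shared_all <- Reduce(intersect, list(limma_degs, edger_degs, deseq2_degs))
print(shared_all)
```

The same `status` column can be used to colour points in a volcano plot of `logFC` against `-log10(adj.P.Val)`.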

Abundant aberrant transcripts in cancers can be easily identified by RNA-seq differential analysis [5]. However, the application of RNA-seq differential expression analysis is often restricted because it requires some skill with the R language and the capacity to choose appropriate methods. To address this problem, we provide a detailed introduction to the three best-known methods (limma, edgeR, and DESeq2) and tutorials for applying RNA-seq differential expression analysis. This will facilitate the un.......

This work was supported by the National Natural Science Foundation of China (Grant No. 81860276) and Key Special Fund Projects of National Key R&D Program (Grant No. 2018YFC1003200).

Name Company Catalog Number Comments
R version 3.6.2 free software
RStudio free software

  1. Tambonis, T., Boareto, M., Leite, V. B. P. Differential Expression Analysis in RNA-seq Data Using a Geometric Approach. Journal of Computational Biology. 25, 1257-1265 (2018).
  2. Wang, Z., Gerstein, M., Snyder, M. RNA-Seq: a revolutionary tool for transcriptomics. Nature Reviews. Genetics. 10, 57-63 (2009).
  3. Anders, S., et al. Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols. 8, 1765-1786 (2013).
  4. McDermaid, A., Monier, B., Zhao, J., Liu, B., Ma, Q. Interpretation of differential gene expression results of RNA-seq data: review and integration. Briefings in Bioinformatics. 20, 2044-2054 (2019).
  5. Costa-Silva, J., Domingues, D., Lopes, F. M. RNA-Seq differential expression analysis: An extended review and a software tool. PLoS One. 12, 0190152 (2017).
  6. Law, C. W., et al. RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR. F1000Research. 5, (2016).
  7. Varet, H., Brillet-Guéguen, L., Coppée, J. Y., Dillies, M. A. SARTools: A DESeq2- and EdgeR-Based R Pipeline for Comprehensive Differential Analysis of RNA-Seq Data. PLoS One. 11, 0157022 (2016).
  8. Ritchie, M. E., et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Research. 43, 47 (2015).
  9. Robinson, M. D., McCarthy, D. J., Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 26, 139-140 (2010).
  10. Love, M. I., Huber, W., Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology. 15, 550 (2014).
  11. Gentleman, R. C., et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biology. 5, 80 (2004).
  12. Law, C. W., Chen, Y., Shi, W., Smyth, G. K. voom: Precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology. 15, 29 (2014).
  13. Smyth, G. K. Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology. 3, (2004).
  14. Lund, S. P., Nettleton, D., McCarthy, D. J., Smyth, G. K. Detecting differential expression in RNA-sequence data using quasi-likelihood with shrunken dispersion estimates. Statistical Applications in Genetics and Molecular Biology. 11, (2012).
  15. Reeb, P. D., Steibel, J. P. Evaluating statistical analysis models for RNA sequencing experiments. Frontiers in Genetics. 4, 178 (2013).
  16. Rocke, D. M., et al. Excess False Positive Rates in Methods for Differential Gene Expression Analysis using RNA-Seq Data. bioRxiv. (2015).
  17. Agarwal, A., et al. Comparison and calibration of transcriptome data from RNA-Seq and tiling arrays. BMC Genomics. 11, 383 (2010).
  18. Leng, N., et al. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics. 29, 1035-1043 (2013).
