This protocol describes a complete in silico pipeline for predicting and functionally characterizing circRNAs from RNA-sequencing transcriptome data in studies of host-pathogen interactions.
Circular RNAs (circRNAs) are a class of non-coding RNAs that are formed via back-splicing. CircRNAs are predominantly studied for their roles as regulators of various biological processes. Notably, emerging evidence demonstrates that host circRNAs can be differentially expressed (DE) upon infection with pathogens (e.g., influenza and coronaviruses), suggesting a role for circRNAs in regulating host innate immune responses. However, investigations into the role of circRNAs during pathogenic infections are limited by the knowledge and skills required to carry out the bioinformatic analysis needed to identify DE circRNAs from RNA sequencing (RNA-seq) data. Bioinformatic prediction and identification of circRNAs are crucial before any verification and functional studies that use costly and time-consuming wet-lab techniques. To address this limitation, a step-by-step protocol for in silico prediction and characterization of circRNAs using RNA-seq data is provided in this manuscript. The protocol can be divided into four steps: 1) Prediction and quantification of DE circRNAs via the CIRIquant pipeline; 2) Annotation via circBase and characterization of DE circRNAs; 3) CircRNA-miRNA interaction prediction through the Circr pipeline; 4) Functional enrichment analysis of circRNA parental genes using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG). This pipeline will be useful in driving future in vitro and in vivo research to further unravel the role of circRNAs in host-pathogen interactions.
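The functional enrichment in step 4 can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric test, the statistic commonly used for GO and KEGG over-representation analysis; it is not a reimplementation of clusterProfiler. The function names and the toy gene sets are hypothetical and serve only to show the calculation.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, term_genes, query_genes, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    total_genes: size of the background gene universe
    term_genes:  background genes annotated with the term
    query_genes: size of the query list (e.g., circRNA parental genes)
    overlap:     query genes annotated with the term
    """
    max_overlap = min(term_genes, query_genes)
    denominator = comb(total_genes, query_genes)
    tail = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, query_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


def enrich(query, term_to_genes, background):
    """Test every term and return (term, overlap, p_value) sorted by p-value."""
    background = set(background)
    query = set(query) & background  # ignore genes outside the universe
    results = []
    for term, genes in term_to_genes.items():
        genes = set(genes) & background
        overlap = len(query & genes)
        p_value = hypergeom_enrichment_p(
            len(background), len(genes), len(query), overlap
        )
        results.append((term, overlap, p_value))
    return sorted(results, key=lambda row: row[2])


if __name__ == "__main__":
    # Hypothetical toy data: ten background genes, two made-up terms.
    toy_background = list("ABCDEFGHIJ")
    toy_terms = {"term_1": {"A", "B", "C"}, "term_2": {"D", "E"}}
    for row in enrich({"A", "B", "C"}, toy_terms, toy_background):
        print(row)
```

In a real analysis the p-values would still need correction for multiple testing (for example, Benjamini-Hochberg), and the choice of background gene universe strongly affects the result.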
Host-pathogen interactions represent a complex interplay between pathogens and host organisms, which triggers the host's innate immune responses and eventually results in the removal of the invading pathogens¹,². During pathogenic infections, a multitude of host immune genes are regulated to inhibit the replication and release of pathogens. For example, common interferon-stimulated genes (ISGs) regulated upon pathogenic infections include ADAR1, IFIT1, IFIT2, IFIT3, ISG20, RIG-I, and OASL³,⁴. Besides protein-coding genes, studies have also ....
In this protocol, de-identified ribosomal RNA (rRNA)-depleted RNA-seq library datasets prepared from influenza A virus-infected human macrophage cells were downloaded from the Gene Expression Omnibus (GEO) database and used as input. The entire bioinformatics pipeline, from prediction to functional characterization of circRNAs, is summarized in Figure 1. Each part of the pipeline is further explained in the sections below.
1. Preparation, download, and setup before.......
The protocol outlined in the previous section was modified and configured to suit the Linux operating system. The main reason is that most module libraries and packages involved in the analysis of circRNAs work only on the Linux platform. In this analysis, de-identified ribosomal RNA (rRNA)-depleted RNA-seq library datasets prepared from influenza A virus-infected human macrophage cells were downloaded from the GEO database⁴² and used to generate the representative results.
To illustrate the utility of this protocol, RNA-seq data from influenza A virus-infected human macrophage cells were used as an example. CircRNAs functioning as potential miRNA sponges in host-pathogen interactions, and their GO and KEGG functional enrichment within a host, were investigated. Although a variety of circRNA tools are available online, each is a standalone package, and the tools do not interact with one another. Here, we put together a few of the tools that are required for circRNA prediction and quantification.......
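The idea of a circRNA acting as a miRNA sponge can be sketched with a simple seed-match search. This is a simplified illustration and not the method used by Circr, which combines miRanda, RNAhybrid, and TargetScan. It assumes a 7mer-m8 seed definition (miRNA nucleotides 2 to 8) and a circRNA longer than the seed; the function names and the toy circRNA sequence are hypothetical. Because a circRNA has no free ends, the search wraps around the back-splice junction.

```python
# Watson-Crick complement table for RNA
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna):
    """Return the target motif complementary to miRNA positions 2-8."""
    seed = mirna.upper().replace("T", "U")[1:8]
    # Reverse complement, so the motif reads 5'->3' on the target strand
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_sites(circ_seq, mirna):
    """Return 0-based start offsets of seed matches on a circular sequence.

    Assumes circ_seq is the spliced circRNA sequence, starting at the
    back-splice junction, and is longer than the 7-nt seed motif.
    """
    circ = circ_seq.upper().replace("T", "U")
    motif = seed_site(mirna)
    # Append the first few bases so that sites spanning the
    # back-splice junction are also found.
    extended = circ + circ[: len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1:
        hits.append(start)
        start = extended.find(motif, start + 1)
    return hits


if __name__ == "__main__":
    mir21 = "UAGCUUAUCAGACUGAUGUUGA"  # hsa-miR-21-5p mature sequence
    toy_circ = "AGCUCCGGAUAAGCUCCGGAUA"  # hypothetical circRNA sequence
    print(seed_site(mir21))
    print(find_seed_sites(toy_circ, mir21))
```

A seed match alone is only a weak indicator of binding; dedicated predictors additionally score pairing energy, site context, and conservation.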
The author would like to thank Tan Ke En and Dr. Cameron Bracken for their critical review of this manuscript. This work was supported by grants from the Fundamental Research Grant Scheme (FRGS/1/2020/SKK0/UM/02/15) and the University of Malaya High Impact Research Grant (UM.C/625/1/HIR/MOE/CHAN/02/07).
Name | Company | Catalog Number | Comments |
Bedtools | GitHub | https://github.com/arq5x/bedtools2/ | Referring to section 4.1.2. Needed for Circr. |
BWA | Burrows-Wheeler Aligner | http://bio-bwa.sourceforge.net/ | Referring to sections 2.1.1 and 2.1.2. Needed to run CIRIquant and to index the genome |
Circr | GitHub | https://github.com/bicciatolab/Circr | Referring to section 4. Used to predict the miRNA binding sites |
CIRIquant | GitHub | https://github.com/bioinfo-biols/CIRIquant | Referring to section 2.1.3. Used to predict circRNAs |
clusterProfiler | GitHub | https://github.com/YuLab-SMU/clusterProfiler | Referring to section 7. For GO and KEGG functional enrichment |
CPU | Intel | Intel(R) Xeon(R) CPU E5-2620 V2 @ 2.10 GHz; Cores: 6-core CPU; Memory: 65 GB; Graphics card: NVIDIA GK107GL (QUADRO K2000) | Specifications used to run this entire protocol. |
Cytoscape | Cytoscape | https://cytoscape.org/download.html | Referring to section 5.2. Needed to plot ceRNA network |
FastQC | Babraham Bioinformatics | https://www.bioinformatics.babraham.ac.uk/projects/fastqc/ | Referring to section 1.2.1. Quality checking on Fastq files |
HISAT2 | | http://daehwankimlab.github.io/hisat2/ | Referring to sections 2.1.1 and 2.1.2. Needed to run CIRIquant and to index the genome |
Linux | Ubuntu 20.04.5 LTS (Focal Fossa) | https://releases.ubuntu.com/focal/ | Needed to run the entire protocol. Other Ubuntu versions may still be valid to carry out the protocol. |
miRanda | | http://www.microrna.org/microrna/getDownloads.do | Referring to section 4.1.2. Needed for Circr |
Pybedtools | pybedtools 0.8.2 | https://pypi.org/project/pybedtools/ | Needed for BED file genomic manipulation |
Python | Python 2.7 and 3.6 or above | https://www.python.org/downloads/ | To run necessary library modules |
R | The Comprehensive R Archive Network | https://cran.r-project.org/ | To manipulate dataframes |
RNAhybrid | BiBiServ | https://bibiserv.cebitec.uni-bielefeld.de/rnahybrid | Referring to section 4.1.2. Needed for Circr |
RStudio | RStudio | https://www.rstudio.com/ | A workspace to run R |
samtools | SAMtools | http://www.htslib.org/ | Referring to section 2.1.2. Needed to run CIRIquant |
StringTie | Johns Hopkins University: Center for Computational Biology | http://ccb.jhu.edu/software/stringtie/index.shtml | Referring to section 2.1.2. Needed to run CIRIquant |
TargetScan | GitHub | https://github.com/nsoranzo/targetscan | Referring to section 4.1.2. Needed for Circr |
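Before running the pipeline, it can help to confirm that the command-line tools listed above are installed. The following is a minimal sketch, assuming the tools expose the executable names shown in `REQUIRED_TOOLS` on the system `PATH`; these names are assumptions and may differ depending on how each tool was installed. Python- and R-based components (CIRIquant, Circr, clusterProfiler, TargetScan scripts) are not covered by this check.

```python
import shutil

# Assumed executable names; adjust them to match the local installation.
REQUIRED_TOOLS = [
    "bwa",
    "hisat2",
    "samtools",
    "stringtie",
    "bedtools",
    "fastqc",
    "miranda",
    "RNAhybrid",
]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from the list that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```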
This article has been published
Copyright © 2024 MyJoVE Corporation. All rights reserved