Name: computational analysis tutorial for chimeric small noncoding rna: target rna sequencing libraries
Uploaded: 2023-12-01T14:40:00.000+00:00
Duration: 7 min 35 s
Description: Watch this Scientific Journal Video about A Computational Pipeline for Analyzing Chimeric Noncoding RNA-Target RNA Interactions in High-Throughput Sequencing Data at JoVE.com

We thank members of the Meffert laboratory for helpful discussions, including BH Powell and WT Mills IV, for critical feedback on describing the installation and implementation of the pipeline. This work was supported by a Braude Foundation award, the Maryland Stem Cell Research Fund Launch Program, the Blaustein Endowment for Pain Research and Education award, and NINDS RO1NS103974 and NIMH RO1MH129292 to M.K.M.

NOTE: The protocol will begin with downloading and installing software required to analyze chimeric RNA sequencing libraries using SCRAP.
1. Installation
<ol>
	<li>Before installing SCRAP, install its dependencies, Git and Miniconda, on the machine to be used for the analyses. Git is likely already installed. On macOS, for example, verify this by running which git, which prints the directory containing the git utility if it is present. Check whether Miniconda is installed by running which conda. If nothing is returned, install Miniconda. Miniconda requires 400 MB of disk space to install.
	<ol>
		<li>Miniconda can be installed in several ways, which differ by platform. Refer to the PLATFORM-SETUP markdown file in the Meffert Lab GitHub repository [https://github.com/Meffert-Lab/SCRAP/blob/main/PLATFORM-SETUP.md] for further instructions on installing on Windows, macOS, and Ubuntu. Linux distributions such as Ubuntu have their own default package manager (apt). In the example shown here, on macOS, use the command&#160;brew install Miniconda to install Miniconda with an existing package manager, brew. 
		NOTE: Homebrew (&#39;brew&#39;) is an open-source package management system that simplifies the installation of software on Apple&#39;s operating system, macOS.</li>
		<li>If conda is being installed for the first time, run conda init for the shell in use (zsh in this example). Then, close and re-open the shell. If conda was installed successfully, the base environment will be shown as active in the new terminal session.</li>
	</ol>
	</li>
	<li>Download the SCRAP source and install its dependencies.
	<ol>
		<li>The preferred method for obtaining SCRAP source is using Git. Access this by running git clone&#160;https://github.com/Meffert-Lab/SCRAP to obtain the latest copy of the source code.</li>
		<li>Install mamba, an improved package solver for conda, and install all the dependencies for SCRAP from SCRAP_environment.yml to its own conda environment using the following commands: 
		conda install -n base conda-forge::mamba 
		mamba env create -f SCRAP/SCRAP_environment.yml -n SCRAP</li>
	</ol>
	</li>
	<li>Next, run the reference installation for SCRAP. The arguments used in the reference installation will be specific to the organism whose sncRNA-mRNA interactions are being analyzed. 
	bash SCRAP/bin/Reference_Installation.sh -r full/path/to/SCRAP/ -m hsa -g hg38 -s human
	<ol>
		<li>Provide the directory of the SCRAP source folder for reference installation. Installation steps will then be performed using the files within the fasta and annotation folders. List the full path without any shorthand. End with a slash.</li>
		<li>Refer to the tables in README.md for the correct miRBase species abbreviations. Up-to-date reference genomes can be found at https://genome.ucsc.edu/ or https://www.ncbi.nlm.nih.gov/data-hub/genome/. In this example,&#160;hg38 is used for the human GRCh38 genome.</li>
		<li>The currently included species for annotation are human, mouse, and worm. View the corresponding species.annotation.bed files in the annotation directory in the SCRAP source folder. If the use of a different species for analysis is desired, provide an annotation.bed file that follows the same naming scheme species.annotation.bed.</li>
	</ol>
	</li>
</ol>
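The checks described in the installation steps can be sketched as two small shell helpers. This is an illustrative sketch only and is not part of SCRAP: the function names check_dep and full_dir_path are hypothetical, and the second helper simply assumes that the reference directory argument must be a full path ending with a slash, as stated in step 1.3.1.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight helpers for the installation steps (not part of SCRAP).

# Report whether a command is on PATH, mirroring the "which git" and
# "which conda" checks in step 1.1.
check_dep() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found"
        return 1
    fi
}

# Convert a directory argument into a full path that ends with a slash.
# Returns a non-zero status if the directory does not exist.
full_dir_path() {
    local dir
    dir=$(cd "$1" 2>/dev/null && pwd -P) || return 1
    printf '%s/\n' "${dir%/}"
}

# Example usage, assuming SCRAP/ is the cloned source folder in the
# current directory:
#   check_dep git
#   check_dep conda
#   bash SCRAP/bin/Reference_Installation.sh -r "$(full_dir_path SCRAP)" -m hsa -g hg38 -s human
```

Resolving the path once and passing the result to -r avoids typing the full path by hand, which is where a missing trailing slash is most likely to slip in.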
2. Running SCRAP
<ol>
	<li>Now that the dependencies and SCRAP are installed, run the script SCRAP.sh: 
	bash SCRAP/bin/SCRAP.sh -d full/path/to/CLASH_Human/ -a full/path/to/CLASH_Human/CLASH_Human_Adapters.txt -p no -f yes -r full/path/to/SCRAP/ -m hsa -g hg38
	<ol>
		<li>List the entire path to the sample directories without any shorthand. Format the sample directories with the folder name matching the sample name exactly, as shown in Figure 1.</li>
		<li>Note that the path listed is the path to the directory that contains all the sample folders, not the path to any individual sample folder or a sample file (refer to the command line in step 2.1).</li>
		<li>Next, list the entire path to the adapter file. Ensure that the sample names in the adapter file match the previously mentioned folder names and file names (refer to the command line in step 2.1).</li>
		<li>Indicate whether the samples are paired-end and whether or not filtering for pre-miRNAs and/or tRNAs will be performed. Add a filter for rRNA cleaning if desired (refer to the command line in step 2.1). 
		NOTE: The users may or may not decide to use these filters depending on the sample types and experimental goals. Depending upon the experimental design, pre-miRNAs, tRNAs, and rRNAs can consume available sequencing depth for real sncRNA:target RNA chimeras and users can employ filters to exclude them. However, users may want to avoid such filtering in certain circumstances (e.g., mapping sncRNA targets to the mitochondrial genome, which contains mitochondrial rRNAs).</li>
		<li>Next, list the entire path to the reference directory, the miRbase abbreviation, and the reference genome abbreviation (refer to the command line in step 2.1). 
		NOTE: The script may take a few hours to complete, depending on the dataset size and the CPU of the computer being used.</li>
	</ol>
	</li>
</ol>
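Because SCRAP.sh expects each sample folder name to match the sample name exactly, a quick check of the directory layout before a multi-hour run can save time. The sketch below is hypothetical and not part of SCRAP. It assumes only that every sample folder should contain at least one file whose name begins with the folder name; Figure 1 and the SCRAP README remain the authority on the exact layout.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight check; not part of SCRAP.
# Assumption: every sample folder contains at least one file whose name
# starts with the folder (sample) name.

check_sample_layout() {
    local root="${1%/}" dir sample f found status=0
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        sample=$(basename "$dir")
        found=no
        # Look for any file in the folder that begins with the sample name.
        for f in "$dir$sample"*; do
            if [ -e "$f" ]; then found=yes; break; fi
        done
        if [ "$found" = yes ]; then
            echo "OK $sample"
        else
            echo "MISMATCH $sample"
            status=1
        fi
    done
    return "$status"
}

# Example usage, with the directory that contains all sample folders:
#   check_sample_layout full/path/to/CLASH_Human/
```

The function takes the parent directory of the sample folders, the same value passed to -d, and returns a non-zero status if any folder fails the check.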
3. Peak calling and annotation
<ol>
	<li>Once SCRAP is finished running, check that the output includes, among other files, a SAMPLE.aligned.unique.bam file. This is a binary file containing alignments of target RNAs onto the user-provided reference genome.</li>
	<li>Now perform peak calling by running Peak_Calling.sh. 
	bash SCRAP/bin/Peak_Calling.sh -d CLASH_Human/ -a CLASH_Human/CLASH_Human_Adapters.txt -c 3 -l 2 -f no -r SCRAP/ -m hsa -g hg38 
	NOTE: Peak calling is a feature of SCRAP that allows researchers to readily evaluate the most robust and reproducible small noncoding RNA:target RNA interactions within their chimeric RNA libraries, for example, to identify interactions to select for further investigation. Steps 3.2.2 and 3.2.3 below describe how the user sets the stringency with which a peak is called: the minimum number of unique interactions (sequencing reads) that must support the peak, and the minimum number of libraries in which the interaction must occur.
	<ol>
		<li>Again, list the full paths to the directory containing the sample folders, and the adapter file (refer to the command line in step 3.2).</li>
		<li>Next, set the minimum number of sequencing reads required for a peak to be called (refer to the command line in step 3.2).</li>
		<li>Set the minimum number of distinct sequencing libraries that must contain a peak for it to be called (refer to the command line in step 3.2). 
		NOTE: The choice of values for both 3.2.2 and 3.2.3 will depend upon the nature of the samples sequenced and the number of samples or sample types. Here, at least 3 chimeric sequencing reads in a sample are required to call a peak, and the peak must be supported by at least 2 samples. An investigator evaluating a dataset in which there are many sequencing library replicates for a given condition, for example, might decide to require the presence of the reads in a greater number of sample sequencing libraries.</li>
		<li>Indicate whether sncRNAs of the same family must contribute to the same peak. For example, since miRNAs of the same family share seed sequences, these miRNAs can bind shared and overlapping sets of gene targets; a user might want to identify the full impact of a family on these targets by assessing their collective peaks (refer to the command line in step 3.2).</li>
		<li>Next, indicate the full path to the reference directory, the miRBase abbreviation, and the reference genome abbreviation (refer to the command line in step 3.2).</li>
	</ol>
	</li>
	<li>Once peak calling is complete, run peak annotation. 
	bash SCRAP/bin/Peak_Annotation.sh -p CLASH_Human/peaks.bed -r SCRAP/ -s human
	<ol>
		<li>List the full path to the resulting peaks.bed (or peaks.family.bed) file from peak calling, the full path to the reference directory, and the desired species for annotation.</li>
	</ol>
	</li>
</ol>
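The two stringency criteria can be illustrated on a toy table. The sketch below is not SCRAP's implementation: it assumes a hypothetical tab-separated input with one row per peak and sample (peak ID, sample name, read count), and the function name filter_peaks is invented for illustration. Its two arguments play the role of the values 3 and 2 passed as -c and -l in the example command, read here as a minimum read count per sample and a minimum number of supporting samples.

```shell
#!/usr/bin/env bash
# Toy illustration of the peak-calling criteria; not SCRAP's own logic.
# Input on stdin (hypothetical format): peak_id <TAB> sample <TAB> reads,
# with one row per peak and sample.
# $1 = minimum reads per sample, $2 = minimum number of supporting samples.

filter_peaks() {
    awk -F '\t' -v c="$1" -v l="$2" '
        $3 >= c { n[$1]++ }   # count samples that meet the read threshold
        END { for (p in n) if (n[p] >= l) print p }
    ' | sort
}

# Example usage with three hypothetical peaks in two samples:
#   printf 'p1\tS1\t5\np1\tS2\t3\np2\tS1\t10\np2\tS2\t1\np3\tS1\t2\np3\tS2\t2\n' | filter_peaks 3 2
```

In this example only p1 passes: p2 has enough reads in just one sample, and p3 never reaches three reads in any sample. Raising the second argument is the toy equivalent of requiring support from more replicate libraries.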
4. Visualizing the data
NOTE: All steps for analysis using SCRAP are now completed. For visualizing the data, several approaches are recommended:
<ol>
	<li>Merge all the .bam (binary SAM) files that are to be visualized together (samtools merge).</li>
	<li>Sort the resulting merged .bam file (samtools sort). Alignments are sorted by genomic coordinate so that samtools can index the file.</li>
	<li>Index the sorted .bam file (samtools index). This generates a BAI (BAM index) file, which permits visualization in the Integrative Genomics Viewer (IGV).</li>
	<li>Finally, open the sorted .bam file and its .bai index file in IGV. 
	NOTE: SncRNA:target RNA interactions of interest may be prioritized for follow-up in a number of investigation-specific ways. One generic initial approach is to assess the interactions for which peaks are supported by the most chimeric sequencing reads. Interactions of interest may also be visualized using the DuplexFold Web Server from the RNAstructure package by inputting the sequence for both the sncRNA and the target RNA from the detected interaction11. For each peak, the chromosome (1st column) and genomic coordinates (start: 2nd column; end: 3rd column) can be found within the peaks.bed.species.annotation.txt file generated in peak annotation. For miRNAs in particular, while reproducible and functional interactions can lack extensive seed-matched binding (e.g., interactions may use 3&#39; compensatory binding), the presence of seed-matched sites in a cognate binding motif of the target RNA can nonetheless be assessed as a validating feature of functionally important detected interactions4,12. Ancillary data processing could include comparisons of differential read coverage between peaks in distinct biological conditions and potentially assessment of clustering of regulated genes into pathways using a pathway analysis tool.</li>
</ol>
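The merge, sort, and index steps above can be chained in a small wrapper. This is a minimal sketch, assuming samtools is installed and on the PATH; the function names run and prepare_for_igv, the DRY_RUN switch, and the output file names are all hypothetical choices made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative wrapper around the samtools steps; not part of SCRAP.
# Set DRY_RUN=1 to print the commands instead of executing them.

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = output prefix; remaining arguments = .bam files to combine.
prepare_for_igv() {
    local prefix="$1"
    shift
    run samtools merge "${prefix}.merged.bam" "$@" &&
    run samtools sort -o "${prefix}.sorted.bam" "${prefix}.merged.bam" &&
    run samtools index "${prefix}.sorted.bam"
}

# Example usage (the input paths are placeholders):
#   prepare_for_igv combined Sample1.aligned.unique.bam Sample2.aligned.unique.bam
# This would produce combined.sorted.bam and combined.sorted.bam.bai,
# which can then be opened in IGV.
```

Running once with DRY_RUN=1 shows the exact commands before any files are written, which is a cheap way to confirm the input list is the intended one.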

In Vivo RNA: RNA Interactions Using the SCRAP Bioinformatics Pipeline

<ol>
	<li>Morris, K. V., Mattick, J. S. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+rise+of+regulatory+RNA.">The rise of regulatory RNA.</a> Nature Reviews Genetics. 15 (6), 423-437 (2014).</li><li>Li, X., Jin, D. S., Eadara, S., Caterina, M. J., Meffert, M. K. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Regulation+by+noncoding+RNAs+of+local+translation,+injury+responses,+and+pain+in+the+peripheral+nervous+system.">Regulation by noncoding RNAs of local translation, injury responses, and pain in the peripheral nervous system.</a> Neurobiology of Pain (Cambridge, Mass.). 13, 100119 (2023).</li><li>Shi, J., Zhou, T., Chen, Q. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Exploring+the+expanding+universe+of+small+RNAs.">Exploring the expanding universe of small RNAs.</a> Nature Cell Biology. 24 (4), 415-423 (2022).</li><li>Broughton, J. P., Lovci, M. T., Huang, J. L., Yeo, G. W., Pasquinelli, A. E. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Pairing+beyond+the+seed+supports+microRNA+targeting+specificity.">Pairing beyond the seed supports microRNA targeting specificity.</a> Molecular Cell. 64 (2), 320-333 (2016).</li><li>Grosswendt, S., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Unambiguous+identification+of+miRNA:target+site+interactions+by+different+types+of+ligation+reactions.">Unambiguous identification of miRNA:target site interactions by different types of ligation reactions.</a> Molecular Cell. 54 (6), 1042-1054 (2014).</li><li>Mills, W. T., Eadara, S., Jaffe, A. E., Meffert, M. K. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=SCRAP:+a+bioinformatic+pipeline+for+the+analysis+of+small+chimeric+RNA-seq+data.">SCRAP: a bioinformatic pipeline for the analysis of small chimeric RNA-seq data.</a> RNA. 29 (1), 1-17 (2023).</li><li>Helwak, A., Kudla, G., Dudnakova, T., Tollervey, D. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Mapping+the+human+miRNA+interactome+by+CLASH+reveals+frequent+noncanonical+binding.">Mapping the human miRNA interactome by CLASH reveals frequent noncanonical binding.</a> Cell. 153 (3), 654-665 (2013).</li><li>Hoefert, J. E., Bjerke, G. A., Wang, D., Yi, R. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+microRNA-200+family+coordinately+regulates+cell+adhesion+and+proliferation+in+hair+morphogenesis.">The microRNA-200 family coordinately regulates cell adhesion and proliferation in hair morphogenesis.</a> Journal of Cell Biology. 217 (6), 2185-2204 (2018).</li><li>Moore, M. J., Zhang, C., Gantman, E. C., Mele, A., Darnell, J. C., Darnell, R. B. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Mapping+Argonaute+and+conventional+RNA-binding+protein+interactions+with+RNA+at+single-nucleotide+resolution+using+HITS-CLIP+and+CIMS+analysis.">Mapping Argonaute and conventional RNA-binding protein interactions with RNA at single-nucleotide resolution using HITS-CLIP and CIMS analysis.</a> Nature Protocols. 9 (2), 263-293 (2014).</li><li>Bjerke, G. A., Yi, R. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Integrated+analysis+of+directly+captured+microRNA+targets+reveals+the+impact+of+microRNAs+on+mammalian+transcriptome.">Integrated analysis of directly captured microRNA targets reveals the impact of microRNAs on mammalian transcriptome.</a> RNA. 26 (3), 306-323 (2020).</li><li>Reuter, J. S., Mathews, D. H. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=RNAstructure:+software+for+RNA+secondary+structure+prediction+and+analysis.">RNAstructure: software for RNA secondary structure prediction and analysis.</a> BMC Bioinformatics. 11 (1), 129 (2010).</li><li>Moore, M. J., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=miRNA-target+chimeras+reveal+miRNA+3&#8242;-end+pairing+as+a+major+determinant+of+Argonaute+target+specificity.">miRNA-target chimeras reveal miRNA 3&#8242;-end pairing as a major determinant of Argonaute target specificity.</a> Nature Communications. 6 (1), 8864 (2015).</li><li>Travis, A. J., Moody, J., Helwak, A., Tollervey, D., Kudla, G. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Hyb:+a+bioinformatics+pipeline+for+the+analysis+of+CLASH+(crosslinking,+ligation+and+sequencing+of+hybrids)+data.">Hyb: a bioinformatics pipeline for the analysis of CLASH (crosslinking, ligation and sequencing of hybrids) data.</a> Methods (San Diego, Calif.). 65 (3), 263-273 (2014).</li></ol>

The authors have nothing to disclose.

This protocol on the use of SCRAP pipeline for analysis of sncRNA:target RNA interactions is designed to assist investigators who are entering into computational analysis. Completion of the tutorial is expected to guide investigators with entry-level or greater computational experience through the steps required for installation and use of this pipeline and its application to analyze data gained from chimeric RNA sequencing libraries. Steps critical to the completion of this protocol include correct reference installatio...

Small noncoding RNAs are highly studied for their post-transcriptional roles in coordinating expression from suites of genes in diverse processes such as differentiation and development, signal processing, and disease1,2,3. The ability to accurately determine the target transcripts of gene-regulatory small noncoding RNAs (sncRNAs), including microRNAs (miRNAs), is of importance to studies of RNA biology at both basic and translational levels. Bioinformatic algorithms that exploit anticipated complementarity between the miRNA seed sequence and its potential targets have been frequently used for the prediction of miRNA:target RNA interactions. While these bioinformatic algorithms have been successful, they also can harbor both false positive and false negative results, as has been reviewed elsewhere4,5,6. Recently, several biochemical approaches have been designed and implemented that allow unambiguous and semiquantitative determination of in vivo sncRNA:target RNA interactions by in vivo crosslinking and ensuing incorporation of a ligation step to physically attach the sncRNA to its target to form a single chimeric RNA4,5,7,8,9,10. Subsequent preparation of sequencing libraries from the chimeric RNAs allows assessment of the sncRNA:target RNA interactions by computational processing of the sequencing data. This video provides a tutorial for installing and using a computational pipeline termed small chimeric RNA analysis pipeline (SCRAP), which is designed to allow robust and reproducible analysis of sncRNA:target RNA interactions from chimeric RNA sequencing libraries6.
A goal of this tutorial is to assist investigators in avoiding excessive reliance on purely predictive bioinformatic algorithms by lowering barriers to the analysis of data generated through biochemical approaches that provide chimeric molecular readouts of sncRNA:target RNA interactions. This tutorial provides practical steps and tips to guide entry-level computational scientists through the use of a pipeline, SCRAP, developed for analyzing chimeric RNA sequencing data, which can be generated by several existing biochemical protocols, including crosslinking, ligation, and sequencing of hybrids (CLASH) and covalent ligation of endogenous Argonaute-bound RNAs-crosslinking and immunoprecipitation (CLEAR-CLIP)7,9.
The use of SCRAP offers several advantages for the analysis of chimeric RNA sequencing data, compared to other computational pipelines6. One salient advantage is its extensive annotation and the incorporation of call-outs to well-supported and routinely updated bioinformatic scripts within the pipeline, in comparison to alternative pipelines that often rely on custom and/or unsupported scripts for steps in the pipeline. This feature lends stability to SCRAP, making it more worthwhile for researchers to familiarize themselves with the pipeline and to incorporate its use into their workflow. SCRAP has also been demonstrated to outperform alternative pipelines in calling peaks of sncRNA:target RNA interactions and to have cross-platform functionality, as detailed in a prior publication6.
By the end of this tutorial, users will be able to (i) know platform requirements for SCRAP and install SCRAP pipelines, (ii) install reference genomes and set up command line parameters for SCRAP, and (iii) understand peak calling criteria and perform peak calling and peak annotation.
This video will describe in practical detail how researchers studying RNA biology may install and optimally use the computational pipeline, SCRAP, to analyze sncRNA interactions with target RNAs, such as messenger RNAs, in chimeric RNA-sequencing data obtained through one of the discussed biochemical approaches to sequencing library preparation.
SCRAP is a command line utility. Generally, following the guide below, the user will need to (i) download and install SCRAP (https://github.com/Meffert-Lab/SCRAP), (ii) install reference genomes and run SCRAP, and (iii) perform peak calling and annotation.
Further details of the computational steps in this procedure can be found at&#160;https://github.com/Meffert-Lab/SCRAP. This article will provide the setup and background information to allow investigators with entry-level computational skills to install, optimize, and use SCRAP on chimeric RNA sequencing library datasets.

<table><tbody><tr><td>Genomes</td><td>UCSC Genome browser</td><td>N/A</td><td>https://genome.ucsc.edu/ or https://www.ncbi.nlm.nih.gov/data-hub/genome/</td></tr><tr><td>Linux</td><td>Linux</td><td>Ubuntu 20.04 or 22.04 LTS recommended</td><td></td></tr><tr><td>Mac</td><td>Apple</td><td>Mac OSX (&gt;11)</td><td></td></tr><tr><td>Platform setup</td><td>GitHub</td><td>N/A</td><td>https://github.com/Meffert-Lab/SCRAP/blob/main/PLATFORM-SETUP.md</td></tr><tr><td>SCRAP pipeline</td><td>GitHub</td><td>N/A</td><td>https://github.com/Meffert-Lab/SCRAP</td></tr><tr><td>Unix shell</td><td>Unix operating system</td><td>bash &gt;=5.0</td><td></td></tr><tr><td>Unix shell</td><td>Unix operating system</td><td>zsh (5.9 recommended)</td><td></td></tr><tr><td>Windows</td><td>Windows</td><td>WSL Ubuntu 20.04 or 22.04 LTS</td><td></td></tr></tbody></table>

An understanding of the in vivo gene regulatory interactions of small noncoding RNAs (sncRNAs), such as microRNAs (miRNAs), with their target RNAs has been advanced in recent years by biochemical approaches that use cross-linking followed by ligation to capture sncRNA:target RNA interactions through the formation of chimeric RNAs and subsequent sequencing libraries. While datasets from chimeric RNA sequencing provide genome-wide and substantially less ambiguous input than miRNA prediction software, distilling these data into meaningful and actionable information requires additional analyses and may dissuade investigators lacking a computational background. This report provides a tutorial to support entry-level computational biologists in installing and applying a recent open-source software tool: Small Chimeric RNA Analysis Pipeline (SCRAP). Platform requirements, updates, and an explanation of pipeline steps and manipulation of key user-input variables are provided. Reducing a barrier for biologists to gain insights from chimeric RNA sequencing approaches has the potential to springboard discovery-based investigations of regulatory sncRNA:target RNA interactions in multiple biological contexts.

Results for sncRNA:target RNA interactions detected by a modified version of SCRAP (SCRAP release 2.0, which implements modifications for rRNA filtering) on previously published sequencing datasets prepared using CLEAR-CLIP9 are shown in Figure 2 and Table 1. Users can appreciate the decrease in the relative fraction of miRNA interactions with intron regions that occurs following the isolation of high-confidence interactions by peak calling in SCRAP. Additional data ...

Author Spotlight: A Computational Pipeline for Analyzing Chimeric Noncoding RNA-Target RNA Interactions in High-Throughput Sequencing Data 

Here, we present a protocol demonstrating the installation and use of a bioinformatics pipeline to analyze chimeric RNA sequencing data used in the study of in vivo RNA:RNA interactions.

Computational Analysis Tutorial for Chimeric Small Noncoding RNA: Target RNA Sequencing Libraries

JoVE Journal, Biology. Johns Hopkins University School of Medicine.

Video: Author Spotlight: A Computational Pipeline for Analyzing Chimeric Noncoding RNA-Target RNA Interactions in High-Throughput Sequencing Data 

Using RNA-sequencing to Detect Novel Splice Variants Related to Drug Resistance in In Vitro Cancer Models

Drug resistance remains a major problem in the treatment of cancer for both hematological malignancies and solid tumors. Intrinsic or acquired resistance can be caused by a range of mechanisms, including increased drug elimination, decreased drug uptake, drug inactivation and alterations of drug targets. Recent data showed that other than by well-known genetic (mutation, amplification) and epigenetic (DNA hypermethylation, histone post-translational modification) modifications, drug resistance mechanisms might also be regulated by splicing aberrations. This is a rapidly growing field of investigation that deserves future attention in order to plan more effective therapeutic approaches. The protocol described in this paper is aimed at investigating the impact of aberrant splicing on drug resistance in solid tumors and hematological malignancies. To this goal, we analyzed the transcriptomic profiles of several in vitro models through RNA-seq and established a qRT-PCR based method to validate candidate genes. In particular, we evaluated the differential splicing of DDX5 and PKM transcripts. The aberrant splicing detected by the computational tool MATS was validated in leukemic cells, showing that different DDX5 splice variants are expressed in the parental vs. resistant cells. In these cells, we also observed a higher PKM2/PKM1 ratio, which was not detected in the Panc-1 gemcitabine-resistant counterpart compared to parental Panc-1 cells, suggesting a different mechanism of drug-resistance induced by gemcitabine exposure.

Using RNA-sequencing to Detect Novel Splice Variants Related to Drug Resistance in In Vitro Cancer Models

Here we describe a protocol aimed at investigating the impact of aberrant splicing on drug resistance in solid tumors and hematological malignancies. To this goal, we analyzed the transcriptomic profiles of parental and resistant in vitro models through RNA-seq and established a qRT-PCR based method to validate candidate genes.

Drug resistance remains a major problem in the treatment of cancer for both hematological malignancies and solid tumors. Intrinsic or acquired resistance ...

Cancer Research

Real-time Analysis of Transcription Factor Binding, Transcription, Translation, and Turnover to Display Global Events During Cellular Activation

Upon activation, cells rapidly change their functional programs and, thereby, their gene expression profile. Massive changes in gene expression occur, for example, during cellular differentiation, morphogenesis, and functional stimulation (such as activation of immune cells), or after exposure to drugs and other factors from the local environment. Depending on the stimulus and cell type, these changes occur rapidly and at any possible level of gene regulation. Displaying all molecular processes of a responding cell to a certain type of stimulus/drug is one of the hardest tasks in molecular biology. Here, we describe a protocol that enables the simultaneous analysis of multiple layers of gene regulation. We compare, in particular, transcription factor binding (Chromatin-immunoprecipitation-sequencing (ChIP-seq)), de novo transcription (4-thiouridine-sequencing (4sU-seq)), mRNA processing, and turnover as well as translation (ribosome profiling). By combining these methods, it is possible to display a detailed and genome-wide course of action.
Sequencing newly transcribed RNA is especially recommended when analyzing rapidly adapting or changing systems, since this depicts the transcriptional activity of all genes during the time of 4sU exposure (irrespective of whether they are up- or downregulated). The combinatorial use of total RNA-seq and ribosome profiling additionally allows the calculation of RNA turnover and translation rates. Bioinformatic analysis of high-throughput sequencing results allows for many means for analysis and interpretation of the data. The generated data also enables tracking co-transcriptional and alternative splicing, just to mention a few possible outcomes.
The combined approach described here can be applied for different model organisms or cell types, including primary cells. Furthermore, we provide detailed protocols for each method used, including quality controls, and discuss potential problems and pitfalls.

This protocol describes the combinatorial use of ChIP-seq, 4sU-seq, total RNA-seq, and ribosome profiling for cell lines and primary cells. It enables tracking changes in transcription-factor binding, de novo transcription, RNA processing, turnover and translation over time, and displaying the overall course of events in activated and/or rapidly changing cells.

Upon activation, cells rapidly change their functional programs and, thereby, their gene expression profile. Massive changes in gene expression occur, for ...

Genetics

Isolating, Sequencing and Analyzing Extracellular MicroRNAs from Human Mesenchymal Stem Cells

Extracellular and circulating RNAs (exRNA) are produced by many cell types of the body and exist in numerous bodily fluids such as saliva, plasma, serum, milk and urine. One subset of these RNAs are the posttranscriptional regulators &#8211; microRNAs (miRNAs). To delineate the miRNAs produced by specific cell types, in vitro culture systems can be used to harvest and profile exRNAs derived from one subset of cells. The secreted factors of mesenchymal stem cells are implicated in alleviating numerous diseases and is used as the in vitro model system here. This paper describes the process of collection, purification of small RNA and library generation to sequence extracellular miRNAs. ExRNAs from culture media differ from cellular RNA by being low RNA input samples, which calls for optimized procedures. This protocol provides a comprehensive guide to small exRNA sequencing from culture media, showing quality control checkpoints at each step during exRNA purification and sequencing.

This protocol demonstrates how to purify extracellular microRNAs from cell culture media for small RNA library construction and next generation sequencing. Various quality control checkpoints are described to allow readers to understand what to expect when working with low input samples like exRNAs.

Extracellular and circulating RNAs (exRNA) are produced by many cell types of the body and exist in numerous bodily fluids such as saliva, plasma, serum, ...

A Complete Pipeline for Isolating and Sequencing MicroRNAs, and Analyzing Them Using Open Source Tools

Half of all human transcripts are thought to be regulated by microRNAs. Therefore, quantifying microRNA expression can reveal underlying mechanisms in disease states and provide therapeutic targets and biomarkers. Here, we detail how to accurately quantify microRNAs. Briefly, this method describes isolating microRNAs, ligating them to adaptors suitable for high-throughput sequencing, amplifying the final products, and preparing a sample library. Then, we explain how to align the obtained sequencing reads to microRNA hairpins, and quantify, normalize, and calculate their differential expression. Versatile and robust, this combined experimental workflow and bioinformatic analysis enables users to begin with tissue extraction and finish with microRNA quantification.

Here, we describe a step-by-step strategy for isolating small RNAs, enriching for microRNAs, and preparing samples for high-throughput sequencing. We then describe how to process sequence reads and align them to microRNAs, using open source tools.

Identification of Circular RNAs using RNA Sequencing

Circular RNAs (circRNAs) are a class of non-coding RNAs involved in functions including micro-RNA (miRNA) regulation, mediation of protein-protein interactions, and regulation of parental gene transcription. In classical next generation RNA sequencing (RNA-seq), circRNAs are typically overlooked as a result of poly-A selection during construction of mRNA libraries, or are found at very low abundance, and are therefore difficult to isolate and detect. Here, a circRNA library construction protocol was optimized by comparing library preparation kits, pre-treatment options and various total RNA input amounts. Two commercially available whole transcriptome library preparation kits, with and without RNase R pre-treatment, and using variable amounts of total RNA input (1 to 4 µg), were tested. Lastly, multiple tissue types (liver, lung, lymph node, and pancreas) and multiple brain regions (cerebellum, inferior parietal lobe, middle temporal gyrus, occipital cortex, and superior frontal gyrus) were compared to evaluate circRNA abundance across tissue types. Analysis of the generated RNA-seq data using six different circRNA detection tools (find_circ, CIRI, Mapsplice, KNIFE, DCC, and CIRCexplorer) revealed that a stranded total RNA library preparation kit with RNase R pre-treatment and 4 µg RNA input is the optimal method for identifying the highest relative number of circRNAs. Consistent with previous findings, the highest enrichment of circRNAs was observed in brain tissues compared to other tissue types.

Circular RNAs (circRNAs) are non-coding RNAs that may have roles in transcriptional regulation and mediating interactions between proteins. Following assessment of different parameters for construction of circRNA sequencing libraries, a protocol was compiled utilizing stranded total RNA library preparation with RNase R pre-treatment and is presented here.

Improving Small RNA-seq: Less Bias and Better Detection of 2'-O-Methyl RNAs

The study of small RNAs (sRNAs) by next-generation sequencing (NGS) is challenged by bias issues during library preparation. Several types of sRNA such as plant microRNAs (miRNAs) carry a 2'-O-methyl (2'-OMe) modification at their 3' terminal nucleotide. This modification adds another difficulty as it inhibits 3' adapter ligation. We previously demonstrated that modified versions of the 'TruSeq (TS)' protocol have less bias and an improved detection of 2'-OMe RNAs. Here we describe in detail protocol 'TS5', which showed the best overall performance. TS5 can be followed either using homemade reagents or reagents from the TS kit, with equal performance.

We present a detailed small RNA library preparation protocol with less bias than standard methods and an increased sensitivity for 2'-O-methyl RNAs. This protocol can be followed using homemade reagents to save cost or using kits for convenience.

Analyzing tRNA Dynamics Using AQRNA-seq

AQRNA-seq for Quantifying Small RNAs

AQRNA-seq provides a direct linear relationship between sequencing read counts and small RNA copy numbers in a biological sample, thus enabling accurate quantification of the pool of small RNAs. The AQRNA-seq library preparation procedure described here involves the use of custom-designed sequencing linkers and a step for reducing methylation RNA modifications that block reverse transcription processivity, which results in an increased yield of full-length cDNAs. In addition, a detailed implementation of the accompanying bioinformatics pipeline is presented. This demonstration of AQRNA-seq was conducted through a quantitative analysis of the 45 tRNAs in Mycobacterium bovis BCG harvested on 5 selected days across a 20-day time course of nutrient deprivation and 6 days of resuscitation. Ongoing efforts to improve the efficiency and rigor of AQRNA-seq will also be discussed here. This includes exploring methods to obviate gel purification for mitigating primer dimer issues after PCR amplification and to increase the proportion of full-length reads to enable more accurate read mapping. Future enhancements to AQRNA-seq will be focused on facilitating automation and high-throughput implementation of this technology for quantifying all small RNA species in cell and tissue samples from diverse organisms.

Author Spotlight: AQRNA-seq Role in Mapping Small RNAs and Unraveling Protein Translation Mechanisms

Absolute quantification RNA sequencing (AQRNA-seq) is a technology developed to quantify the landscape of all small RNAs in biological mixtures. Here, both the library preparation and data processing steps of AQRNA-seq are demonstrated, quantifying changes in the transfer RNA (tRNA) pool in Mycobacterium bovis BCG during starvation-induced dormancy.

Genome-wide Analysis of HDAC Inhibitor-mediated Modulation of microRNAs and mRNAs in B Cells Induced to Undergo Class-switch DNA Recombination and Plasma Cell Differentiation

Antibody responses are accomplished through several critical B cell-intrinsic processes, including somatic hypermutation (SHM), class-switch DNA recombination (CSR), and plasma cell differentiation. In recent years, epigenetic modifications or factors, such as histone deacetylation and microRNAs (miRNAs), have been shown to interact with B-cell genetic programs to shape antibody responses, while the dysfunction of epigenetic factors has been found to lead to autoantibody responses. Analyzing genome-wide miRNA and mRNA expression in B cells in response to epigenetic modulators is important for understanding the epigenetic regulation of B-cell function and antibody response. Here, we demonstrate a protocol for inducing B cells to undergo CSR and plasma cell differentiation, treating these B cells with histone deacetylase (HDAC) inhibitors (HDIs), and analyzing mRNA and microRNA expression. In this protocol, we directly analyze complementary DNA (cDNA) sequences using next-generation mRNA sequencing (mRNA-seq) and miRNA-seq technologies, mapping of the sequencing reads to the genome, and quantitative reverse transcription (qRT)-PCR. With these approaches, we have shown that, in B cells induced to undergo CSR and plasma cell differentiation, HDI, an epigenetic regulator, selectively modulates miRNA and mRNA expression and alters CSR and plasma cell differentiation.

Epigenetic factors can interact with genetic programs to modulate gene expression and regulate B cell function. By combining in vitro B-cell stimulation, qRT-PCR, and high-throughput microRNA-sequence and mRNA-sequence approaches, we can analyze the epigenetic modulation of miRNA and gene expression in B cells.

RNA-seq Analysis of Transcriptomes in Thrombin-treated and Control Human Pulmonary Microvascular Endothelial Cells

The characterization of gene expression in cells via measurement of mRNA levels is a useful tool in determining how the transcriptional machinery of the cell is affected by external signals (e.g., drug treatment), or how cells differ between a healthy state and a diseased state. With the advent and continuous refinement of next-generation DNA sequencing technology, RNA-sequencing (RNA-seq) has become an increasingly popular method of transcriptome analysis to catalog all species of transcripts, to determine the transcriptional structure of all expressed genes and to quantify the changing expression levels of the total set of transcripts in a given cell, tissue or organism [1,2]. RNA-seq is gradually replacing DNA microarrays as a preferred method for transcriptome analysis because it has the advantages of profiling a complete transcriptome, providing a digital type datum (copy number of any transcript) and not relying on any known genomic sequence [3].
Here, we present a complete and detailed protocol to apply RNA-seq to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is based on our recently published study entitled "RNA-seq Reveals Novel Transcriptome of Genes and Their Isoforms in Human Pulmonary Microvascular Endothelial Cells Treated with Thrombin" [4], in which we successfully performed the first complete transcriptome analysis of human pulmonary microvascular endothelial cells treated with thrombin using RNA-seq. It yielded unprecedented resources for further experimentation to gain insights into molecular mechanisms underlying thrombin-mediated endothelial dysfunction in the pathogenesis of inflammatory conditions, cancer, diabetes, and coronary heart disease, and provides potential new leads for therapeutic targets to those diseases.
The descriptive text of this protocol is divided into four parts. The first part describes the treatment of human pulmonary microvascular endothelial cells with thrombin and RNA isolation, quality analysis and quantification. The second part describes library construction and sequencing. The third part describes the data analysis. The fourth part describes an RT-PCR validation assay. Representative results of several key steps are displayed. Useful tips or precautions to boost success in key steps are provided in the Discussion section. Although this protocol uses human pulmonary microvascular endothelial cells treated with thrombin, it can be generalized to profile transcriptomes in both mammalian and non-mammalian cells and in tissues treated with different stimuli or inhibitors, or to compare transcriptomes in cells or tissues between a healthy state and a disease state.

This protocol presents a complete and detailed procedure to apply RNA-seq, a powerful next-generation DNA sequencing technology, to profile transcriptomes in human pulmonary microvascular endothelial cells with or without thrombin treatment. This protocol is generalizable to various cells or tissues affected by different reagents or disease states.

Targeted RNA Sequencing Assay to Characterize Gene Expression and Genomic Alterations

RNA sequencing (RNAseq) is a versatile method that can be utilized to detect and characterize gene expression, mutations, gene fusions, and noncoding RNAs. Standard RNAseq requires 30 - 100 million sequencing reads and can include multiple RNA products such as mRNA and noncoding RNAs. We demonstrate how targeted RNAseq (capture) permits a focused study on selected RNA products using a desktop sequencer. RNAseq capture can characterize unannotated, low, or transiently expressed transcripts that may otherwise be missed using traditional RNAseq methods. Here we describe the extraction of RNA from cell lines, ribosomal RNA depletion, cDNA synthesis, preparation of barcoded libraries, hybridization and capture of targeted transcripts and multiplex sequencing on a desktop sequencer. We also outline the computational analysis pipeline, which includes quality control assessment, alignment, fusion detection, gene expression quantification and identification of single nucleotide variants. This assay allows for targeted transcript sequencing to characterize gene expression, gene fusions, and mutations.

We describe a targeted RNA sequencing-based method that includes preparation of indexed cDNA libraries, hybridization and capture with custom probes and data analysis to interrogate selected transcripts for gene expression, mutations, and gene fusions. Targeted RNAseq permits cost-effective, rapid evaluation of selected transcripts on a desktop sequencer.
