This protocol covers the installation and practical use of a computational pipeline designed to analyze high-throughput sequencing data from chimeric noncoding RNA:target RNA interactions. The molecular strategy of forming chimeric RNAs is a relatively recent development that, compared with earlier bioinformatic prediction tools, has the advantage of revealing the genome-wide noncoding RNA:target RNA interaction landscape unambiguously. The data generated by high-throughput sequencing of chimeric RNAs can present an analysis challenge for new adopters of this strategy.
We have established an open-source, cross-platform computational pipeline, SCRAP, that is highly annotated to allow entry-level users to analyze data from their chimeric noncoding RNA:target RNA sequencing experiments. In future experiments, the Meffert Lab plans to apply SCRAP to datasets generated from our ongoing investigations into post-transcriptional regulation in the mammalian nervous system, and to maintain and update SCRAP on our GitHub. Before initiating SCRAP installation on the macOS platform, execute the command "which git" in the terminal to confirm that Git is installed.
To verify the Miniconda installation, type "which conda" in the terminal. To install Miniconda, refer to the platform setup Markdown file in the Meffert Lab GitHub repository for instructions covering Windows, macOS, and Ubuntu. On Mac systems that use the Homebrew package manager, run "brew install miniconda" to install Miniconda.
To initialize Conda, run "conda init", specifying the shell in use, then close and reopen the shell. When this has succeeded, the terminal session shows the base environment as activated. Next, run the indicated git clone command to obtain the latest copy of the SCRAP source code.
Install Mamba, an improved package solver for Conda. Then, using the indicated command, install all the dependencies for SCRAP from SCRAP_environment.yml into a dedicated Conda environment.
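The two prerequisite checks can be combined into one small shell sketch. It uses "command -v", the POSIX counterpart of "which"; the function name check_tool is hypothetical and is not part of SCRAP.

```shell
# Report whether a prerequisite program is available on the PATH.
# check_tool is an illustrative helper, not a SCRAP command.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check both prerequisites; keep going even if one is missing.
for tool in git conda; do
    check_tool "$tool" || true
done
```

Any tool reported as missing should be installed before continuing with the SCRAP installation.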
Next, run the indicated bash script for the organism whose sncRNA:mRNA interactions will be analyzed with SCRAP. Provide the directory of the SCRAP source folder and perform the installation steps using the files within the fasta and annotation folders. Ensure that the directory path is complete and ends with a forward slash.
Refer to the tables in README.md for the correct miRBase species abbreviations. For an updated reference genome, use the UCSC Genome Browser or the NCBI Genome Data Hub.
Inspect the corresponding species.annotation.bed files in the annotation directory of the SCRAP source folder. If a different species needs to be analyzed, provide the indicated file.
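Because a missing trailing slash is an easy mistake, the path can be normalized before it is passed to a script. The following is a minimal sketch; the function name ensure_trailing_slash and the example path are hypothetical.

```shell
# Print the given directory path, adding a trailing "/" if it lacks one.
# ensure_trailing_slash is an illustrative helper, not a SCRAP command.
ensure_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;
        *)  printf '%s/\n' "$1" ;;
    esac
}

# Example with a hypothetical location of the SCRAP source folder.
scrap_dir=$(ensure_trailing_slash "/Users/example/SCRAP")
echo "$scrap_dir"
```

The sketch only edits the string. It does not check that the directory exists or expand shorthand such as "~", so the full path still has to be written out.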
After installing the dependencies and SCRAP, use the indicated script to start the SCRAP analysis with the specified command structure. List the full path to the sample directories without any shorthand, and name each sample folder to match its sample name. Note that the listed path must lead to the directory containing all the sample folders, not to an individual sample folder or file.
Next, list the entire path to the adapter file. Verify that the sample names in the adapter file match the sample folder names and file names. Specify the sample type, either paired-end or single-end, and state the choice of RNA filters for pre-miRNAs, tRNAs, and, optionally, RNA cleaning.
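The match between adapter-file sample names and sample folders can be checked before a long run. The sketch below assumes that the first whitespace-delimited field of every non-empty line in the adapter file is a sample name and that the file has no header line; the actual adapter file layout should be confirmed against the SCRAP documentation. The function name check_samples is hypothetical.

```shell
# For each sample name in the adapter file, confirm that a folder with the
# same name exists inside the directory that holds all sample folders.
# Assumption: column 1 of each non-empty line is a sample name (no header).
check_samples() {
    sample_dir=${1%/}      # drop a trailing slash so paths join cleanly
    adapter_file=$2
    rc=0
    while read -r name rest; do
        [ -n "$name" ] || continue          # skip blank lines
        if [ -d "$sample_dir/$name" ]; then
            echo "ok: $name"
        else
            echo "missing folder: $name"
            rc=1
        fi
    done < "$adapter_file"
    return "$rc"
}

# Example call with hypothetical paths:
# check_samples /Users/example/samples/ /Users/example/samples/adapters.txt
```

The function returns a non-zero status when any folder is missing, so it can gate the analysis step in a wrapper script. It checks folder names only, not the names of the files inside each folder.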
List the full path to the reference directory, the miRBase species abbreviation, and the reference genome abbreviation. After running SCRAP, use the indicated script command to perform Peak_Calling. List the full paths to the directory containing the sample folders and to the adapter file.
Then define the criteria for Peak_Calling, including the minimum number of sequencing reads and the number of distinct sequencing libraries required for a peak to be recognized. Indicate whether the reads contributing to a peak must come from sncRNAs of the same family, which is relevant for miRNAs that share seed sequences and can bind to overlapping gene targets. Then indicate the full path to the reference directory, the miRBase species abbreviation, and the reference genome abbreviation.
After Peak_Calling, run the Peak_Annotation script. Provide the full path to the peaks.bed file produced by Peak_Calling, the full path to the reference directory, and the desired species for annotation.
Use samtools merge to combine all the desired BAM files for combined visualization. Then use samtools sort to sort the merged.bam file.
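The merge and sort steps, together with the indexing step described next, can be chained as in the sketch below. The output names merged.bam and sorted.bam follow the narration; the wrapper name merge_sort_index and the input file names in the example call are hypothetical.

```shell
# Merge the given BAM files, sort the result, and index it for viewing.
# merge_sort_index is an illustrative wrapper; the three samtools
# subcommands are the ones named in the protocol.
merge_sort_index() {
    samtools merge merged.bam "$@" &&
    samtools sort -o sorted.bam merged.bam &&
    samtools index sorted.bam
}

# Example call with hypothetical input files:
# merge_sort_index sample1.bam sample2.bam
```

Each step runs only if the previous one succeeded, so a failed merge does not leave a misleading index behind.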
Index the sorted.bam file using samtools index, which generates the binary index file required by genomic visualization tools. Finally, in the Integrative Genomics Viewer, open the sorted.bam file together with its index (.bai) file.
The application of a modified version of SCRAP to previously published sequencing datasets revealed a decrease in miRNA interactions with intron regions.
The reduction was observed after isolating high-confidence interactions through Peak_Calling with SCRAP.