Computational identification of microRNAs yields a high number of false-positive predictions. With its recent updates, mirMachine provides high sensitivity and specificity in predicting known and novel microRNAs. mirMachine has proved superior in sensitivity to the most recent microRNA prediction algorithms, and it is now fully automated and freely available, so anyone can use it by following the instructions provided.
mirMachine can predict the genome-wide distribution of putative microRNAs without being limited to data specific to particular tissues and conditions. Having this preliminary data available before an sRNA sequencing experiment would speed up the verification process for important microRNA genes. Before setting up mirMachine, download and install the software dependencies from their respective home sites using the links in the text manuscript.
Alternatively, an easier and faster way to install the software dependencies is to use Conda with the commands provided in the text manuscript. Download the latest version of the mirMachine scripts and the mirMachine submission script from GitHub. Then add the scripts' path to the PATH.
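A quick way to confirm that the PATH step worked is to ask the system whether each required executable can be found. The following is a minimal sketch, not part of mirMachine; the tool names in the list are assumptions for illustration and should be replaced with the dependencies listed in the text manuscript.

```python
import shutil

# Executable names are assumptions for illustration only; substitute the
# dependencies and script names listed in the text manuscript.
ASSUMED_TOOLS = ["perl", "blastn", "makeblastdb"]


def find_missing(tools):
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(ASSUMED_TOOLS)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All listed tools were found on PATH.")
```

A tool reported as missing usually means that either the installation or the PATH entry needs another look before running the test data.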
Make sure that mirMachine and its dependencies have been installed correctly by running mirMachine on the test data from GitHub. Check the job status on the screen through the standard output stream. Once the "All finished" text appears, check the output files.
Check the hairpins.tbl.out.tbl output files. Note the mature miRNA, pre-miRNA, and miRNA* (star) sequences in the second, third, and sixth columns. Find the location of the predicted miRNAs on the chromosomes at the end of each line.
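The column layout just described can also be read programmatically. The sketch below assumes the table is tab-delimited and that the location is the last field on each line; both are assumptions, and the example row is made up for illustration rather than taken from real output.

```python
from typing import NamedTuple


class HairpinRow(NamedTuple):
    mature: str      # second column
    precursor: str   # third column (pre-miRNA)
    star: str        # sixth column (miRNA*)
    location: str    # last field on the line


def parse_hairpin_line(line: str) -> HairpinRow:
    """Split one table line and pick out the columns named in the protocol.

    Assumes tab-delimited fields; adjust the separator if the file differs.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 7:
        raise ValueError(f"expected at least 7 fields, got {len(fields)}")
    return HairpinRow(
        mature=fields[1],
        precursor=fields[2],
        star=fields[5],
        location=fields[-1],
    )


if __name__ == "__main__":
    # Made-up row, for illustration only.
    example = "id1\tUGACAGAAGAGAGUGAGCAC\tPRECURSORSEQ\tx\ty\tGCUCACUGCUCUUUCUGUCA\tchr5A:100-220"
    print(parse_hairpin_line(example))
```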
Then check the log files for program outputs and warnings. For homology-based identification, run mirMachine using the bash script and check the predicted miRNAs. Find the output file for the mature miRNA and pre-miRNA FASTA sequences, as well as the hairpin log file.
For novel miRNA identification, pre-process the sRNA-seq FASTQ files into the proper FASTA format. Trim the adapters and remove low-quality reads, such as those containing N, because pre-processing of the sRNA-seq reads is not part of mirMachine. Choose a suitable trimming tool and trimming parameters for the data submitted.
Convert the FASTQ file into a FASTA file to use as the input file. If an abundance table file is provided, use the modification script provided with the mirMachine scripts to convert the table into a proper FASTA input. Then run mirMachine using the bash script.
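The FASTQ-to-FASTA conversion, together with the removal of reads containing N, can be sketched as follows. This is a minimal illustration that assumes already-trimmed reads and standard four-line FASTQ records; it is not the modification script shipped with mirMachine, and the exact FASTA header format the pipeline expects should be checked against the text manuscript.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fastq(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (read_id, sequence) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        if not header.strip():
            continue  # skip blank lines between or after records
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        sequence = handle.readline().strip()
        handle.readline()  # '+' separator line, ignored
        handle.readline()  # quality line, ignored
        yield header[1:].strip(), sequence


def fastq_to_fasta(src: TextIO, dst: TextIO) -> int:
    """Write reads without N to dst in FASTA format; return the number written."""
    written = 0
    for read_id, sequence in read_fastq(src):
        if "N" in sequence.upper():
            continue  # drop reads containing ambiguous bases
        dst.write(f">{read_id}\n{sequence}\n")
        written += 1
    return written


if __name__ == "__main__":
    # Two made-up reads; the second contains N and is dropped.
    fastq = io.StringIO("@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACNTACGT\n+\nIIIIIIII\n")
    fasta = io.StringIO()
    fastq_to_fasta(fastq, fasta)
    print(fasta.getvalue(), end="")
```

For real files, open the input and output paths with `open(...)` and pass the file handles in place of the in-memory streams.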
Check the predicted miRNAs. Find the output file for the predicted miRNA and pre-miRNA FASTA sequences and the hairpin log file. Set the -db option to a BLAST database to skip building the reference database within the pipeline.
Set the -m option to the number of mismatches allowed and the -n option to the number of hits to eliminate after alignment. Change these values based on the species. Use the -long option to assess the secondary structures for the suspect list, and the -s option to activate novel miRNA prediction based on sRNA-seq data.
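The option flags named in this part of the protocol, including the read-length and RPM options that the protocol names next, can be collected into one argument list before launching the pipeline. The sketch below only builds and prints the command; it does not run anything. The script name and all values are placeholders assumed for illustration, and the flags for the genome and input files are deliberately left out because they are not named here; take them from the text manuscript.

```python
import shlex
from typing import List, Optional


def build_command(
    script: str,
    db: Optional[str] = None,
    mismatches: Optional[int] = None,
    hits: Optional[int] = None,
    long_mode: bool = False,
    srna_mode: bool = False,
    lmax: Optional[int] = None,
    lmin: Optional[int] = None,
    rpm: Optional[float] = None,
) -> List[str]:
    """Assemble an argument list from the options named in the protocol."""
    cmd = [script]
    valued_flags = (
        ("-db", db),            # existing BLAST database
        ("-m", mismatches),     # mismatches allowed
        ("-n", hits),           # hits to eliminate after alignment
        ("-lmax", lmax),        # maximum sRNA-seq read length
        ("-lmin", lmin),        # minimum sRNA-seq read length
        ("-rpm", rpm),          # reads-per-million threshold
    )
    for flag, value in valued_flags:
        if value is not None:
            cmd += [flag, str(value)]
    if long_mode:
        cmd.append("-long")     # assess secondary structures for the suspect list
    if srna_mode:
        cmd.append("-s")        # sRNA-seq-based novel miRNA prediction
    return cmd


if __name__ == "__main__":
    # Placeholder script name and hypothetical values, for illustration only.
    command = build_command(
        "mirMachine_submit.bash", mismatches=1, hits=20, srna_mode=True, lmax=24, lmin=20
    )
    print(shlex.join(command))
```

Keeping the command as a list makes it easy to record the exact settings used for each species alongside the output files.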
Set the -lmax option to the maximum length of the sRNA-seq reads to include in the screening and the -lmin option to the minimum length of the sRNA-seq reads to include in the screening. Use the -rpm option to set the reads-per-million threshold.
In the representative analysis, the distribution of miRNA families from chromosome 5A of the IWGSC wheat genome is displayed. The most highly represented miRNA family was miR9666, with 18 identified miRNAs. The performance of mirMachine on Arabidopsis thaliana and wheat was compared with that of miRDP2.
Comparisons of sensitivity and the number of true positives showed that mirMachine outperformed miRDP2 on the Arabidopsis data. For the wheat data, mirMachine homology-based prediction with expression evidence provided better sensitivity than miRDP2. For both genomes, miRDP2 predicted a higher number of true positives than the sRNA-seq-based and homology-based predictions.
Setting up the required applications could be burdensome for an inexperienced user. Following this video tutorial would help. Visualization of the installation process will encourage non-computational researchers to get into some basic computing.
Additionally, investigating the output files in the video will definitely help the users interpret their results.