1. Software dependencies and installation
<ol>
	<li>Install software dependencies from their home site or using conda.
	<ol>
		<li>Download and install Perl, if it is not already installed, from its home site (https://www.perl.org/get.html). 
		NOTE: The representative results were generated using Perl v5.32.0.</li>
		<li>Download BLAST+, an alignment program, from its home site (https://www.ncbi.nlm.nih.gov/books/NBK279671/), where it is available as executables and as source code. 
		NOTE: The representative results were generated using BLAST 2.6.0+.</li>
		<li>Install the precompiled package of RNAfold from https://www.tbi.univie.ac.at/RNA/.</li>
		<li>Alternatively, install these software packages using the following conda commands: i) conda install -c bioconda blast; ii) conda install -c bioconda viennarna.</li>
	</ol>
	</li>
</ol>
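A quick way to confirm that the dependencies from section 1 are reachable is sketched below. It is a minimal check and not part of mirMachine: `check_tools` is a hypothetical helper, and the executable names (`perl`, `blastn` and `makeblastdb` from BLAST+, `RNAfold` from the ViennaRNA package) are assumptions about what the scripts call.

```shell
# Report which of the given executables can be found on PATH.
# Prints one "found:" or "missing:" line per tool; returns 1 if any is missing.
check_tools() {
    local status=0 tool
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found: $tool"
        else
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Assumed list of executables; adjust it if the scripts call other programs.
check_tools perl blastn makeblastdb RNAfold \
    || echo "Install the missing tools before running mirMachine."
```

If every line starts with `found:`, the installed versions can then be compared with the ones noted above using `perl -v`, `blastn -version`, and `RNAfold --version`.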
2. The mirMachine setup and testing
<ol>
	<li>Download the latest version of the mirMachine scripts and the mirMachine submission script from GitHub (https://github.com/hbusra/mirMachine.git), and then add the scripts directory to the PATH environment variable.</li>
	<li>Use the test data provided in the GitHub repository to make sure that mirMachine and all its dependencies have been installed correctly.</li>
	<li>Run mirMachine on the test data as shown below. 
	bash mirMachine_submit.sh -f iwgsc_v2_chr5A.fasta -i mature_high_conf_v22_1.fa.filtered.fasta -n 10 
	NOTE: Set the -n option to 10 because the test data contain only one chromosome of the wheat genome. By default, the -n option is set to 20.</li>
	<li>Check the hairpins.tbl.out.tbl output files for the predicted mature miRNAs, their predicted precursors, and their locations on the chromosomes.</li>
	<li>Check the log files for the program outputs and warnings.</li>
</ol>
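The download and PATH steps of section 2 could look roughly like the sketch below. The install location (`MIRMACHINE_DIR`) and the assumption that the scripts sit at the top level of the cloned repository are illustrative only, and `add_to_path` is a hypothetical helper rather than something mirMachine provides.

```shell
# Prepend a directory to PATH unless it is already listed there.
add_to_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                          # already present, nothing to do
        *) PATH="$1:$PATH"; export PATH ;;
    esac
}

# Hypothetical install location: a "mirMachine" directory in the current folder.
MIRMACHINE_DIR="$PWD/mirMachine"

# Run once to fetch the scripts (needs network access):
# git clone https://github.com/hbusra/mirMachine.git "$MIRMACHINE_DIR"

# Assumes the scripts live at the top level of the clone; adjust if they do not.
add_to_path "$MIRMACHINE_DIR"
```

To keep the setting across sessions, the `add_to_path` call (or an equivalent `export PATH=...` line) can be placed in the shell startup file.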
3. Homology-based miRNA identification
<ol>
	<li>Run mirMachine using the bash script shown below: 
	bash mirMachine_submit.sh -f $genome_file -i $input_file -m $mismatches -n $number_of_hits</li>
	<li>Check the predicted miRNAs in the following output files: $input_file.results.tbl.hairpins.tbl.out.tbl (predicted miRNAs), $input_file.results.tbl.hairpins.fsa (pre-miRNA FASTA sequences), and $input_file.results.tbl.hairpins.log (hairpin log file).</li>
</ol>
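The homology-based run can be wrapped so that missing inputs are caught before the pipeline starts. This is only a sketch: `run_homology` is a hypothetical wrapper, the fallback values (1 mismatch, 20 hits) are the defaults stated in this protocol, and the location where the output table is written is assumed to follow the file name given above.

```shell
# Run the homology-based prediction after checking that both inputs exist.
# Usage: run_homology GENOME_FASTA MIRNA_FASTA [MISMATCHES] [NUMBER_OF_HITS]
run_homology() {
    local genome_file=$1 input_file=$2
    local mismatches=${3:-1} number_of_hits=${4:-20}
    local f out

    for f in "$genome_file" "$input_file"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done

    bash mirMachine_submit.sh -f "$genome_file" -i "$input_file" \
        -m "$mismatches" -n "$number_of_hits" || return 1

    # Assumed output location, based on the file name given in the protocol.
    out="$input_file.results.tbl.hairpins.tbl.out.tbl"
    if [ -s "$out" ]; then
        echo "predicted miRNAs: $out"
    else
        echo "no prediction table found at $out; check the log files" >&2
    fi
}

# Example call (file names are placeholders):
# run_homology genome.fasta known_mirnas.fasta 1 20
```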
4. Novel miRNA identification
<ol>
	<li>Preprocess the sRNA-seq FASTQ files into proper FASTA format: trim adaptors if needed; remove low-quality reads entirely rather than trimming them; remove reads containing N; and convert the FASTQ file into a FASTA file ($input_file).</li>
	<li>Run mirMachine using the bash script shown below. 
	bash mirMachine_submit.sh -f $genome_file -i $input_file -n $number_of_hits -sRNAseq -lmax $lmax -lmin $lmin -rpm $rpm 
	NOTE: $mismatches is set to 0 for sRNA-seq-based predictions.</li>
	<li>Check the predicted miRNAs in the following output files: $input_file.results.tbl.hairpins.tbl.out.tbl (predicted miRNAs), $input_file.results.tbl.hairpins.fsa (pre-miRNA FASTA sequences), and $input_file.results.tbl.hairpins.log (hairpin log file).</li>
</ol>
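The last two preprocessing operations in section 4 (removing reads that contain N and converting FASTQ to FASTA) can be illustrated with a short awk filter. The sketch assumes standard four-line FASTQ records and that adaptor trimming and quality filtering were already done with a dedicated tool; `fastq_to_fasta` is a hypothetical helper, and any specific header format that mirMachine may expect is not handled here.

```shell
# Convert four-line FASTQ records to FASTA, dropping reads that contain N.
# Usage: fastq_to_fasta reads.fastq > reads.fasta
fastq_to_fasta() {
    awk 'NR % 4 == 1 { header = substr($0, 2) }      # strip the leading "@"
         NR % 4 == 2 { if ($0 !~ /[Nn]/)             # keep only N-free reads
                           print ">" header "\n" $0 }' "$1"
}

# Example call (file names are placeholders):
# fastq_to_fasta sample.trimmed.fastq > sample.fasta
```

Dropping whole reads instead of masking or trimming them matches the protocol's instruction to remove low-quality and N-containing reads rather than trim them.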
5. Advanced parameters
NOTE: Defaults are defined for all parameters except the genome file and the input miRNA file.
<ol>
	<li>Set the -db option to an existing BLAST database to skip building the reference database within the pipeline.</li>
	<li>Set the -m option to the number of mismatches allowed. 
	NOTE: By default, the -m option is set to 1 for homology-based predictions and 0 for sRNA-seq-based predictions.</li>
	<li>Set the -n option to the number of hits to eliminate after alignment (default: 20). Change this value based on the species.</li>
	<li>Use the -long option to assess the secondary structures for the suspect list.</li>
	<li>Use the -s option to activate the novel miRNA prediction based on sRNA-seq data.</li>
	<li>Set the -lmax option to the maximum length of the sRNA-seq reads to include in the screening.</li>
	<li>Set the -lmin option to the minimum length of the sRNA-seq reads to include in the screening.</li>
	<li>Use the -rpm option to set the Reads Per Million (RPM) threshold. 
	NOTE: For advanced parameters such as the length of pri-miRNAs/pre-miRNAs, experienced users are encouraged to modify the scripts to suit their research. Additionally, users who intend to skip some steps or prefer to use modified outputs can edit the submission script by adding # at the beginning of the lines to be skipped.</li>
</ol>
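To help choose a value for the -rpm option, the sketch below shows the usual reads-per-million normalization: the count of a read divided by the total number of reads in the library, multiplied by one million. Whether mirMachine uses exactly this denominator is an assumption, and `reads_per_million` is a hypothetical helper.

```shell
# Reads per million: COUNT * 1,000,000 / TOTAL, printed with two decimals.
# Usage: reads_per_million COUNT TOTAL_READS
reads_per_million() {
    awk -v c="$1" -v t="$2" 'BEGIN {
        if (t <= 0) { print "NA"; exit }      # avoid division by zero
        printf "%.2f\n", c * 1000000 / t
    }'
}

# Example: a read seen 5 times in a library of 2,000,000 reads.
reads_per_million 5 2000000
```

With these example numbers the call prints 2.50, so such a read would pass an RPM threshold of 1 but not a threshold of 5.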

<ol>
	<li>Voinnet, O. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Origin,+biogenesis,+and+activity+of+plant+microRNAs.">Origin, biogenesis, and activity of plant microRNAs.</a> Cell. 136 (4), 669-687 (2009).</li><li>Budak, H., Akpinar, B. A. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Plant+miRNAs:+biogenesis,+organization+and+origins.">Plant miRNAs: biogenesis, organization and origins.</a> Functional &amp; Integrative Genomics. 15 (5), 523-531 (2015).</li><li>Lee, R. C., Feinbaum, R. L., Ambros, V. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+C.+elegans+heterochronic+gene+lin-4+encodes+small+RNAs+with+antisense+complementarity+to+lin-14.">The C. elegans heterochronic gene lin-4 encodes small RNAs with antisense complementarity to lin-14.</a> Cell. 75 (5), 843-854 (1993).</li><li>Zhang, L., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Exogenous+plant+MIR168a+specifically+targets+mammalian+LDLRAP1:+evidence+of+cross-kingdom+regulation+by+microRNA.">Exogenous plant MIR168a specifically targets mammalian LDLRAP1: evidence of cross-kingdom regulation by microRNA.</a> Cell Research. 22 (1), 107-126 (2012).</li><li>Pang, K. C., Frith, M. C., Mattick, J. S. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Rapid+evolution+of+noncoding+RNAs:+Lack+of+conservation+does+not+mean+lack+of+function.">Rapid evolution of noncoding RNAs: Lack of conservation does not mean lack of function.</a> Trends in Genetics. 22 (1), 1-5 (2006).</li><li>Guleria, P., Mahajan, M., Bhardwaj, J., Yadav, S. K. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Plant+small+RNAs:+biogenesis,+mode+of+action+and+their+roles+in+abiotic+stresses.">Plant small RNAs: biogenesis, mode of action and their roles in abiotic stresses.</a> Genomics, Proteomics and Bioinformatics. 9 (6), 183-199 (2011).</li><li>Jones-Rhoades, M. W., Bartel, D. P., Bartel, B. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MicroRNAs+and+their+regulatory+roles+in+plants.">MicroRNAs and their regulatory roles in plants.</a> Annual Review of Plant Biology. 57, 19-53 (2006).</li><li>Singh, A., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Plant+small+RNAs:+advancement+in+the+understanding+of+biogenesis+and+role+in+plant+development.">Plant small RNAs: advancement in the understanding of biogenesis and role in plant development.</a> Planta. 248 (3), 545-558 (2018).</li><li>Lucas, S. J., Budak, H. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Sorting+the+wheat+from+the+chaff:+identifying+miRNAs+in+genomic+survey+sequences+of+Triticum+aestivum+chromosome+1AL.">Sorting the wheat from the chaff: identifying miRNAs in genomic survey sequences of Triticum aestivum chromosome 1AL.</a> PloS One. 7 (7), 40859 (2012).</li><li>Li, S., Castillo-Gonz&#225;lez, C., Yu, B., Zhang, X. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+functions+of+plant+small+RNAs+in+development+and+in+stress+responses.">The functions of plant small RNAs in development and in stress responses.</a> Plant Journal. 90 (4), 654-670 (2017).</li><li>Lee, Y., Jeon, K., Lee, J. 
T., Kim, S., Kim, V. N. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MicroRNA+maturation:+Stepwise+processing+and+subcellular+localization.">MicroRNA maturation: Stepwise processing and subcellular localization.</a> EMBO Journal. 21 (17), 4663-4670 (2002).</li><li>Lee, Y., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MicroRNA+genes+are+transcribed+by+RNA+polymerase+II.">MicroRNA genes are transcribed by RNA polymerase II.</a> EMBO Journal. 23 (2), 4051-4060 (2004).</li><li>Bartel, D. P. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MicroRNAs:+Genomics,+biogenesis,+mechanism,+and+function.">MicroRNAs: Genomics, biogenesis, mechanism, and function.</a> Cell. 116 (2), 281-297 (2004).</li><li>Lee, Y., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+nuclear+RNase+III+Drosha+initiates+microRNA+processing.">The nuclear RNase III Drosha initiates microRNA processing.</a> Nature. 425 (6956), 415-419 (2003).</li><li>Meyers, B. C., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Criteria+for+annotation+of+plant+microRNAs.">Criteria for annotation of plant microRNAs.</a> Plant Cell. 20 (12), 3186-3190 (2008).</li><li>Sanei, M., Chen, X. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Mechanisms+of+microRNA+turnover.">Mechanisms of microRNA turnover.</a> Current Opinion in Plant Biology. 27, 199-206 (2015).</li><li>Li, J., Yang, Z., Yu, B., Liu, J., Chen, X. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Methylation+protects+miRNAs+and+siRNAs+from+a+3&#8242;-end+uridylation+activity+in+Arabidopsis.">Methylation protects miRNAs and siRNAs from a 3&#8242;-end uridylation activity in Arabidopsis.</a> Current Biology. 15 (16), 1501-1507 (2005).</li><li>Rogers, K., Chen, X. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Biogenesis,+turnover,+and+mode+of+action+of+plant+microRNAs.">Biogenesis, turnover, and mode of action of plant microRNAs.</a> Plant Cell. 25 (7), 2383-2399 (2013).</li><li>Axtell, M. J., Meyers, B. C. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Revisiting+criteria+for+plant+microRNA+annotation+in+the+Era+of+big+data.">Revisiting criteria for plant microRNA annotation in the Era of big data.</a> Plant Cell. 30 (2), 272-284 (2018).</li><li>Camacho, C., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=BLAST+:+architecture+and+applications.">BLAST+: architecture and applications.</a> BMC Bioinformatics. 10 (1), 421 (2009).</li><li>Markham, N. R. N., Zuker, M. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=UNAFold:+Software+for+nucleic+acid+folding+and+hybridization.">UNAFold: Software for nucleic acid folding and hybridization.</a> Methods in Molecular Biology. 453, 3-31 (2008).</li><li>Alptekin, B., Akpinar, B. A., Budak, H. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=A+comprehensive+prescription+for+plant+miRNA+identification.">A comprehensive prescription for plant miRNA identification.</a> Frontiers in Plant Science. 7, 2058 (2017).</li><li>Zhang, B., Pan, X., Cannon, C. H., Cobb, G. P., Anderson, T. A. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Conservation+and+divergence+of+plant+microRNA+genes.">Conservation and divergence of plant microRNA genes.</a> Plant Journal. 46 (2), 243-259 (2006).</li><li>Appels, R., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Shifting+the+limits+in+wheat+research+and+breeding+using+a+fully+annotated+reference+genome.">Shifting the limits in wheat research and breeding using a fully annotated reference genome.</a> Science. 361 (6403), 7191 (2018).</li><li>Wang, Y., Kuang, Z., Li, L., Yang, X. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=A+bioinformatics+pipeline+to+accurately+and+efficiently+analyze+the+microRNA+transcriptomes+in+plants.">A bioinformatics pipeline to accurately and efficiently analyze the microRNA transcriptomes in plants.</a> Journal of Visualized Experiments: JoVE. (155), e59864 (2020).</li><li>Kozomara, A., Griffiths-Jones, S. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MiRBase:+Annotating+high+confidence+microRNAs+using+deep+sequencing+data.">MiRBase: Annotating high confidence microRNAs using deep sequencing data.</a> Nucleic Acids Research. 42, 68-73 (2014).</li><li>Lorenz, R., et al. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=ViennaRNA+Package+2.0.">ViennaRNA Package 2.0.</a> Algorithms for Molecular Biology. 6 (1), 26 (2011).</li><li>Wicker, T., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Impact+of+transposable+elements+on+genome+structure+and+evolution+in+bread+wheat.">Impact of transposable elements on genome structure and evolution in bread wheat.</a> Genome Biology. 19 (1), 103 (2018).</li><li>Flavell, R. B., Bennett, M. D., Smith, J. B., Smith, D. B. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Genome+size+and+the+proportion+of+repeated+nucleotide+sequence+DNA+in+plants.">Genome size and the proportion of repeated nucleotide sequence DNA in plants.</a> Biochemical Genetics. 12 (4), 257-269 (1974).</li><li>Wicker, T., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=The+repetitive+landscape+of+the+5100+Mbp+barley+genome.">The repetitive landscape of the 5100 Mbp barley genome.</a> Mobile DNA. 8, 22 (2017).</li><li>Yang, Q., Ye, Q. A., Liu, Y. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Mechanism+of+siRNA+production+from+repetitive+DNA.">Mechanism of siRNA production from repetitive DNA.</a> Genes and Development. 29 (5), 526-537 (2015).</li><li>Lam, J. K. W., Chow, M. Y. T., Zhang, Y., Leung, S. W. S. 
<a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=siRNA+versus+miRNA+as+therapeutics+for+gene+silencing.">siRNA versus miRNA as therapeutics for gene silencing.</a> Molecular Therapy. Nucleic Acids. 4 (9), 252 (2015).</li><li>Bartel, B. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=MicroRNAs+directing+siRNA+biogenesis.">MicroRNAs directing siRNA biogenesis.</a> Nature Structural and Molecular Biology. 12 (7), 569-571 (2005).</li><li>Meng, Y., Shao, C., Wang, H., Chen, M. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Are+all+the+miRBase-registered+microRNAs+true?+A+structure-+and+expression-based+re-examination+in+plants.">Are all the miRBase-registered microRNAs true? A structure- and expression-based re-examination in plants.</a> RNA Biology. 9 (3), 249-253 (2012).</li><li>Berezikov, E., et al. <a target="_blank" href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed&cmd=Search&doptcmdl=Citation&defaultField=Title+Word&term=Evolutionary+flux+of+canonical+microRNAs+and+mirtrons+in+Drosophila.">Evolutionary flux of canonical microRNAs and mirtrons in Drosophila.</a> Nature Genetics. 42 (1), 6-9 (2010).</li></ol>

Our miRNA pipeline, SUmir, has been used to identify many plant miRNAs over the last decade. Here, we developed a new, fully automated, and freely available miRNA identification and annotation pipeline, mirMachine. A number of miRNA identification pipelines, including but not limited to the previous pipeline, depended on the UNAfold software<sup>21</sup>, which was once freely available but became commercial software over time. This new and fully automated mirMachine is ...

Advances in next-generation sequencing technologies have widened the understanding of RNA structures and regulatory elements, revealing functionally important non-coding RNAs (ncRNAs). Among the different types of ncRNAs, microRNAs (miRNAs) constitute a fundamental regulatory class of small RNAs with a length between 19 and 24 nucleotides in plants<sup>1,2</sup>. Since the discovery of the first miRNA in the nematode Caenorhabditis elegans<sup>3</sup>, the presence and functions of miRNAs have also been studied extensively in animal and plant genomes<sup>4,5,6</sup>. miRNAs function by targeting mRNAs for cleavage or translational repression<sup>7</sup>. Accumulating evidence has also shown that miRNAs are involved in a wide range of biological processes in plants, including growth and development<sup>8</sup>, self-biogenesis<sup>9</sup>, and several biotic and abiotic stress responses<sup>10</sup>.
In plants, miRNAs are initially processed from long primary transcripts called pri-miRNAs<sup>11</sup>. These pri-miRNAs, generated by RNA polymerase II inside the nucleus, are long transcripts that form an imperfect fold-back structure<sup>12</sup>. The pri-miRNAs later undergo a cleavage process to produce endogenous single-stranded (ss) hairpin precursors of miRNAs called pre-miRNAs<sup>11</sup>. The pre-miRNA forms a hairpin-like structure in which a single strand folds into a double-stranded structure, from which an miRNA duplex (miRNA/miRNA*) is excised<sup>13</sup>. A Dicer-like protein cuts both strands of the miRNA/miRNA* duplex, leaving 2-nucleotide 3' overhangs<sup>14,15</sup>. The miRNA duplex is methylated inside the nucleus, which protects the 3' end of the miRNA from degradation and uridylation activity<sup>16,17</sup>. A helicase unwinds the methylated miRNA duplex after export and exposes the mature miRNA to the RNA-induced silencing complex (RISC) in the cytosol<sup>18</sup>. One strand of the duplex, the mature miRNA, is incorporated into RISC, whereas the other strand, miRNA*, is degraded. The miRNA-RISC complex binds to the target sequence, leading to either mRNA degradation in the case of full complementarity or translational repression in the case of partial complementarity<sup>13</sup>.
Based on the expression and biogenesis features, guidelines for miRNA annotation have been described<sup>15,19</sup>. Following these guidelines, Lucas and Budak developed the SUmir pipeline to perform homology-based in silico miRNA identification in plants<sup>9</sup>. The SUmir pipeline was composed of two scripts: SUmirFind and SUmirFold. SUmirFind performs similarity searches against known miRNA datasets through National Center for Biotechnology Information (NCBI) Basic Local Alignment Search Tool (BLAST) screening, with modified parameters to include only hits with 2 or fewer mismatches and to avoid bias towards shorter hits (blastn-short -ungapped -penalty -1 -reward 1). SUmirFold evaluates the secondary structure of the putative miRNA sequences from the BLAST<sup>20</sup> results using UNAfold<sup>21</sup>. SUmirFold differentiates miRNAs from small interfering RNAs by identifying the characteristics of the hairpin structure. Moreover, it differentiates miRNAs from other ssRNAs such as tRNA and rRNA using two parameters: a minimum fold energy index (MFEI) &gt; 0.67 and a GC content of 24-71%. This pipeline has recently been updated by adding two additional steps to (i) increase sensitivity, (ii) increase annotation accuracy, and (iii) provide the genomic distribution of the predicted miRNA genes<sup>22</sup>. Given the high conservation of plant miRNA sequences<sup>23</sup>, this pipeline was originally designed for homology-based miRNA prediction. Novel miRNAs, however, could not be accurately identified with this analysis because it relied heavily on the sequence conservation of miRNAs between closely related species.
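The two structural thresholds mentioned above (fold energy index above 0.67 and GC content of 24-71%) can be made concrete with a short calculation. The sketch below assumes the commonly used definitions, adjusted MFE (AMFE) = -MFE / length × 100 and MFEI = AMFE / GC%, which may differ in detail from what SUmirFold or mirMachine implement. `hairpin_stats` is a hypothetical helper, and the MFE value (in kcal/mol) would normally come from a folding tool such as RNAfold.

```shell
# Print "<GC%> <MFEI> <pass|fail>" for a hairpin sequence and its MFE.
# Usage: hairpin_stats MFE SEQUENCE      (MFE in kcal/mol, usually negative)
hairpin_stats() {
    printf '%s %s\n' "$1" "$2" | awk '{
        mfe = $1
        seq = toupper($2)
        len = length(seq)
        gc  = gsub(/[GC]/, "", seq)           # gsub returns the number of G/C bases
        if (len == 0 || gc == 0) { print "NA NA fail"; next }
        gc_pct = 100 * gc / len
        amfe   = -mfe / len * 100             # adjusted MFE per 100 nt, sign flipped
        mfei   = amfe / gc_pct
        verdict = (mfei > 0.67 && gc_pct >= 24 && gc_pct <= 71) ? "pass" : "fail"
        printf "%.2f %.2f %s\n", gc_pct, mfei, verdict
    }'
}

# Toy example: an 8-nt sequence with 4 G/C bases and an MFE of -3.2 kcal/mol.
hairpin_stats -3.2 GGCCAATT
```

For the toy input, GC content is 50%, AMFE is 40, and MFEI is 0.80, so the line printed is `50.00 0.80 pass`. Real pre-miRNA candidates are far longer than this illustrative sequence.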
This paper presents a new and fully automated miRNA pipeline, mirMachine, that (1) can identify known and novel miRNAs more accurately (for example, the pipeline now uses sRNA-seq-based novel miRNA predictions as well as homology-based miRNA identification) and (2) is fully automated and freely available. The outputs also include the genomic distributions of the predicted miRNAs. mirMachine was tested for both homology-based and sRNA-seq-based predictions in the wheat and Arabidopsis genomes. Although initially released as free software, UNAfold became commercial software in the last decade. With this upgrade, the secondary structure prediction tool was switched from UNAfold to RNAfold so that mirMachine can be freely available. Users can now execute a short submission script to run the fully automated mirMachine pipeline (examples are provided at https://github.com/hbusra/mirMachine.git).

<table><tbody>
<tr><td>https://www.ncbi.nlm.nih.gov/books/NBK279671/</td><td></td><td></td><td>Blast+</td></tr>
<tr><td>https://github.com/hbusra/mirMachine.git</td><td></td><td></td><td>mirMachine submission script</td></tr>
<tr><td>https://www.perl.org/get.html</td><td></td><td></td><td>Perl</td></tr>
<tr><td>https://www.tbi.univie.ac.at/RNA/</td><td></td><td></td><td>RNAfold</td></tr>
<tr><td><em>Arabidopsis</em> TAIR10</td><td></td><td></td><td></td></tr>
<tr><td><em>Triticum aestivum</em> (wheat, IWGSC RefSeq v2)</td><td></td><td></td><td></td></tr>
</tbody></table>

mirMachine: A One-Stop Shop for Plant miRNA Annotation

Among the different types of noncoding RNAs, microRNAs (miRNAs) have arguably been in the spotlight over the last decade. As post-transcriptional regulators of gene expression, miRNAs play key roles in various cellular pathways, including development and responses to biotic and abiotic stresses, such as drought and diseases. The availability of high-quality reference genome sequences has enabled the identification and annotation of miRNAs in several plant species, where miRNA sequences are highly conserved. Because computational miRNA identification and annotation are error-prone, homology-based predictions increase prediction accuracy. Over the last decade, we developed and improved the miRNA annotation pipeline SUmir, which has been used for several plant genomes. 
This study presents a new, fully automated miRNA pipeline, mirMachine (miRNA Machine), which builds on the previous pipeline by (i) adding a filtering step on the secondary structure predictions, (ii) making the pipeline fully automated, and (iii) introducing new options to predict either known miRNAs based on homology or novel miRNAs based on small RNA sequencing reads. mirMachine was tested using The Arabidopsis Information Resource (TAIR10) release of the Arabidopsis genome and the International Wheat Genome Sequencing Consortium (IWGSC) wheat reference genome v2.

The miRNA pipeline described above, mirMachine, was applied to the test data for a fast evaluation of its performance. Only the high-confidence plant miRNAs deposited in miRBase v22.1 were screened against chromosome 5A of the IWGSC wheat RefSeq genome v2<sup>24</sup>. mirMachine_find returned 312 hits for the nonredundant list of 189 high-confidence miRNAs with a maximum of 1 mismatch allowed (Table 1). mirMachine_fold classified 49 of them as putative miRNAs depending on ...


Herein, we present a new and fully automated miRNA pipeline, mirMachine, that (1) can identify known and novel miRNAs more accurately and (2) is fully automated and freely available. Users can now execute a short submission script to run the fully automated mirMachine pipeline.


JoVE Journal


Crop Improvement and Genetics Research Unit, CA, USA.

Video: mirMachine: A One-Stop Shop for Plant miRNA Annotation

Isolating, Sequencing and Analyzing Extracellular MicroRNAs from Human Mesenchymal Stem Cells

Extracellular and circulating RNAs (exRNA) are produced by many cell types of the body and exist in numerous bodily fluids such as saliva, plasma, serum, milk and urine. One subset of these RNAs are the posttranscriptional regulators &#8211; microRNAs (miRNAs). To delineate the miRNAs produced by specific cell types, in vitro culture systems can be used to harvest and profile exRNAs derived from one subset of cells. The secreted factors of mesenchymal stem cells are implicated in alleviating numerous diseases and is used as the in vitro model system here. This paper describes the process of collection, purification of small RNA and library generation to sequence extracellular miRNAs. ExRNAs from culture media differ from cellular RNA by being low RNA input samples, which calls for optimized procedures. This protocol provides a comprehensive guide to small exRNA sequencing from culture media, showing quality control checkpoints at each step during exRNA purification and sequencing.

This protocol demonstrates how to purify extracellular microRNAs from cell culture media for small RNA library construction and next generation sequencing. Various quality control checkpoints are described to allow readers to understand what to expect when working with low input samples like exRNAs.

Extracellular and circulating RNAs (exRNA) are produced by many cell types of the body and exist in numerous bodily fluids such as saliva, plasma, serum, ...

Genetics

Generating Homo- and Heterografts Between Watermelon and Bottle Gourd for the Study of Cold-responsive MicroRNAs

MicroRNAs (miRNAs) are endogenous small non-coding RNAs of about 20 - 24 nt, known to play important roles in plant development and adaptation. There is an accumulating evidence showing that the expressions of certain miRNAs are altered when grafting, an agricultural practice commonly used by farmers to improve crop tolerance to biotic and abiotic stresses. Bottle gourd is an inherently climate-resilient crop compared to many other major cucurbits, including watermelon, rendering it one of the most widely used rootstocks for the latter. The recent advancement of high-throughput sequencing technologies has provided great opportunities to investigate cold-responsive miRNAs and their contributions to heterograft advantages; yet, adequate experimental procedures are a prerequisite for this purpose. Here, we present a detailed protocol for efficiently generating homo- and heterografts between the cold-susceptible watermelon and the cold-tolerant bottle gourd, in addition to methods of tissue sampling, data generation, and data analysis. The presented methods are also useful for other plant-grafting systems, to interrogate miRNA regulations under various environmental stresses, such as heat, drought, and salinity.

Here we present a detailed protocol for efficiently making homo- and heterografts between watermelon and bottle gourd, in addition to methods of tissue sampling, data generation, and data analysis, for the investigation of cold-responsive microRNAs.

MicroRNAs (miRNAs) are endogenous small non-coding RNAs of about 20 - 24 nt, known to play important roles in plant development and adaptation. There is ...

Environment

A Bioinformatics Pipeline to Accurately and Efficiently Analyze the MicroRNA Transcriptomes in Plants

MicroRNAs (miRNAs) are 20- to 24-nucleotide (nt) endogenous small RNAs (sRNAs) extensively existing in plants and animals that play potent roles in regulating gene expression at the post-transcriptional level. Sequencing sRNA libraries by Next Generation Sequencing (NGS) methods has been widely employed to identify and analyze miRNA transcriptomes in the last decade, resulting in a rapid increase of miRNA discovery. However, two major challenges arise in plant miRNA annotation due to increasing depth of sequenced sRNA libraries as well as the size and complexity of plant genomes. First, many other types of sRNAs, in particular, short interfering RNAs (siRNAs) from sRNA libraries, are erroneously annotated as miRNAs by many computational tools. Second, it becomes an extremely time-consuming process for analyzing miRNA transcriptomes in plant species with large and complex genomes. To overcome these challenges, we recently upgraded miRDeep-P (a popular tool for miRNA transcriptome analyses) to miRDeep-P2 (miRDP2 for short) by employing a new filtering strategy, overhauling the scoring algorithm and incorporating newly updated plant miRNA annotation criteria. We tested miRDP2 against sequenced sRNA populations in five representative plants with increasing genomic complexity, including Arabidopsis, rice, tomato, maize and wheat. The results indicate that miRDP2 processed these tasks with very high efficiency. In addition, miRDP2 outperformed other prediction tools regarding sensitivity and accuracy. Taken together, our results demonstrate miRDP2 as a fast and accurate tool for analyzing plant miRNA transcriptomes, therefore a useful tool in helping the community better annotate miRNAs in plants.

A bioinformatics pipeline, namely miRDeep-P2 (miRDP2 for short), with updated plant miRNA criteria and an overhauled algorithm, could accurately and efficiently analyze microRNA transcriptomes in plants, especially for species with complex and large genomes.

MicroRNAs (miRNAs) are 20- to 24-nucleotide (nt) endogenous small RNAs (sRNAs) extensively existing in plants and animals that play potent roles in ...

A Complete Pipeline for Isolating and Sequencing MicroRNAs, and Analyzing Them Using Open Source Tools

Half of all human transcripts are thought to be regulated by microRNAs. Therefore, quantifying microRNA expression can reveal underlying mechanisms in disease states and provide therapeutic targets and biomarkers. Here, we detail how to accurately quantify microRNAs. Briefly, this method describes isolating microRNAs, ligating them to adaptors suitable for high-throughput sequencing, amplifying the final products, and preparing a sample library. Then, we explain how to align the obtained sequencing reads to microRNA hairpins, and quantify, normalize, and calculate their differential expression. Versatile and robust, this combined experimental workflow and bioinformatic analysis enables users to begin with tissue extraction and finish with microRNA quantification.

Here, we describe a step-by-step strategy for isolating small RNAs, enriching for microRNAs, and preparing samples for high-throughput sequencing. We then describe how to process sequence reads and align them to microRNAs, using open source tools.

Improving Small RNA-seq: Less Bias and Better Detection of 2'-O-Methyl RNAs

The study of small RNAs (sRNAs) by next-generation sequencing (NGS) is challenged by bias issues during library preparation. Several types of sRNA such as plant microRNAs (miRNAs) carry a 2'-O-methyl (2'-OMe) modification at their 3' terminal nucleotide. This modification adds another difficulty as it inhibits 3' adapter ligation. We previously demonstrated that modified versions of the 'TruSeq (TS)' protocol have less bias and an improved detection of 2'-OMe RNAs. Here we describe in detail protocol 'TS5', which showed the best overall performance. TS5 can be followed either using homemade reagents or reagents from the TS kit, with equal performance.

We present a detailed small RNA library preparation protocol with less bias than standard methods and increased sensitivity for 2'-O-methyl RNAs. This protocol can be followed using homemade reagents to save cost or using kits for convenience.

Characterization of Functionally Associated miRNAs in Glioblastoma and their Engineering into Artificial Clusters for Gene Therapy

The biological relevance of microRNAs (miRNAs) in health and disease relies significantly on specific combinations of many simultaneously deregulated miRNAs rather than on the action of a single miRNA. Characterizing these specific miRNA modules is a fundamental step in maximizing their use in therapy, because their combinatorial attributes can be practically exploited. Described here is a method to define a specific miRNA signature relevant to the control of oncogenic chromatin repressors in glioblastoma. The approach first defines a general group of miRNAs that are deregulated in tumors in comparison to normal tissue. The analysis is further refined by differential culture conditions, highlighting a subgroup of miRNAs that are co-expressed during specific cellular states. Finally, the miRNAs that satisfy these filters are combined into an artificial polycistronic transgene, based on a scaffold of naturally existing miRNA cluster genes, which is then used to overexpress these miRNA modules in recipient cells.

Described here is a protocol for characterizing modules of biologically synergistic miRNAs and their assembly into short transgenes, which allows simultaneous overexpression for gene therapy applications.

RNA Blot Analysis for the Detection and Quantification of Plant MicroRNAs

MicroRNAs (miRNAs) are a class of endogenously expressed non-coding, ~21 nt small RNAs involved in the regulation of gene expression in both plants and animals. Most miRNAs act as negative switches of gene expression targeting key genes. In plants, primary miRNA (pri-miRNA) transcripts are generated by RNA polymerase II, and they form stable stem-loop structures of varying lengths called pre-miRNAs. An endonuclease, Dicer-like1, processes the pre-miRNAs into miRNA-miRNA* duplexes. One of the strands from the miRNA-miRNA* duplex is selected and loaded onto the Argonaute 1 protein or its homologs to mediate the cleavage of target mRNAs. Although miRNAs are key signaling molecules, their detection is often carried out by less than optimal PCR-based methods instead of a sensitive northern blot analysis. We describe a simple, reliable, and extremely sensitive northern method that is ideal for the quantification of miRNA levels from virtually any plant tissue. Additionally, this method can be used to confirm the size, stability, and abundance of miRNAs and their precursors.

This method demonstrates use of the northern hybridization technique to detect miRNAs from total RNA extract.

A Reporter Assay to Analyze Intronic microRNA Maturation in Mammalian Cells

MicroRNAs (miRNAs) are short RNA molecules that are widespread in eukaryotes. Most miRNAs are transcribed from introns, and their maturation involves different RNA-binding proteins in the nucleus. Mature miRNAs frequently mediate gene silencing, and this has become an important tool for comprehending post-transcriptional events. Besides that, it can be explored as a promising methodology for gene therapies. However, there is currently a lack of direct methods for assessing miRNA expression in mammalian cell cultures. Here, we describe an efficient and simple method that aids in determining miRNA biogenesis and maturation through confirmation of its interaction with target sequences. Also, this system allows the separation of exogenous miRNA maturation from its endogenous activity using a doxycycline-inducible promoter capable of controlling primary miRNA (pri-miRNA) transcription with high efficiency and low cost. This tool also allows modulation with RNA-binding proteins in a separate plasmid. In addition to its use with a variety of different miRNAs and their respective targets, it can be adapted to different cell lines, provided these are amenable to transfection.

We developed an intronic microRNA biogenesis reporter assay to be used in cells in vitro with four plasmids: one with intronic miRNA, one with the target, one to overexpress a regulatory protein, and one for Renilla luciferase. The miRNA was processed and could control luciferase expression by binding to the target sequence.

Isolation of microRNAs from Tick Ex Vivo Salivary Gland Cultures and Extracellular Vesicles

Ticks are important ectoparasites that can vector multiple pathogens. The salivary glands of ticks are essential for feeding, as their saliva contains many effectors with pharmaceutical properties that can diminish host immune responses and enhance pathogen transmission. One group of such effectors is microRNAs (miRNAs). miRNAs are short non-coding sequences that regulate host gene expression at the tick-host interface and within the organs of the tick. These small RNAs are transported in the tick saliva via extracellular vesicles (EVs), which serve inter- and intracellular communication. Vesicles containing miRNAs have been identified in the saliva of ticks. However, little is known about the roles and profiles of the miRNAs in tick salivary vesicles and glands. Furthermore, the study of vesicles and miRNAs in tick saliva requires tedious procedures to collect tick saliva. This protocol aims to develop and validate a method for isolating miRNAs from purified extracellular vesicles produced by ex vivo organ cultures. The materials and methodology needed to extract miRNAs from extracellular vesicles and tick salivary glands are described herein.

The present protocol describes the isolation of microRNAs from tick salivary glands and purified extracellular vesicles. This is a universal procedure that combines commonly used reagents and supplies. The method also allows the use of a small number of ticks, resulting in quality microRNAs that can be readily sequenced.

CRISPR Gene Editing Tool for MicroRNA Cluster Network Analysis

MicroRNAs (miRNAs) have emerged as important cellular regulators (tumor suppressors, pro-oncogenic factors) of cancer and metastasis. Most published studies focus on a single miRNA when characterizing the role of small RNAs in cancer. However, ~30% of human miRNA genes are organized in clustered units that are often co-expressed, indicating a complex and coordinated system of noncoding RNA regulation. A clearer understanding of how clustered miRNA networks function cooperatively to regulate tumor growth, cancer aggressiveness, and drug resistance is required before translating noncoding small RNAs to the clinic.
A high-throughput clustered regularly interspaced short palindromic repeats (CRISPR)-mediated gene editing procedure has been employed to study the oncogenic role of a genomic cluster of seven miRNA genes located within a locus spanning ~35,000 bp in the context of prostate cancer. For this approach, human cancer cell lines were infected with a lentivirus vector for doxycycline (DOX)-inducible Cas9 nuclease and grown in DOX-containing medium for 48 h. The cells were subsequently co-transfected with synthetic trans-activating CRISPR RNA (tracrRNA) complexed with genomic site-specific CRISPR RNA (crRNA) oligonucleotides to allow the rapid generation of cancer cell lines carrying the entire miRNA cluster deletion and individual or combination miRNA gene cluster deletions within a single experiment.
The advantages of this high-throughput gene editing system are the ability to avoid time-consuming DNA vector subcloning, the flexibility in transfecting cells with unique guide RNA combinations in a 24-well format, and the lower-cost PCR genotyping using crude cell lysates. Studies using this streamlined approach promise to uncover functional redundancies and synergistic/antagonistic interactions between miRNA cluster members, which will aid in characterizing the complex small noncoding RNA networks involved in human disease and better inform future therapeutic design.

This protocol describes a high-throughput clustered regularly interspaced short palindromic repeats (CRISPR) gene editing workflow for microRNA cluster network analysis that allows the rapid generation of a panel of genetically modified cell lines carrying unique miRNA cluster member deletion combinations as large as 35 kb within a single experiment.

mirMachine: A One-Stop Shop for Plant miRNA Annotation
