* These authors contributed equally
Translating ribosomes decode mRNA into peptides three nucleotides (one codon) at a time. Their movement along mRNA, captured by ribosome profiling, produces footprints that exhibit a characteristic triplet periodicity. This protocol describes how to use RiboCode to decipher this prominent feature from ribosome profiling data and identify actively translated open reading frames at the whole-transcriptome level.
Identification of open reading frames (ORFs), especially those that encode small peptides and are actively translated under specific physiological contexts, is critical for comprehensive annotation of context-dependent translatomes. Ribosome profiling, a technique for detecting the binding locations and densities of translating ribosomes on RNA, offers an avenue to rapidly discover where translation is occurring at the genome-wide scale. However, efficiently and comprehensively identifying the translating ORFs from ribosome profiling data is not a trivial bioinformatics task. Described here is an easy-to-use package, named RiboCode, designed to search for actively translating ORFs of any size from distorted and ambiguous signals in ribosome profiling data. Taking our previously published dataset as an example, this article provides step-by-step instructions for the entire RiboCode pipeline, from preprocessing of the raw data to interpretation of the final output files. Furthermore, for evaluating the translation rates of the annotated ORFs, procedures for visualization and quantification of ribosome densities on each ORF are also described in detail. In summary, the present article is a useful and timely guide for research fields related to translation, small ORFs, and peptides.
Recently, a growing body of studies has revealed widespread production of peptides translated from ORFs of both coding genes and genes previously annotated as noncoding, such as long noncoding RNAs (lncRNAs)1,2,3,4,5,6,7,8. These translated ORFs are regulated or induced by cells to respond to environmental changes, stress, and cell differentiation1,8,9,10,11,12,13. The translation products of some ORFs have been demonstrated to play important regulatory roles in diverse biological processes in development and physiology. For example, Chng et al.14 discovered a peptide hormone named Elabela (Ela, also known as Apela/Ende/Toddler), which is critical for cardiovascular development. Pauli et al. suggested that Ela also acts as a mitogen that promotes cell migration in the early fish embryo15. Magny et al. reported two micropeptides of less than 30 amino acids regulating calcium transport and affecting regular muscle contraction in the Drosophila heart10.
It remains unclear how many such peptides are encoded by the genome and whether they are biologically relevant. Therefore, systematic identification of these potentially coding ORFs is highly desirable. However, directly determining the products of these ORFs (i.e., protein or peptide) using traditional approaches such as evolutionary conservation16,17 and mass spectrometry18,19 is challenging because the detection efficiency of both approaches depends on the length, abundance, and amino acid composition of the produced proteins or peptides. The advent of ribosome profiling, a technique for identifying ribosome occupancy on mRNAs at nucleotide resolution, has provided a precise way to evaluate the coding potential of different transcripts3,20,21, irrespective of their length and composition. An important and frequently used feature for identifying actively translating ORFs using ribosome profiling is the three-nucleotide (3-nt) periodicity of ribosome footprints on mRNA from the start codon to the stop codon. However, ribosome profiling data often have several issues, including low and sparse sequencing reads along ORFs, high sequencing noise, and ribosomal RNA (rRNA) contamination. The distorted and ambiguous signals generated by such data weaken the 3-nt periodicity patterns of ribosome footprints on mRNA, which ultimately makes the identification of high-confidence translated ORFs difficult.
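To make the 3-nt periodicity concrete, the following minimal Python sketch tallies how footprint positions distribute across the three reading frames of an ORF. It assumes each footprint has already been reduced to a single P-site position expressed as a nucleotide offset from the first base of the start codon; the function name `frame_fractions` and the toy offsets are hypothetical and are not part of RiboCode.

```python
from collections import Counter


def frame_fractions(psite_offsets):
    """Return the fraction of P-site positions falling in frames 0, 1 and 2.

    psite_offsets: nucleotide offsets relative to the first base of the
    start codon (assumed input format, chosen for this illustration).
    """
    counts = Counter(offset % 3 for offset in psite_offsets)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts.get(frame, 0) / total for frame in range(3))


# Toy data: most P-sites sit on the first nucleotide of a codon (frame 0).
toy_offsets = [0, 3, 6, 9, 12, 1, 15, 18, 2, 21]
fractions = frame_fractions(toy_offsets)
print(fractions)  # (0.8, 0.1, 0.1)
```

A strongly translated ORF is expected to show a clear excess in one frame, whereas sparse or noisy data pull the three fractions toward equality, which is the difficulty described above.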
A package named "RiboCode" adopts a modified Wilcoxon signed-rank test and a P-value integration strategy to examine whether an ORF has significantly more in-frame ribosome-protected fragments (RPFs) than off-frame RPFs22. It was demonstrated to be highly efficient, sensitive, and accurate for de novo annotation of the translatome in simulated and real ribosome profiling data. Here, we describe how to use this tool to detect potential translating ORFs from the raw ribosome profiling sequencing datasets generated by a previous study23. These datasets had been used to explore the function of EIF3 subunit "E" (EIF3E) in translation by comparing the ribosome occupancy profiles of MCF-10A cells transfected with control (si-Ctrl) and EIF3E (si-eIF3e) small-interfering RNAs (siRNAs). By applying RiboCode to these example datasets, we detected 5,633 novel ORFs potentially encoding small peptides or proteins. These ORFs were categorized into various types based on their locations relative to the coding regions, including upstream ORFs (uORFs), downstream ORFs (dORFs), overlapped ORFs, ORFs from novel protein-coding genes (novel PCGs), and ORFs from novel nonprotein-coding genes (novel NonPCGs). The RPF read densities on uORFs were significantly increased in EIF3E-deficient cells compared to control cells, which might be at least partially caused by the enrichment of actively translating ribosomes. The localized ribosome accumulation in the region from the 25th to 75th codon of EIF3E-deficient cells indicated a blockage of translation elongation in the early stage. This protocol also shows how to visualize the RPF density of the desired region for examining the 3-nt periodicity patterns of ribosome footprints on identified ORFs. These analyses demonstrate the powerful role of RiboCode in identifying translating ORFs and studying the regulation of translation.
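The statistical idea can be illustrated with a short Python sketch. It is a simplified stand-in rather than RiboCode's code: it runs a standard one-sided Wilcoxon signed-rank test (normal approximation with tie correction, not the modified version the package uses) on per-codon counts of in-frame versus off-frame RPFs, then merges the two resulting P-values with Fisher's method. Fisher's method is an assumption made only for illustration, since the text above does not name the integration method. All function names and counts are hypothetical.

```python
import math


def signed_rank_greater(x, y):
    """One-sided Wilcoxon signed-rank P-value for 'x tends to exceed y'.

    Uses the normal approximation with a tie correction; pairs with a
    zero difference are dropped, as in the classical test.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank |differences|, assigning tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        group_size = j - i + 1
        tie_term += group_size ** 3 - group_size
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_plus - mean) / math.sqrt(variance)
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail normal probability


def fisher_combine_two(p1, p2):
    """Fisher's method for two P-values (chi-square with 4 d.f., closed form)."""
    product = p1 * p2
    if product == 0.0:
        return 0.0
    return product * (1.0 - math.log(product))


# Hypothetical per-codon RPF counts in frame 0 (in-frame) and frames 1 and 2.
f0 = [5, 7, 6, 8, 4, 9, 6, 7, 5, 8]
f1 = [1, 0, 2, 1, 0, 1, 2, 0, 1, 1]
f2 = [0, 1, 1, 0, 1, 2, 0, 1, 0, 1]

p01 = signed_rank_greater(f0, f1)  # is frame 0 enriched over frame 1?
p02 = signed_rank_greater(f0, f2)  # is frame 0 enriched over frame 2?
combined = fisher_combine_two(p01, p02)
print(p01, p02, combined)
```

A small combined P-value indicates that in-frame RPFs consistently outnumber off-frame RPFs along the candidate ORF, which is the evidence of active translation described above.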
1. Environment setup and RiboCode installation
2. Data preparation
3. Trim adapters and remove rRNA contamination
4. Align the clean reads to the genome
5. Size selection of RPFs and identification of their P-sites
6. De novo annotate translating ORFs
7. (Optional) ORF quantification and statistics
8. (Optional) Visualization of the predicted ORFs
9. (Optional) Metagene analysis using RiboMiner
NOTE: Perform the metagene analysis to assess the influence of EIF3E knockdown on the translation of identified annotated ORFs, following the steps below:
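Conceptually, a metagene profile averages normalized ribosome density at each codon position across many ORFs, so that local accumulation (such as a build-up within the first tens of codons) becomes visible. The plain-Python sketch below illustrates that idea only; it is not RiboMiner's implementation, and the function name `metagene_profile`, the mean-normalization scheme, and the toy counts are assumptions made for this example.

```python
def metagene_profile(orf_codon_counts, window):
    """Average mean-normalized codon densities over the first `window` codons.

    orf_codon_counts: list of per-ORF lists of RPF counts per codon.
    ORFs shorter than `window` or without any reads are skipped.
    """
    sums = [0.0] * window
    used = 0
    for counts in orf_codon_counts:
        if len(counts) < window:
            continue
        mean_density = sum(counts) / len(counts)
        if mean_density == 0:
            continue
        # Normalizing by the ORF's own mean keeps highly expressed ORFs
        # from dominating the average.
        for position in range(window):
            sums[position] += counts[position] / mean_density
        used += 1
    if used == 0:
        return [0.0] * window
    return [total / used for total in sums]


# Toy data: the "knockdown" ORFs carry extra reads at the second codon.
control_orfs = [[2, 2, 2, 2], [4, 4, 4, 4]]
knockdown_orfs = [[2, 6, 2, 2], [4, 12, 4, 4]]

control_profile = metagene_profile(control_orfs, 3)
knockdown_profile = metagene_profile(knockdown_orfs, 3)
print(control_profile)
print(knockdown_profile)
```

Comparing the two profiles position by position shows where relative density rises in one condition; in this toy example the knockdown profile peaks at the second codon while the control profile is flat.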
The example ribosome profiling datasets were deposited in the GEO database under the accession number GSE131074. All the files and code used in this protocol are available from Supplemental files 1-4. By applying RiboCode to a set of published ribosome profiling datasets23, we identified novel ORFs actively translated in MCF-10A cells treated with control and EIF3E siRNAs. To select the RPF reads that are most likely bound by the tra...
Ribosome profiling offers an unprecedented opportunity to study the ribosomes' action in cells at a genome scale. Precisely deciphering the information carried by the ribosome profiling data could provide insight into which regions of genes or transcripts are actively translating. This step-by-step protocol provides guidance on how to use RiboCode to analyze ribosome profiling data in detail, including package installation, data preparation, command execution, result explanation, and data visualization. The analysis resu...
The authors have no conflicts of interest to disclose.
The authors would like to acknowledge the computational resources provided by the HPCC platform of Xi'an Jiaotong University. Z.X. gratefully acknowledges the Young Topnotch Talent Support Plan of Xi'an Jiaotong University.
Name | Company | Catalog Number | Comments |
A computer/server running Linux | Any | - | - |
Anaconda or Miniconda | Anaconda | - | Anaconda: https://www.anaconda.com; Miniconda: https://docs.conda.io/en/latest/miniconda.html |
R | R Foundation | - | https://www.r-project.org/ |
RStudio | RStudio | - | https://www.rstudio.com/ |
Copyright © 2025 MyJoVE Corporation. All rights reserved