Summary

Translating ribosomes decode three nucleotides per codon into peptides. Their movement along mRNA, captured by ribosome profiling, produces footprints that exhibit a characteristic triplet periodicity. This protocol describes how to use RiboCode to decipher this prominent feature from ribosome profiling data and identify actively translated open reading frames at the whole-transcriptome level.

Abstract

Identification of open reading frames (ORFs), especially those that encode small peptides and are actively translated under specific physiological contexts, is critical for comprehensive annotation of context-dependent translatomes. Ribosome profiling, a technique for detecting the binding locations and densities of translating ribosomes on RNA, offers an avenue to rapidly discover where translation is occurring at the genome-wide scale. However, efficiently and comprehensively identifying the translating ORFs from ribosome profiling data is not a trivial bioinformatics task. Described here is an easy-to-use package, named RiboCode, designed to search for actively translating ORFs of any size from distorted and ambiguous signals in ribosome profiling data. Taking our previously published dataset as an example, this article provides step-by-step instructions for the entire RiboCode pipeline, from preprocessing of the raw data to interpretation of the final output files. Furthermore, for evaluating the translation rates of the annotated ORFs, procedures for visualization and quantification of ribosome densities on each ORF are also described in detail. In summary, the present article provides useful and timely guidance for research fields related to translation, small ORFs, and peptides.

Introduction

Recently, a growing body of studies has revealed widespread production of peptides translated from ORFs in coding genes and in genes previously annotated as noncoding, such as long noncoding RNAs (lncRNAs)1,2,3,4,5,6,7,8. These translated ORFs are regulated or induced by cells to respond to environmental changes, stress, and cell differentiation1,8,9,10,11,12,13. The translation products of some ORFs have been demonstrated to play important regulatory roles in diverse biological processes in development and physiology. For example, Chng et al.14 discovered a peptide hormone named Elabela (Ela, also known as Apela/Ende/Toddler), which is critical for cardiovascular development. Pauli et al. suggested that Ela also acts as a mitogen that promotes cell migration in the early fish embryo15. Magny et al. reported two micropeptides of fewer than 30 amino acids that regulate calcium transport and affect regular muscle contraction in the Drosophila heart10.

It remains unclear how many such peptides are encoded by the genome and whether they are biologically relevant. Therefore, systematic identification of these potentially coding ORFs is highly desirable. However, directly determining the products of these ORFs (i.e., protein or peptide) using traditional approaches such as evolutionary conservation16,17 and mass spectrometry18,19 is challenging because the detection efficiency of both approaches is dependent on the length, abundance, and amino acid composition of the produced proteins or peptides. The advent of ribosome profiling, a technique for identifying the ribosome occupancy on mRNAs at nucleotide resolution, has provided a precise way to evaluate the coding potential of different transcripts3,20,21, irrespective of their length and composition. An important and frequently used feature for identifying actively translating ORFs using ribosome profiling is the three-nucleotide (3-nt) periodicity of the ribosome's footprints on mRNA from the start codon to the stop codon. However, ribosome profiling data often have several issues, including low and sparse sequencing reads along ORFs, high sequencing noise, and ribosomal RNA (rRNA) contaminations. Thus, the distorted and ambiguous signals generated by such data weaken the 3-nt periodicity patterns of ribosomes' footprints on mRNA, which ultimately makes the identification of the high-confidence translated ORFs difficult.

A package named "RiboCode" adapted a modified Wilcoxon-signed-rank test and P-value integration strategy to examine whether the ORF has significantly more in-frame ribosome-protected fragments (RPFs) than off-frame RPFs22. It was demonstrated to be highly efficient, sensitive, and accurate for de novo annotation of the translatome in simulated and real ribosome profiling data. Here, we describe how to use this tool to detect the potential translating ORFs from the raw ribosome profiling sequencing datasets generated by the previous study23. These datasets had been used to explore the function of EIF3 subunit "E" (EIF3E) in translation by comparing the ribosome occupancy profiles of MCF-10A cells transfected with control (si-Ctrl) and EIF3E (si-eIF3e) small-interfering RNAs (siRNAs). By applying RiboCode to these example datasets, we detected 5,633 novel ORFs potentially encoding small peptides or proteins. These ORFs were categorized into various types based on their locations relative to the coding regions, including upstream ORFs (uORFs), downstream ORFs (dORFs), overlapped ORFs, ORFs from novel protein-coding genes (novel PCGs), and ORFs from novel nonprotein-coding genes (novel NonPCGs). The RPF read densities on uORFs were significantly increased in EIF3E-deficient cells compared to control cells, which might be at least partially caused by the enrichment of actively translating ribosomes. The localized ribosome accumulation in the region from the 25th to 75th codon of EIF3E-deficient cells indicated a blockage of translation elongation in the early stage. This protocol also shows how to visualize the RPF density of the desired region for examining the 3-nt periodicity patterns of ribosome footprints on identified ORFs. These analyses demonstrate the powerful role of RiboCode in identifying translating ORFs and studying the regulation of translation.
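To make the idea of in-frame versus off-frame RPFs concrete, the following minimal Python sketch tallies the fraction of P-sites that fall in each of the three reading frames of a candidate ORF. It is only an illustration of 3-nt periodicity, not RiboCode's modified Wilcoxon signed-rank test; the function name, the 0-based coordinates, and the example positions are all hypothetical.

```python
from collections import Counter


def frame_fractions(psite_positions, orf_start):
    """Return the fraction of P-sites in frames 0, 1, and 2 of an ORF.

    psite_positions: 0-based transcript coordinates of read P-sites (assumed input).
    orf_start: 0-based coordinate of the first nucleotide of the start codon.
    """
    counts = Counter((pos - orf_start) % 3 for pos in psite_positions)
    total = sum(counts.values())
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(counts[frame] / total for frame in range(3))


# Hypothetical P-sites for an ORF whose start codon begins at position 30.
example_psites = [30, 33, 36, 36, 39, 40, 42, 45, 47, 48]
print(frame_fractions(example_psites, orf_start=30))
```

An actively translated ORF is expected to show a clear excess of frame 0 over frames 1 and 2, whereas noisy or sparse data tend to flatten the three fractions.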

Protocol

1. Environment setup and RiboCode installation

  1. Open a Linux terminal window and create a conda environment:
    conda create -n RiboCode python=3.8
  2. Switch to the created environment and install RiboCode and dependencies:
    conda activate RiboCode
    conda install -c bioconda ribocode ribominer sra-tools fastx_toolkit cutadapt bowtie star samtools

2. Data preparation

  1. Get genome reference files.
    1. For the reference sequence, go to the Ensembl website at https://www.ensembl.org/index.html, click the top menu Download and then the left-side menu FTP Download. In the presented table, click FASTA in the column DNA (FASTA) and the row where Species is Human. On the opened page, copy the link of Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz, then download and unzip it in the terminal:
      wget -c \
      http://ftp.ensembl.org/pub/release-104/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
      gzip -d Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
    2. For reference annotation, right-click GTF in the column Gene sets in the last-opened web page. Copy the link of Homo_sapiens.GRCh38.104.gtf.gz and download it using:
      wget -c \
      http://ftp.ensembl.org/pub/release-104/gtf/homo_sapiens/Homo_sapiens.GRCh38.104.gtf.gz
      gzip -d Homo_sapiens.GRCh38.104.gtf.gz

      NOTE: It is recommended to get the GTF file from the Ensembl website because it contains genome annotations organized in a three-level hierarchy, i.e., each gene contains transcripts, which in turn contain exons and optional translations (e.g., coding sequences [CDS], translation start site, translation end site). When gene or transcript annotations are missing, as can happen with a GTF file obtained from UCSC or NCBI, use GTFupdate to generate an updated GTF with complete parent-child hierarchy annotations: GTFupdate original.gtf > updated.gtf. For an annotation file in the .gff format, use the AGAT toolkit24 or any other tool to convert it to the .gtf format.
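Before deciding whether GTFupdate is needed, it can help to check which feature levels a GTF file actually contains. The Python sketch below counts the feature types in column 3 and reports whether gene, transcript, and exon records are all present. The helper names and the two-line example are hypothetical, and the check is a rough heuristic rather than a validation of parent-child links.

```python
from collections import Counter


def count_gtf_features(lines):
    """Count feature types (GTF column 3), skipping comments and blank lines."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 9:  # a GTF record has nine tab-separated columns
            counts[fields[2]] += 1
    return counts


def has_full_hierarchy(counts):
    """Return True when gene, transcript, and exon records are all present."""
    return all(counts[feature] > 0 for feature in ("gene", "transcript", "exon"))


# Hypothetical excerpt that lacks gene-level records.
example = [
    '1\tsrc\ttranscript\t100\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
    '1\tsrc\texon\t100\t500\t.\t+\t.\tgene_id "G1"; transcript_id "T1";',
]
print(has_full_hierarchy(count_gtf_features(example)))
```

For a real file, the same functions can be applied to an open file handle, e.g., count_gtf_features(open("Homo_sapiens.GRCh38.104.gtf")).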
  2. Get rRNA sequences.
    1. Open UCSC Genome Browser at https://genome.ucsc.edu and click Tools | Table Browser in the dropdown list.
    2. On the opened page, specify Mammal for clade, Human for genome, All Tables for group, rmask for table, and genome for region. For filter, click Create to go to a new page and set repClass as does match rRNA.
    3. Click Submit and then set the output format to sequence and output filename as hg38_rRNA.fa. Finally, click Get output | Get sequence to retrieve the rRNA sequence.
  3. Get ribosome profiling datasets from Sequence Read Archive (SRA).
    1. Download the replicate samples of si-eIF3e treatment group and rename them:
      fastq-dump SRR9047190 SRR9047191 SRR9047192
      mv SRR9047190.fastq si-eIF3e-1.fastq
      mv SRR9047191.fastq si-eIF3e-2.fastq
      mv SRR9047192.fastq si-eIF3e-3.fastq
    2. Download the replicate samples of control group and rename them:
      fastq-dump SRR9047193 SRR9047194 SRR9047195
      mv SRR9047193.fastq si-Ctrl-1.fastq
      mv SRR9047194.fastq si-Ctrl-2.fastq
      mv SRR9047195.fastq si-Ctrl-3.fastq
      NOTE: The SRA accession IDs for these example datasets were obtained from Gene Expression Omnibus (GEO) website25 by searching for GSE131074.

3. Trim adapters and remove rRNA contamination

  1. (Optional) Remove adapters from the sequencing data. Skip this step if the adapter sequences have already been trimmed, as in this case. Otherwise, use cutadapt to trim the adapters from reads.
    for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
    do
    cutadapt -m 15 --match-read-wildcards -a CTGTAGGCACCATCAAT \
    -o ${i}_trimmed.fastq ${i}.fastq
    done
    NOTE: The adapter sequence after the -a parameter varies depending on the cDNA library preparation. Reads shorter than 15 nt (given by -m) are discarded because ribosome-protected fragments are usually longer than this.
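The effect of the -a and -m options can be pictured with a toy Python function that removes an exact 3' adapter match and then applies the minimum-length filter. This is a simplified sketch under stated assumptions: unlike cutadapt, it ignores mismatches, partial adapter matches at the read end, and quality values, and the function name and example reads are hypothetical.

```python
def trim_adapter(read, adapter="CTGTAGGCACCATCAAT", min_length=15):
    """Trim an exact 3' adapter match and apply a minimum-length filter.

    Returns the trimmed read, or None when it is shorter than min_length.
    """
    index = read.find(adapter)
    trimmed = read if index == -1 else read[:index]
    return trimmed if len(trimmed) >= min_length else None


# Hypothetical 29-nt footprint followed by the adapter used in this protocol.
footprint = "ACGT" * 7 + "A"
print(trim_adapter(footprint + "CTGTAGGCACCATCAAT"))

# A hypothetical 8-nt insert is discarded because it is shorter than 15 nt.
print(trim_adapter("ACGTACGT" + "CTGTAGGCACCATCAAT"))
```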
  2. Remove rRNA contamination using the following steps:
    1. Index rRNA reference sequences:
      bowtie-build -f hg38_rRNA.fa hg38_rRNA
    2. Align the reads to rRNA reference to rule out the reads originating from rRNA:
      for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
      do
      bowtie -n 0 -y -a --norc --best --strata -S -p 4 -l 15 \
      --un=./${i}_noncontam.fastq hg38_rRNA -q ${i}.fastq ${i}.aln
      done
      NOTE: -p specifies the number of threads used to run the task in parallel. Because RPF reads are relatively short, the other arguments (e.g., -n, -y, -a, --norc, --best, --strata, and -l) should be specified to ensure that only the best alignments are reported. For more details, refer to the Bowtie website26.

4. Align the clean reads to the genome

  1. Create a genome index.
    mkdir STAR_hg38_genome
    STAR --runThreadN 8 --runMode genomeGenerate \
    --genomeDir ./STAR_hg38_genome \
    --genomeFastaFiles Homo_sapiens.GRCh38.dna.primary_assembly.fa \
    --sjdbGTFfile Homo_sapiens.GRCh38.104.gtf
  2. Align the clean reads (no rRNA contamination) to the created reference.
    for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
    do
    STAR --runThreadN 8 --outFilterType Normal --outWigType wiggle \
    --outWigStrand Stranded --outWigNorm RPM --outFilterMismatchNmax 1 \
    --outFilterMultimapNmax 1 --genomeDir STAR_hg38_genome \
    --readFilesIn ${i}_noncontam.fastq --outFileNamePrefix ${i}. \
    --outSAMtype BAM SortedByCoordinate --quantMode TranscriptomeSAM GeneCounts \
    --outSAMattributes All
    done
    NOTE: An untemplated nucleotide is frequently added to the 5' end of each read by the reverse transcriptase27, which will be efficiently trimmed off by STAR as it performs soft-clipping by default. The parameters for STAR are described in STAR manual28.
  3. Sort and index alignment files.
    for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
    do
    samtools sort -T ${i}.Aligned.toTranscriptome.out.sorted \
    -o ${i}.Aligned.toTranscriptome.out.sorted.bam \
    ${i}.Aligned.toTranscriptome.out.bam
    samtools index ${i}.Aligned.toTranscriptome.out.sorted.bam
    samtools index ${i}.Aligned.sortedByCoord.out.bam
    done

5. Size selection of RPFs and identification of their P-sites

  1. Prepare the transcript annotations.
    prepare_transcripts -g Homo_sapiens.GRCh38.104.gtf \
    -f Homo_sapiens.GRCh38.dna.primary_assembly.fa -o RiboCode_annot
    NOTE: This command collects required information of mRNA transcripts from the GTF file and extracts the sequences for all mRNA transcripts from the FASTA file (each transcript is assembled by merging the exons according to the structures defined in the GTF file).
  2. Select RPFs of specific lengths and identify their P-site positions.
    for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
    do
    metaplots -a RiboCode_annot -r ${i}.Aligned.toTranscriptome.out.bam \
    -o ${i} -f0_percent 0.35 -pv1 0.001 -pv2 0.001
    done
    NOTE: This command plots the aggregate profiles of the 5' end of the aligned reads of each length around annotated translation start (or stop) codons. The read length-dependent P-site can be manually determined by examining the distribution plots (e.g., Figure 1B) of offset distances between 5' ends of the major reads and the start codon. RiboCode also generates a configuration file for each sample, in which the P-site positions of reads displaying significant 3-nt periodicity patterns are automatically determined. The parameters -f0_percent, -pv1, and -pv2 define the proportion threshold and p-value cutoffs for selecting the RPF reads enriched in the reading frame. In this example, the +12, +13, and +13 nucleotides from the 5' end of the 29, 30, and 31 nt reads are manually defined in each configuration file.
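The read length-dependent P-site assignment described in this note can be sketched in a few lines of Python. The offsets (+12, +13, and +13 for 29, 30, and 31 nt reads) are taken from the example above; the function name and the coordinate convention (a 0-based 5' end plus the offset gives the P-site position) are assumptions for illustration and do not reproduce RiboCode's internal code.

```python
# Read length -> P-site offset from the 5' end, as chosen in this example.
PSITE_OFFSETS = {29: 12, 30: 13, 31: 13}


def psite_position(five_prime_pos, read_length, offsets=PSITE_OFFSETS):
    """Return the P-site coordinate of a read, or None if its length is not selected."""
    offset = offsets.get(read_length)
    if offset is None:
        return None  # reads of other lengths are excluded from the analysis
    return five_prime_pos + offset


print(psite_position(100, 29))  # 29-nt read whose 5' end is at position 100
print(psite_position(100, 27))  # 27-nt reads are not among the selected lengths
```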
  3. Edit the configuration files for each sample and merge them.
    NOTE: To generate a consensus set of unique ORFs and ensure sufficient coverage of reads to perform subsequent analysis, the selected reads of all samples in the previous step are merged. The reads of specific lengths defined in merged_config.txt file (Supplemental File 1) and their P-site information are used for evaluating the translation potential of ORFs in the next step.

6. De novo annotate translating ORFs

  1. Run RiboCode.
    RiboCode -a RiboCode_annot -c merged_config.txt -l yes -g \
    -o RiboCode_ORFs_result -s ATG -m 5 -A CTG,GTG,TTG

    Where the important parameters of this command are as follows:
    -c, configuration file containing the path of input files and the information of selected reads and their P-sites.
    -l, for transcripts having multiple start codons upstream of the stop codons, whether the longest ORFs (the region from the most distal start codon to stop codon) are used for evaluating their translation potential. If set to no, the starting codons will be automatically determined.
    -s, the canonical start codon(s) used for ORFs identification.
    -A, (optionally) the noncanonical start codons (e.g., CTG, GTG, and TTG for human) used for ORF identification, which may differ in mitochondria or nucleus of other species29.
    -m, the minimum length (i.e., amino acids) of ORFs.
    -o, the prefix of output filename containing the details of predicted ORFs (Supplemental File 2).
    -g and -b, output the predicted ORFs to gtf or bed format, respectively.
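To illustrate how the start codon, minimum length, and longest-ORF options interact, the following Python sketch enumerates candidate ORFs on the forward strand of a transcript sequence. It is a simplified stand-in, not RiboCode's search: with longest=False it simply reports every in-frame start, whereas RiboCode determines start codons automatically from the footprint signal. The function name, the stop codon set (standard genetic code), and the example sequence are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)


def find_orfs(seq, start_codons=("ATG",), min_aa=5, longest=True):
    """Return ORFs as (start, end) 0-based half-open intervals, stop codon included.

    min_aa counts codons between the start codon (inclusive) and the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        starts = []  # in-frame start codons seen since the previous stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                candidates = starts[:1] if longest else starts
                for start in candidates:
                    if (i - start) // 3 >= min_aa:
                        orfs.append((start, i + 3))
                starts = []
            elif codon in start_codons:
                starts.append(i)
    return sorted(orfs)


# Hypothetical transcript with two in-frame ATG codons upstream of one stop codon.
example_seq = "CC" + "ATG" + "GCT" * 2 + "ATG" + "GCT" * 4 + "TAA" + "GG"
print(find_orfs(example_seq))                 # most upstream start only
print(find_orfs(example_seq, longest=False))  # every in-frame start
```

Passing start_codons=("ATG", "CTG", "GTG", "TTG") mimics the combination of -s ATG and -A CTG,GTG,TTG used in the command above.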

7. (Optional) ORF quantification and statistics

  1. Count RPF reads in each ORF.
    for i in si-Ctrl-1 si-Ctrl-2 si-Ctrl-3 si-eIF3e-1 si-eIF3e-2 si-eIF3e-3
    do
    ORFcount -g RiboCode_ORFs_result_collapsed.gtf \
    -r ${i}.Aligned.sortedByCoord.out.bam -f 15 -l 5 -m 25 -M 35 \
    -o ${i}_ORF.counts -s yes -c intersection-strict
    done
    NOTE: To exclude ribosomes that may accumulate around the starts and ends of ORFs, reads mapped to the first 15 codons (specified by -f) and the last 5 codons (specified by -l) are not counted. Optionally, the lengths of counted RPFs are restricted to the range from 25 to 35 nt (common sizes of RPFs).
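The counting rule in this note can be expressed as a short Python sketch. It assumes that each read has already been reduced to a codon index relative to the start codon and a read length, which is a simplification: ORFcount itself works on genomic alignments and an overlap mode such as intersection-strict. The function name and the example records are hypothetical.

```python
def count_orf_reads(read_records, orf_length_codons, skip_first=15, skip_last=5,
                    min_len=25, max_len=35):
    """Count reads inside an ORF while ignoring its edges and unusual read lengths.

    read_records: iterable of (codon_index, read_length) pairs, with codon_index
    0-based relative to the start codon (assumed input format).
    """
    first_counted = skip_first
    last_counted = orf_length_codons - skip_last  # exclusive upper bound
    return sum(
        1
        for codon_index, read_length in read_records
        if first_counted <= codon_index < last_counted
        and min_len <= read_length <= max_len
    )


# Hypothetical reads on a 100-codon ORF.
example_reads = [(0, 29), (14, 30), (15, 29), (50, 24), (50, 30),
                 (94, 31), (95, 29), (99, 30)]
print(count_orf_reads(example_reads, orf_length_codons=100))
```

In this example only the reads at codons 15, 50 (30 nt), and 94 are counted; the others fall in the excluded edges or outside the 25 to 35 nt range.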
  2. Calculate basic statistics of the detected ORFs using RiboCode:
    Rscript RiboCode_utils.R
    NOTE: RiboCode_utils.R (Supplemental File 3) provides a series of statistics for the RiboCode output, e.g., counting the number of identified ORFs, viewing the distribution of ORF lengths, and calculating the normalized RPF densities (i.e., RPKM, reads per kilobase per million mapped reads).
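For reference, the RPKM normalization mentioned in this note follows the usual definition: read count divided by ORF length in kilobases and by the number of mapped reads in millions. The Python sketch below shows this arithmetic only; it is not taken from RiboCode_utils.R, and the function name and example numbers are hypothetical.

```python
def rpkm(read_count, orf_length_nt, total_mapped_reads):
    """Reads per kilobase of ORF per million mapped reads."""
    if orf_length_nt <= 0 or total_mapped_reads <= 0:
        raise ValueError("ORF length and total mapped reads must be positive")
    return read_count * 1e9 / (orf_length_nt * total_mapped_reads)


# Hypothetical ORF: 100 reads on a 1,000-nt ORF in a library of 1 million mapped reads.
print(rpkm(100, 1000, 1_000_000))
```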

8. (Optional) Visualization of the predicted ORFs

  1. Obtain the relative positions of the start and stop codons for the desired ORF (e.g., ENSG00000100902_35292349_35292552_67) on its transcript from RiboCode_ORFs_result_collapsed.txt (Supplemental file 3). Then, plot the density of RPF reads in the ORF:
    plot_orf_density -a RiboCode_annot -c merged_config.txt -t ENST00000622405 \
    -s 33 -e 236 --start-codon ATG -o ENSG00000100902_35292349_35292552_67
    Where -s and -e specify the translation start and stop positions of the ORF to be plotted, --start-codon defines the start codon of the ORF, which appears in the figure title, and -o defines the prefix of the output file name.
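As a quick sanity check on the values passed to -s and -e, the ORF length can be derived from the two positions. The Python sketch below assumes that both positions are 1-based and inclusive and that the interval includes the stop codon; these conventions are assumptions, and the function name is hypothetical.

```python
def orf_lengths(start, stop):
    """Return (nucleotides, codons including stop, amino acids) for an ORF interval.

    Assumes 1-based inclusive transcript positions with the stop codon included.
    """
    nucleotides = stop - start + 1
    if nucleotides % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = nucleotides // 3
    return nucleotides, codons, codons - 1


print(orf_lengths(33, 236))
```

Under these assumptions, the example ORF spans 204 nt, i.e., 68 codons including the stop codon, which corresponds to a 67-amino-acid product. The genomic coordinates in the example ORF name (35292349 to 35292552) span the same 204 nt.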

9. (Optional) Metagene analysis using RiboMiner

NOTE: Perform the metagene analysis to assess the influence of EIF3E knockdown on the translation of identified annotated ORFs, following the steps below:

  1. Generate transcript annotations for RiboMiner, which extracts the longest transcript for each gene based on the annotation file generated by RiboCode (step 5.1).
    OutputTranscriptInfo -c RiboCode_annot/transcripts_cds.txt \
    -g Homo_sapiens.GRCh38.104.gtf -f RiboCode_annot/transcripts_sequence.fa \
    -o longest.transcripts.info.txt -O all.transcripts.info.txt
  2. Prepare the configuration file for RiboMiner. Copy the configuration file generated by the metaplots command of RiboCode (step 5.2) and rename it "RiboMiner_config.txt". Then, modify it according to the format shown in Supplemental file 4.
  3. Perform metagene analyses using RiboMiner.
    1. Use MetageneAnalysis to generate an aggregate and averaged profile of RPFs' densities across transcripts.
      MetageneAnalysis -f RiboMiner_config.txt -c longest.transcripts.info.txt \
      -o MA_normed -U codon -M RPKM -u 100 -d 400 -l 100 -n 10 -m 1 -e 5 --norm yes \
      -y 100 --type UTR
      Where the important parameters are: --type, whether to analyze CDS or UTR regions; --norm, whether to normalize the read density; -y, the number of codons used for each transcript; -U, whether to plot RPF density at the codon level or the nt level; -u and -d, the range of the analyzed region relative to the start codon or stop codon; -l, the minimum length (i.e., the number of codons) of the CDS; -M, the mode for transcript filtering, either counts or RPKM; -n, the minimum counts or RPKM in the CDS for analysis; -m, the minimum counts or RPKM of the CDS in the normalized region; -e, the number of codons excluded from the normalized region.
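The notion of an aggregate and averaged profile can be illustrated with a small Python sketch that normalizes each transcript's per-codon density by its own mean and then averages the normalized values position by position. This is a deliberately simplified picture: RiboMiner defines the normalization region and transcript filters through options such as -y, -e, -n, and -m, which are not modeled here. The function name and the example profiles are hypothetical.

```python
def metagene_profile(densities, window=100):
    """Average mean-normalized per-codon densities across transcripts.

    densities: list of per-codon read-count lists, each starting at the start codon.
    Transcripts shorter than `window` codons or without reads are skipped.
    """
    totals = [0.0] * window
    used = 0
    for profile in densities:
        if len(profile) < window:
            continue
        mean = sum(profile) / len(profile)
        if mean == 0:
            continue
        for i in range(window):
            totals[i] += profile[i] / mean
        used += 1
    return [value / used for value in totals] if used else totals


# Hypothetical toy profiles with a 3-codon window.
example_profiles = [[2, 4, 6], [1, 1, 1], [0, 0, 0], [5]]
print(metagene_profile(example_profiles, window=3))
```

Normalizing by each transcript's mean keeps highly expressed transcripts from dominating the average, so local features such as ribosome accumulation in the early codons remain visible.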
    2. Generate a set of pdf files for comparing the ribosome occupancies on mRNA in control cells and EIF3E-deficient cells.
      PlotMetageneAnalysis -i MA_normed_dataframe.txt -o MA_normed \
      -g si-Ctrl,si-eIF3e \
      -r si-Ctrl-1,si-Ctrl-2,si-Ctrl-3__si-eIF3e-1,si-eIF3e-2,si-eIF3e-3 \
      -u 100 -d 400 --mode mean
      NOTE: PlotMetageneAnalysis generates the set of pdf files. Details about the usage of MetageneAnalysis and PlotMetageneAnalysis are available at RiboMiner website30.
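The -g and -r values in the command above appear to pair each group name with its replicates, with groups separated by commas and replicate sets separated by a double underscore. The Python sketch below parses the two strings under that assumption, which is inferred from the command itself rather than from the RiboMiner documentation; the function name is hypothetical.

```python
def parse_groups(group_arg, replicate_arg):
    """Map each group name to its list of replicate sample names."""
    groups = group_arg.split(",")
    replicate_sets = [chunk.split(",") for chunk in replicate_arg.split("__")]
    if len(groups) != len(replicate_sets):
        raise ValueError("number of groups and replicate sets differ")
    return dict(zip(groups, replicate_sets))


print(parse_groups(
    "si-Ctrl,si-eIF3e",
    "si-Ctrl-1,si-Ctrl-2,si-Ctrl-3__si-eIF3e-1,si-eIF3e-2,si-eIF3e-3",
))
```

A check of this kind can catch a mismatch between group names and replicate sets before plotting.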

Results

The example ribosome profiling datasets were deposited in the GEO database under the accession number GSE131074. All the files and codes used in this protocol are available from Supplemental files 1-4. By applying RiboCode to a set of published ribosome profiling datasets23, we identified the novel ORFs actively translated in MCF-10A cells treated with control and EIF3E siRNAs. To select the RPF reads that are most likely bound by the tra...

Discussion

Ribosome profiling offers an unprecedented opportunity to study the ribosomes' action in cells at a genome scale. Precisely deciphering the information carried by the ribosome profiling data could provide insight into which regions of genes or transcripts are actively translating. This step-by-step protocol provides guidance on how to use RiboCode to analyze ribosome profiling data in detail, including package installation, data preparation, command execution, result explanation, and data visualization. The analysis resu...

Disclosures

The authors have no conflicts of interest to disclose.

Acknowledgements

The authors would like to acknowledge the support from the computational resources provided by the HPCC platform of Xi'an Jiaotong University. Z.X. gratefully thanks the Young Topnotch Talent Support Plan of Xi'an Jiaotong University.

Materials

Name | Company | Catalog Number | Comments
A computer/server running Linux | Any | - | -
Anaconda or Miniconda | Anaconda | - | Anaconda: https://www.anaconda.com; Miniconda: https://docs.conda.io/en/latest/miniconda.html
R | R Foundation | - | https://www.r-project.org/
RStudio | RStudio | - | https://www.rstudio.com/

References

  1. Eisenberg, A. R., et al. Translation Initiation Site Profiling Reveals Widespread Synthesis of Non-AUG-Initiated Protein Isoforms in Yeast. Cell Systems. 11 (2), 145-160 (2020).
  2. Spealman, P., et al. Conserved non-AUG uORFs revealed by a novel regression analysis of ribosome profiling data. Genome Research. 28 (2), 214-222 (2018).
  3. Ingolia, N. T., Lareau, L. F., Weissman, J. S. Ribosome profiling of mouse embryonic stem cells reveals the complexity and dynamics of mammalian proteomes. Cell. 147 (4), 789-802 (2011).
  4. Bazzini, A. A., et al. Identification of small ORFs in vertebrates using ribosome footprinting and evolutionary conservation. The EMBO Journal. 33 (9), 981-993 (2014).
  5. Ingolia, N. T., et al. Ribosome profiling reveals pervasive translation outside of annotated protein-coding genes. Cell Reports. 8 (5), 1365-1379 (2014).
  6. Chew, G. L., Pauli, A., Schier, A. F. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish. Nature Communications. 7, 11663 (2016).
  7. Zhang, H., et al. Determinants of genome-wide distribution and evolution of uORFs in eukaryotes. Nature Communications. 12 (1), 1076 (2021).
  8. Guenther, U. P., et al. The helicase Ded1p controls use of near-cognate translation initiation codons in 5' UTRs. Nature. 559 (7712), 130-134 (2018).
  9. Goldsmith, J., et al. Ribosome profiling reveals a functional role for autophagy in mRNA translational control. Communications Biology. 3 (1), 388 (2020).
  10. Magny, E. G., et al. Conserved regulation of cardiac calcium uptake by peptides encoded in small open reading frames. Science. 341 (6150), 1116-1120 (2013).
  11. Stumpf, C. R., Moreno, M. V., Olshen, A. B., Taylor, B. S., Ruggero, D. The translational landscape of the mammalian cell cycle. Molecular Cell. 52 (4), 574-582 (2013).
  12. Gerashchenko, M. V., Lobanov, A. V., Gladyshev, V. N. Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proceedings of the National Academy of Sciences of the United States of America. 109 (43), 17394-17399 (2012).
  13. Andreev, D. E., et al. Oxygen and glucose deprivation induces widespread alterations in mRNA translation within 20 minutes. Genome Biology. 16, 90 (2015).
  14. Chng, S. C., Ho, L., Tian, J., Reversade, B. ELABELA: a hormone essential for heart development signals via the apelin receptor. Developmental Cell. 27 (6), 672-680 (2013).
  15. Pauli, A., et al. Toddler: an embryonic signal that promotes cell movement via Apelin receptors. Science. 343 (6172), 1248636 (2014).
  16. Stark, A., et al. Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature. 450 (7167), 219-232 (2007).
  17. Lin, M. F., Jungreis, I., Kellis, M. PhyloCSF: a comparative genomics method to distinguish protein coding and non-coding regions. Bioinformatics. 27 (13), 275-282 (2011).
  18. Slavoff, S. A., et al. Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nature Chemical Biology. 9 (1), 59-64 (2013).
  19. Schwaid, A. G., et al. Chemoproteomic discovery of cysteine-containing human short open reading frames. Journal of the American Chemical Society. 135 (45), 16750-16753 (2013).
  20. Ingolia, N. T., Brar, G. A., Rouskin, S., McGeachy, A. M., Weissman, J. S. Genome-wide annotation and quantitation of translation by ribosome profiling. Current Protocols in Molecular Biology. , 1-19 (2013).
  21. Ingolia, N. T., Ghaemmaghami, S., Newman, J. R., Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 324 (5924), 218-223 (2009).
  22. Xiao, Z., et al. De novo annotation and characterization of the translatome with ribosome profiling data. Nucleic Acids Research. 46 (10), 61 (2018).
  23. Lin, Y., et al. eIF3 Associates with 80S Ribosomes to Promote Translation Elongation, Mitochondrial Homeostasis, and Muscle Health. Molecular Cell. 79 (4), 575-587 (2020).
  24. AGAT: Another Gff Analysis Toolkit to handle annotations in any GTF/GFF format. Available from: https://agat.readthedocs.io/en/latest/gff_to_gtf.html (2020).
  25. Gene Expression Omnibus. Available from: https://www.ncbi.nlm.nih.gov/geo (2002).
  26. Ingolia, N. T., Brar, G. A., Rouskin, S., McGeachy, A. M., Weissman, J. S. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nature Protocols. 7 (8), 1534-1550 (2012).
  27. STAR manual. Available from: https://github.com/alexdobin/STAR/blob/master/doc/STARmanual.pdf (2022).
  28. The genetic codes. Available from: https://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi (2019).
  29. RiboMiner. Available from: https://github.com/xryanglab/RiboMiner (2020).
  30. Ingolia, N. T., Hussmann, J. A., Weissman, J. S. Ribosome profiling: global views of translation. Cold Spring Harbor Perspectives in Biology. 11 (5), 032698 (2018).
  31. Lee, S., et al. Global mapping of translation initiation sites in mammalian cells at single-nucleotide resolution. Proceedings of the National Academy of Sciences of the United States of America. 109 (37), 2424-2432 (2012).
  32. Gao, X., et al. Quantitative profiling of initiating ribosomes in vivo. Nature Methods. 12 (2), 147-153 (2015).
  33. Spealman, P., Naik, A., McManus, J. uORF-seqr: A Machine Learning-Based approach to the identification of upstream open reading frames in yeast. Methods in Molecular Biology. 2252, 313-329 (2021).
  34. RiboCode. Available from: https://github.com/xryanglab/RiboCode (2018).
  35. Sharma, P., Wu, J., Nilges, B. S., Leidel, S. A. Humans and other commonly used model organisms are resistant to cycloheximide-mediated biases in ribosome profiling experiments. Nature Communications. 12 (1), 5094 (2021).

Copyright © 2025 MyJoVE Corporation. All rights reserved