Method Article
Sequence specificity is critical for gene regulation, and regulatory proteins that recognize specific sequences are central to it. Defining functional binding sites for such proteins is a challenging biological problem. An iterative approach for identifying the binding site of an RNA-binding protein is described here and is applicable to all RNA-binding proteins.
Gene regulation plays an important role in all cells. Transcriptional, post-transcriptional (or RNA processing), translational, and post-translational steps are used to regulate specific genes. Sequence-specific nucleic acid-binding proteins target specific sequences to control spatial or temporal gene expression. The binding sites in nucleic acids are typically characterized by mutational analysis. However, numerous proteins of interest have no known binding site for such characterization. Here we describe an approach to identify previously unknown binding sites for RNA-binding proteins. It involves iterative selection and amplification of sequences starting with a randomized sequence pool. Following several rounds of these steps (transcription, binding, and amplification), the enriched sequences are sequenced to identify one or more preferred binding sites. Success of this approach is monitored using in vitro binding assays. Subsequently, in vitro and in vivo functional assays can be used to assess the biological relevance of the selected sequences. This approach allows identification and characterization of previously unknown binding sites for any RNA-binding protein for which an assay to separate protein-bound and unbound RNAs exists.
In cell biology, gene regulation plays a central role. At one or multiple steps along the gene expression pathway, genes have the potential to be regulated. These steps include transcription (initiation, elongation, and termination) as well as splicing, polyadenylation or 3’ end formation, RNA export, mRNA translation, and decay/localization of primary transcripts. At these steps, nucleic acid-binding proteins modulate gene regulation. Identification of binding sites for such proteins is an important aspect of studying gene control. Mutational analysis and phylogenetic sequence comparison have been used to discover regulatory sequences or protein-binding sites in nucleic acids, such as promoters, splice sites, polyadenylation elements, and translational signals1,2,3,4.
Pre-mRNA splicing is an integral step during gene expression and regulation. The majority of mammalian genes, including those in humans, have introns. A large fraction of these transcripts is alternatively spliced, producing multiple mRNA and protein isoforms from the same gene or primary transcript. These isoforms have cell-specific and developmental roles in cell biology. The 5’ splice site, the branch-point, and the polypyrimidine-tract/3’ splice site are critical splicing signals that are subject to regulation. In negative regulation, an otherwise strong splice site is repressed, whereas in positive regulation an otherwise weak splice site is activated. A combination of these events produces a plethora of functionally distinct isoforms. RNA-binding proteins play key roles in these alternative splicing events.
Numerous proteins are known whose binding site(s) or RNA targets remain to be identified5, 6. Linking regulatory proteins to their downstream biological targets or sequences is often a complex process. For such proteins, identification of their target RNA or binding site is an important step in defining their biological functions. Once a binding site is identified, it can be further characterized using standard molecular and biochemical analyses.
The approach described here has two advantages. First, it can identify a previously unknown binding site for a protein of interest. Second, it simultaneously provides the equivalent of saturation mutagenesis, which would otherwise be labor intensive, and so yields comparable information about sequence requirements within the binding site. Thus, it offers a quicker, easier, and less costly tool to identify protein binding sites in RNA. Originally, this approach (SELEX or Systematic Evolution of Ligands by EXponential enrichment) was used to characterize the binding site for the bacteriophage T4 DNA polymerase (gene 43 protein), which overlaps with the ribosome binding site in its own mRNA. The binding site contains an 8-base loop sequence, representing 65,536 (4^8) randomized variants for analysis7. The approach was also independently used to show that specific binding sites or aptamers for different dyes can be selected from a pool of approximately 10^13 sequences8. In fact, this approach has been broadly used in many different contexts to identify aptamers (RNA or DNA sequences) for binding numerous ligands, such as proteins, small molecules, and cells, and for catalysis9. As an example, an aptamer can discriminate between two xanthine derivatives, caffeine and theophylline, which differ by the presence of one methyl group in caffeine10. We have extensively used this approach (SELEX or iterative selection-amplification) to study how RNA-binding proteins function in splicing or splicing regulation11, which will be the basis for the discussion below.
The random library: We used a random library of 31 nucleotides. The length consideration for the random library was loosely based on the idea that the general splicing factor U2AF65 binds to a sequence between the branch-point sequence and the 3’ splice site. On average, the spacing between these splicing signals in metazoans is in the range of 20 to 40 nucleotides. Another protein, Sex-lethal, was known to bind to a poorly characterized regulatory sequence near the 3’ splice site of its target pre-mRNA, transformer. Thus, we chose a random region of 31 nucleotides, flanked by primer binding sites with restriction enzyme sites to allow for PCR amplification and attachment of the T7 RNA polymerase promoter for in vitro transcription. The theoretical library size or complexity was 4^31, or approximately 10^18. We used a small fraction of this library to prepare our random RNA pool (~10^12 to 10^15) for the experiments described below.
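The complexity figures quoted for randomized regions follow directly from the number of possible sequences, 4^n for n random positions. The short sketch below reproduces that arithmetic for the 8-base loop and the 31-nucleotide library. The function names are illustrative, and `fraction_sampled` makes the simplifying assumption that every molecule in the pool carries a distinct sequence, so it gives an upper bound on coverage rather than a measured value.

```python
def library_complexity(random_length: int, alphabet_size: int = 4) -> int:
    """Number of distinct sequences for a fully randomized region."""
    return alphabet_size ** random_length


def fraction_sampled(pool_molecules: float, random_length: int) -> float:
    """Upper bound on the fraction of all possible sequences present in a pool.

    Assumes (for illustration only) that every molecule is a distinct sequence.
    """
    return min(1.0, pool_molecules / library_complexity(random_length))


if __name__ == "__main__":
    print(library_complexity(8))             # 8-base randomized loop
    print(f"{library_complexity(31):.2e}")   # 31-nucleotide random region
    for pool in (1e12, 1e15):
        print(f"pool of {pool:.0e}: at most {fraction_sampled(pool, 31):.1e} of the library")
```

Under these assumptions, even the largest pool in the quoted range covers well under one percent of the possible 31-nucleotide sequences, which is consistent with the statement that only a small fraction of the library is used.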
NOTE: Figure 1 provides a summary of key steps in the iterative selection-amplification (SELEX) process.
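Why the cycle is repeated for several rounds can be illustrated with a toy calculation. The sketch below assumes a pool containing only two classes of RNA, binders and background, each retained in the bound fraction with a fixed probability per round, and assumes that amplification preserves the ratio of the two classes. The function name `enrich` and all parameter values are hypothetical and are not taken from this protocol.

```python
def enrich(initial_fraction: float,
           p_bound_binder: float,
           p_bound_background: float,
           rounds: int) -> list[float]:
    """Return the binder fraction in the pool before round 1 and after each round.

    Two-class toy model: each round keeps binders with probability
    p_bound_binder and background RNA with probability p_bound_background;
    amplification is assumed to preserve the resulting ratio.
    """
    fractions = [initial_fraction]
    f = initial_fraction
    for _ in range(rounds):
        retained_binders = f * p_bound_binder
        retained_background = (1.0 - f) * p_bound_background
        f = retained_binders / (retained_binders + retained_background)
        fractions.append(f)
    return fractions


if __name__ == "__main__":
    # Hypothetical values: 1 binder per million sequences, 100-fold
    # preferential retention of binders per round.
    for round_number, f in enumerate(enrich(1e-6, 0.5, 0.005, 6)):
        print(f"round {round_number}: binder fraction = {f:.6f}")
```

In this model a single round leaves binders as a small minority, whereas repeated rounds compound the per-round preference until binders dominate the pool. Real selections deviate from this idealization, for example through amplification bias or a range of affinities in the pool.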
1. Generation of a random library template
2. Generation of the DNA random library pool
3. Synthesis of pool 0 RNA
4. Protein binding reaction and separation of bound RNA
5. Reverse transcription and PCR amplification
6. Transcription and protein binding
7. Analysis of RNA-protein interactions
8. Cloning and sequencing
9. Sequence alignment
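Once the selected clones have been sequenced and aligned, a preferred binding site can be summarized as a per-position consensus. The sketch below is a minimal illustration that assumes the sequences are already aligned to equal length. The example sequences are made up for demonstration and are not results from this work, and the function names and the `min_fraction` threshold are likewise assumptions.

```python
from collections import Counter


def position_counts(aligned: list[str]) -> list[Counter]:
    """Count the bases observed at each column of pre-aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    return [Counter(seq[i] for seq in aligned) for i in range(length)]


def consensus(aligned: list[str], min_fraction: float = 0.5) -> str:
    """Most frequent base per column, or 'N' if it falls below min_fraction."""
    result = []
    for counts in position_counts(aligned):
        base, count = counts.most_common(1)[0]
        result.append(base if count / len(aligned) >= min_fraction else "N")
    return "".join(result)


if __name__ == "__main__":
    # Made-up toy sequences, for illustration only.
    toy_alignment = ["UCUUC", "UCUUA", "UCUUG", "ACUUC"]
    print(consensus(toy_alignment))
    print(consensus(toy_alignment, min_fraction=0.75))
```

Raising `min_fraction` marks weakly conserved positions as `N`, which helps separate positions that appear to be required for binding from those that tolerate variation.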
The following observations demonstrate successful selection-amplification (SELEX). First, we analyzed pool 0 and the selected sequences for binding to the protein used for the iterative selection-amplification approach. Figure 2 shows that the mammalian polypyrimidine-tract binding protein (PTB) shows barely detectable binding to the pool 0 sequence but high affinity for the selected sequence pool. There was barely detectable binding to pool 0 when we used ab...
Nucleic acid-binding proteins are important regulators of animal and plant development. A key requirement for the SELEX procedure is the development of an assay that can be used to separate protein-bound and unbound RNA fractions. In principle, this assay can be an in vitro binding assay such as the filter-binding assay, the gel mobility shift assay, or a matrix binding assay19 for recombinant proteins, purified proteins, or protein complexes. The assay can also be an enzymatic assay where the pre...
The author declares that he has no competing financial interests.
The author thanks the National Institutes of Health for past funding.
Name | Company | Catalog Number | Comments |
Gel Electrophoresis equipment | Standard | Standard | |
Glass Plates | Standard | Standard | |
Nitrocellulose | Millipore | HAWP | |
Nitrocellulose | Schleicher & Schuell | PROTRAN | |
polyacrylamide gel solutions | Standard | Standard | |
Proteinase K | NEB | P8107S | |
Recombinant PTB | Laboratory Preparation | Not applicable | |
Reverse Transcriptase | NEB | M0277S | |
Vacuum manifold | Fisher Scientific | XX1002500 | Millipore 25mm Glass Microanalysis Vacuum Filter |
Vacuum manifold | Millipore | XX2702552 | 1225 Sampling Vacuum Manifold |
X-ray films | Standard | Standard | |
Copyright © 2025 MyJoVE Corporation. All rights reserved