This protocol covers all the essential steps for using the new method developed to sequence RNA directly, regardless of whether the RNA sample is single-stranded, mixed, or modified. The method is not affected by enzymatic error or base complementarity. It provides a direct workflow and a general solution that makes it possible to sequence different RNA modifications simultaneously rather than one specific modification at a time.
This technology can be developed into a diagnostic tool that identifies signature RNA segments and relates them to human disease, and into an accurate tool for studying RNA modifications at the epitranscriptomic level. The method does not rely on any previous sequencing method. Instead, it is a completely new approach, and we aim to make it the go-to method for sequencing any modified RNA.
Some parts of this protocol contain critical aspects that can be conveyed more accurately by visual demonstration, such as labeling the 3' end of RNA, LC-MS data acquisition, and data analysis with the MassHunter software. I'll demonstrate this procedure, set up the LC-MS, and measure our RNA samples. To perform a one-step labeling reaction, combine 2 microliters of 150 micromolar AppCp-biotin, 3 microliters of 10x ligase reaction buffer, 1.5 microliters of 100 micromolar RNA sample, 3 microliters of anhydrous DMSO, 10 units of T4 RNA ligase, and 19.5 microliters of DEPC-treated water for a total volume of 30 microliters.
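As a quick check on the labeling reaction just described, the short Python sketch below adds up the component volumes and works out the final concentrations. The ligase volume is not stated in the protocol; the 1 microliter used here is an assumption inferred from the 30 microliter total, and the dictionary keys are illustrative labels only.

```python
# Component volumes in microliters, taken from the protocol text.
# The ligase volume is NOT given in the protocol; 1.0 uL is an
# assumption inferred from the stated 30 uL total.
volumes_ul = {
    "AppCp-biotin (150 uM stock)": 2.0,
    "10x ligase reaction buffer": 3.0,
    "RNA sample (100 uM stock)": 1.5,
    "anhydrous DMSO": 3.0,
    "T4 RNA ligase (10 units)": 1.0,  # assumed volume
    "DEPC-treated water": 19.5,
}

total_ul = sum(volumes_ul.values())


def final_concentration(stock, volume_ul, total_volume_ul):
    """Dilution formula C_final = C_stock * V_added / V_total."""
    return stock * volume_ul / total_volume_ul


appcp_final_um = final_concentration(150.0, 2.0, total_ul)  # micromolar
rna_final_um = final_concentration(100.0, 1.5, total_ul)    # micromolar
buffer_final_x = final_concentration(10.0, 3.0, total_ul)   # fold (x)
dmso_percent = 100.0 * 3.0 / total_ul                       # % v/v

# micromolar * microliters = picomoles
rna_pmol = 100.0 * 1.5

print(f"total volume: {total_ul} uL")
print(f"AppCp-biotin: {appcp_final_um} uM, RNA: {rna_final_um} uM")
print(f"buffer: {buffer_final_x}x, DMSO: {dmso_percent}% v/v")
print(f"RNA input: {rna_pmol} pmol")
```

Under that assumption the mix comes to 30 microliters, with the buffer at 1x, DMSO at 10% by volume, and a twofold molar excess of AppCp-biotin over RNA.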
Incubate the reaction overnight at 16 degrees Celsius, then perform column purification according to manuscript directions. After dividing the RNA into three aliquots and adding an equal volume of formic acid, incubate the reactions at 40 degrees Celsius. Once each reaction is finished, immediately freeze the sample on dry ice to quench the acid degradation.
Use a centrifugal vacuum concentrator to dry the sample for approximately 30 minutes. Resuspend the dried samples in 20 microliters of DEPC-treated water and combine them. Store the samples at minus 20 degrees Celsius until LC-MS measurement.
After preparing the mobile phases, transfer the RNA sample to the LC-MS sample vial. Each injection should contain 100 to 400 picomoles of RNA in 20 microliters. After acquiring the data, use the molecular feature extraction workflow to extract compound information, including mass, retention time, volume, and quality score.
Use the centroid data format in small molecule settings. Set the peak height to at least 100, but less than 1,000, and the quality score to at least 50. Finally, export the data as an Excel file.
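The thresholds above are set inside MassHunter, but they can also be re-applied to the exported table as a sanity check. The following Python sketch does that on a few made-up rows. The field names (`mass`, `rt`, `height`, `quality`) and the example values are hypothetical and do not reflect MassHunter's actual export columns, and the peak-height window is a literal reading of the spoken thresholds.

```python
def filter_compounds(compounds, min_height=100, max_height=1000, min_quality=50):
    """Keep rows with min_height <= height < max_height and quality >= min_quality.

    The default thresholds follow the spoken protocol literally;
    adjust them if your extraction settings differ.
    """
    kept = []
    for row in compounds:
        height_ok = min_height <= row["height"] < max_height
        quality_ok = row["quality"] >= min_quality
        if height_ok and quality_ok:
            kept.append(row)
    return kept


# Made-up rows for illustration only (not measured data).
example_rows = [
    {"mass": 1500.25, "rt": 6.1, "height": 450, "quality": 92},  # passes
    {"mass": 1805.30, "rt": 7.4, "height": 80, "quality": 95},   # height too low
    {"mass": 2110.34, "rt": 8.0, "height": 300, "quality": 40},  # quality too low
    {"mass": 2455.39, "rt": 8.9, "height": 100, "quality": 50},  # passes (boundary)
]

filtered = filter_compounds(example_rows)
for row in filtered:
    print(row["mass"], row["rt"], row["height"], row["quality"])
```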
The separation of the 3' ladder from the 5' ladder and other undesired fragments is shown here on a 2D mass retention time plot. It is also possible to sequence mixed samples containing two RNA strands of different lengths with a 5' biotin label on each RNA. To differentiate uridine from pseudouridine, the RNA was treated with CMC, which converts pseudouridine to a CMC-pseudouridine adduct.
The adduct has a different mass from uridine and can therefore be differentiated by 2D-HELS MS Seq. The HPLC profile of the crude product of the reaction converting pseudouridine to its CMC adduct was used to calculate the percent conversion, which was 42%. A mixture of five different RNA strands was sequenced by the 2D-HELS MS Seq approach with 3' end labeling and normalized for better visualization of multiple RNA sequences. Without the normalization, the letter codes for the sequences of the five RNAs would be crowded together.
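The reason mass alone can read a sequence, and why pseudouridine needs the CMC treatment, can be illustrated with a minimal Python sketch. It assumes the base is identified from the mass difference between adjacent ladder fragments, using the standard monoisotopic masses of the four canonical nucleotide residues. The function names, the 0.01 dalton tolerance, and the example ladder are hypothetical, and this is not the authors' algorithm.

```python
# Monoisotopic masses (Da) of the canonical nucleotide residues
# (nucleoside monophosphate minus water). Pseudouridine is an isomer
# of uridine and has the same residue mass, so it is not
# distinguishable here without derivatization.
RESIDUE_MASS = {
    "A": 329.0525,
    "C": 305.0413,
    "G": 345.0474,
    "U": 306.0253,
}


def call_base(delta, tol=0.01):
    """Return the base whose residue mass matches delta, or '?' if none does."""
    for base, mass in RESIDUE_MASS.items():
        if abs(delta - mass) <= tol:
            return base
    return "?"  # unmatched step, e.g. a modified or derivatized nucleotide


def read_ladder(masses, tol=0.01):
    """Call one base per adjacent pair of ladder masses, in ascending mass order."""
    ordered = sorted(masses)
    return "".join(call_base(b - a, tol) for a, b in zip(ordered, ordered[1:]))


# Synthetic ladder for illustration: an arbitrary starting fragment
# mass (assumed) extended by G, A, U, C in turn.
ladder = [1000.0]
for base in "GAUC":
    ladder.append(ladder[-1] + RESIDUE_MASS[base])

print(read_ladder(ladder))
```

Any step that matches none of the four residue masses is reported as `?`, which is where a modified nucleotide such as the CMC-pseudouridine adduct would show up in this simplified picture.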
While attempting this procedure, set up the labeling reaction carefully to obtain a high yield of labeled RNA samples. Also, carefully monitor the degradation time to create reliable mass ladders, and recover as much of the dried sample as possible with the 20 microliters of water. A question that can be asked after performing this protocol is how to increase the read length and throughput of the sequencing method.
A better algorithm and a more advanced instrument with higher resolution and sensitivity can help. After the development of this protocol, it became possible to directly sequence a broader range of RNA samples containing both canonical and modified nucleotides without any of the errors associated with cDNA-based RNA sequencing.