Our protocol can be used to improve the quality and quantity of gene expression data generated from FFPE-derived RNA samples, especially suboptimal-quality samples. This technique assesses RNA quality in FFPE tissue samples and enables optimization of downstream processing to improve the quantity and quality of the sequencing data. This method allows comprehensive characterization of the transcriptome in biological and clinical samples and may provide insight into functional elements of the genome in development and disease.
FFPE RNA degradation and the selection of methods for sample QC, library preparation, and data analysis can all be very tricky. Be meticulous when selecting the right methods and appropriate controls. Visual demonstration allows viewers to observe the subtle modifications and precautions needed during planning, assessment, and sequencing, especially for poorly preserved FFPE samples.
To check the quality of the RNA sample, run the sample on an RNA QC system according to the manufacturer's instructions and use the appropriate bioanalysis software to analyze the distribution of the RNA fragments within the sample and to calculate the proportion of fragments longer than 100 and 200 nucleotides. Based on the QC metrics, identify the samples with fewer than 40% of fragments longer than 100 nucleotides so that these samples can be excluded from processing. For sequencing, thaw the appropriate sequencing reagent kits according to the run parameters in a room temperature water bath and place the reagent tray at four degrees Celsius after thawing.
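The exclusion rule above can be sketched in Python. This is a minimal illustration, not the bioanalysis software's own calculation: it assumes the trace has been exported as (fragment size, signal) pairs, and the function names, sample names, and values are all hypothetical.

```python
def dv_fraction(fragments, threshold_nt):
    """Fraction of total signal from fragments longer than threshold_nt.

    fragments: list of (size_nt, signal) pairs (a hypothetical trace export).
    """
    total = sum(signal for _, signal in fragments)
    if total <= 0:
        raise ValueError("trace has no signal")
    above = sum(signal for size, signal in fragments if size > threshold_nt)
    return above / total


def flag_low_quality(samples, cutoff=0.40, threshold_nt=100):
    """Return the names of samples whose long-fragment fraction is below cutoff."""
    return [
        name
        for name, fragments in samples.items()
        if dv_fraction(fragments, threshold_nt) < cutoff
    ]


# Made-up traces for illustration only.
samples = {
    "sample_A": [(50, 10.0), (150, 30.0), (300, 60.0)],  # mostly long fragments
    "sample_B": [(50, 70.0), (150, 20.0), (300, 10.0)],  # heavily degraded
}
excluded = flag_low_quality(samples)
print("Excluded from processing:", excluded)
```

Calling dv_fraction with a threshold of 200 gives the corresponding proportion for the longer cutoff, so both metrics can be compared on the same trace.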
Place the flow cell package at room temperature for 30 minutes. While the package is warming, open the Illumina Experiment Manager application and select Create Sample Sheet. Select the sequencer to be used and click Next.
Enter the reagent kit barcode and the other appropriate workflow parameters. Then upload a sample sheet created according to the Illumina sequence criteria. To prepare the sample library and the PhiX control library, denature and dilute the libraries to the appropriate concentrations and mix the sample and PhiX control libraries to obtain a 5% PhiX control volume ratio.
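The 5% volume ratio is simple arithmetic, sketched here in Python. The sketch assumes that the sample and PhiX libraries have already been denatured and diluted to the same concentration, so that a volume ratio is meaningful. The function name and the 600 microliter loading volume are hypothetical; use the volume specified for the reagent kit.

```python
def phix_mix_volumes(total_volume_ul, phix_fraction=0.05):
    """Split a final loading volume into sample-library and PhiX volumes."""
    if total_volume_ul <= 0:
        raise ValueError("total volume must be positive")
    if not 0 <= phix_fraction <= 1:
        raise ValueError("phix_fraction must be between 0 and 1")
    phix_ul = total_volume_ul * phix_fraction
    library_ul = total_volume_ul - phix_ul
    return library_ul, phix_ul


# Hypothetical final loading volume of 600 microliters.
library_ul, phix_ul = phix_mix_volumes(600.0)
print(f"sample library: {library_ul:.1f} uL, PhiX control: {phix_ul:.1f} uL")
```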
When the flow cells are ready, remove the wrapping, clean the glass surface with a lint-free alcohol wipe, and dry the glass with a low-lint laboratory tissue. Remove the reagent cartridge from four degrees Celsius storage and invert the cartridge five times to mix the reagents. Gently tap the cartridge on the bench to reduce air bubbles and load the denatured and diluted sample into the designated reservoir of the reagent cartridge.
Load the flow cell, the buffer cartridge, and the reagent cartridge onto the system. Then perform an automated check and review the output to ensure that the run parameters pass the system check. When the automated check is complete, select Start to begin the sequencing run.
To visualize the pre-processing QC and post-alignment QC results, run MultiQC to generate an aggregated report in HTML format. To determine batch effects and to assess the quality of the given dataset, use an R script to run a principal component analysis. Sample correlation analysis can then be carried out using the Pearson correlation between different samples.
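The sample correlation step can be illustrated with a short Python sketch. This is not the R script used in the protocol; it only shows how a pairwise Pearson correlation between samples is computed. The sample names and expression values are made up, and in practice the inputs would be normalized expression values for the same genes in the same order.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (norm_x * norm_y)


def correlation_matrix(expression):
    """Pairwise Pearson correlations, keyed by (sample, sample) name pairs."""
    names = sorted(expression)
    return {(a, b): pearson(expression[a], expression[b])
            for a in names for b in names}


# Made-up expression values for four genes in three samples.
expression = {
    "rep1": [1.0, 2.0, 3.0, 4.0],
    "rep2": [2.0, 4.0, 6.0, 8.0],
    "other": [4.0, 3.0, 2.0, 1.0],
}
corr = correlation_matrix(expression)
for (a, b), r in sorted(corr.items()):
    print(f"{a} vs {b}: r = {r:.3f}")
```

Replicates of the same sample would be expected to correlate more strongly with each other than with unrelated samples, which is the pattern to look for in the resulting matrix.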
The estimation of the proportion of fragments longer than 100 nucleotides is more useful than an estimation of the fragments longer than 200 nucleotides because it is more sensitive for accurately measuring the proportion of smaller fragment sizes for highly degraded RNA samples. Given the high degree of degradation in the sample set, a total RNA library preparation method is recommended. For example, here sequencing libraries prepared from four different samples are shown.
Here, overviews of raw FastQ file sequencing quality and sample adapter content are shown. FastQ Screen can help detect contamination, such as bacterial or mouse contamination, within the samples. STAR alignment can be used to determine the proportion of reads mapped to the reference genome, the percentage of reads uniquely mapped to the reference genome, and the proportion of reads that were not mapped or that were mapped to multiple loci.
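The alignment proportions described above follow directly from read counts, as in this Python sketch. It takes the counts as plain numbers and does not parse any aligner output file. The function name and the read counts are hypothetical.

```python
def alignment_summary(total_reads, uniquely_mapped, multi_mapped):
    """Percentages of uniquely mapped, multi-mapped, mapped, and unmapped reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    unmapped = total_reads - uniquely_mapped - multi_mapped
    if unmapped < 0:
        raise ValueError("mapped reads exceed total reads")

    def pct(count):
        return 100.0 * count / total_reads

    return {
        "uniquely_mapped_pct": pct(uniquely_mapped),
        "multi_mapped_pct": pct(multi_mapped),
        "mapped_pct": pct(uniquely_mapped + multi_mapped),
        "unmapped_pct": pct(unmapped),
    }


# Made-up read counts for one library.
summary = alignment_summary(
    total_reads=1_000_000, uniquely_mapped=700_000, multi_mapped=200_000
)
for key, value in summary.items():
    print(f"{key}: {value:.1f}")
```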
Picard and RSeQC statistics can be used to determine the percentages of messenger RNA, intronic, and intergenic bases present in the mapped reads. For these representative analyses, some degraded libraries had a three prime bias in which more reads were mapped closer to the three prime end than to the five prime end. In addition, principal component analysis can be performed to determine the percent of the total variation captured by the principal components for the biological replicates of each sample.
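One simple way to put a number on the three prime bias is to compare read coverage at the two ends of the transcripts. The Python sketch below is an illustrative metric, not the statistic reported by any particular QC tool. It assumes a coverage profile binned from the five prime end to the three prime end, and the function name, the 20% window, and the coverage values are hypothetical.

```python
def three_prime_bias(coverage, end_fraction=0.2):
    """Ratio of mean coverage at the three prime end to the five prime end.

    coverage: read coverage per bin, ordered from five prime to three prime.
    A ratio well above 1 suggests reads pile up near the three prime end.
    """
    if not coverage:
        raise ValueError("coverage profile is empty")
    window = max(1, int(len(coverage) * end_fraction))
    five_prime_mean = sum(coverage[:window]) / window
    three_prime_mean = sum(coverage[-window:]) / window
    if five_prime_mean == 0:
        raise ValueError("no coverage at the five prime end")
    return three_prime_mean / five_prime_mean


# Made-up profile with coverage rising toward the three prime end.
profile = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
ratio = three_prime_bias(profile)
print(f"three prime to five prime coverage ratio: {ratio:.2f}")
```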
Take care to avoid sample degradation, prepare replicates and multiple cuts for each sample, use the appropriate metrics to assess the sample quality, and use the right library preparation method. Using this methodology, larger NGS studies can be performed with previously unusable FFPE samples with varying storage times and conditions to gain insights into a variety of different disease states. This technique paves the way for researchers to study existing large archives of FFPE samples with rich clinical information and can greatly enhance population-based cancer studies.