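Alternative splicing (AS) and alternative polyadenylation (APA) expand the diversity of transcript isoforms and their products. Here, we describe bioinformatics protocols for analyzing bulk RNA-seq and 3' end sequencing assays to detect and visualize AS and APA that vary across experimental conditions.
Beyond the typical RNA-Seq analysis that measures differential gene expression (DGE) across experimental or biological conditions, RNA-seq data can also be used to explore other complex regulatory mechanisms at the exon level. Alternative splicing and polyadenylation regulate gene expression at the post-transcriptional level by generating different isoforms, and they play a crucial role in the functional diversity of a gene; limiting the analysis to the whole-gene level can miss this important regulatory layer. Here, we demonstrate detailed step-by-step analyses to identify and visualize differential exon and polyadenylation site usage across conditions, using Bioconductor and other packages and functions, including DEXSeq, diffSplice from the Limma package, and rMATS.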
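RNA-seq has been widely used over the years to estimate differential gene expression and for gene discovery [1]. In addition, it can be used to estimate differential exon-level usage caused by genes expressing different isoforms, which contributes to a better understanding of gene regulation at the post-transcriptional level. Most eukaryotic genes generate different isoforms through alternative splicing (AS) to increase the diversity of mRNA expression. AS events can be divided into different patterns: skipping of complete exons (SE), where a ("cassette") exon is completely removed from the transcript together with its flanking introns; alternative (donor) 5' splice site selection (A5SS) and alternative 3' (acceptor) splice site selection (A3SS), when two or more splice sites are present at either end of an exon; retention of introns (RI), when an intron is retained within the mature mRNA transcript; and mutually exclusive exon usage (MXE), where only one of the two available exons can be retained at a time [2,3]. Alternative polyadenylation (APA) also plays an important role in regulating gene expression, using alternative poly(A) sites to generate multiple mRNA isoforms from a single transcript [4]. Most polyadenylation sites (pAs) are located in the 3' untranslated region (3' UTR), generating mRNA isoforms with different 3' UTR lengths. Because the 3' UTR is the central hub for recognizing regulatory elements, different 3' UTR lengths can affect mRNA localization, stability, and translation [5]. There is a class of 3' end sequencing assays optimized to detect APA, which differ in the details of the protocol [6]. The pipeline described here is designed for PolyA-seq, but can be adapted to the other protocols as described.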
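To make the link between poly(A) site choice and 3' UTR length concrete, the following minimal Python sketch computes the 3' UTR length produced by each pA site of a hypothetical gene. The function name `utr3_lengths`, the coordinates, and the assumption that the 3' UTR is a single unspliced interval starting right after the stop codon are illustrative simplifications and are not part of the protocol.

```python
def utr3_lengths(stop_codon_end, pa_sites, strand="+"):
    """Return {pA site: 3' UTR length} for one gene.

    stop_codon_end is the genomic position of the last base of the stop codon
    in transcript orientation. Assumes (hypothetically) an unspliced 3' UTR.
    """
    lengths = {}
    for site in pa_sites:
        if strand == "+":
            length = site - stop_codon_end
        else:
            # On the minus strand the UTR extends toward smaller coordinates.
            length = stop_codon_end - site
        if length <= 0:
            raise ValueError(f"pA site {site} is not downstream of the stop codon")
        lengths[site] = length
    return lengths


# Hypothetical plus-strand gene with a proximal and a distal pA site.
isoform_utrs = utr3_lengths(stop_codon_end=1000, pa_sites=[1300, 2500])
print(isoform_utrs)
```

In this toy case the proximal site yields a short 3' UTR isoform and the distal site a long one, which is the kind of difference the APA analysis is meant to detect.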
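In this study, we present differential exon analysis methods [7,8] (Figure 1), which can be divided into two broad categories: exon-based (DEXSeq [9], diffSplice [10]) and event-based (replicate Multivariate Analysis of Transcript Splicing (rMATS) [11]). The exon-based methods compare the fold change of individual exons across conditions against a measure of the overall gene fold change to call differential exon usage, and from that compute a gene-level measure of AS activity. Event-based methods use exon-intron-spanning junction reads to detect and classify specific splicing events, such as exon skipping or intron retention, and distinguish these AS types in the output [3]. These methods therefore provide complementary views for a complete analysis of AS [12,13]. We selected DEXSeq (based on the DESeq2 [14] DGE package) and diffSplice (based on the Limma [10] DGE package) for the study because they are among the most widely used packages for differential splicing analysis. rMATS was chosen as a popular method for event-based analysis. Another popular event-based method is MISO (Mixture of Isoforms) [1]. For APA, we adapt the exon-based approach.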
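The idea behind the exon-based methods can be illustrated with a small Python sketch that compares the log2 fold change of each exon with the log2 fold change of the remaining exons of the same gene. This is a simplified illustration with made-up counts, a hypothetical function name, and an arbitrary pseudocount; it assumes the counts are already normalized for library size and is not the statistical model used by DEXSeq or diffSplice.

```python
import math


def relative_exon_log2fc(counts_a, counts_b, pseudocount=0.5):
    """Per exon: log2(B/A) of the exon minus log2(B/A) of all other exons of the gene."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    result = []
    for a, b in zip(counts_a, counts_b):
        exon_lfc = math.log2((b + pseudocount) / (a + pseudocount))
        rest_lfc = math.log2((total_b - b + pseudocount) / (total_a - a + pseudocount))
        result.append(exon_lfc - rest_lfc)
    return result


# Hypothetical three-exon gene: only the middle exon drops in condition B.
rel = relative_exon_log2fc([100, 100, 100], [100, 25, 100])
print([round(x, 2) for x in rel])
```

A strongly negative value for the middle exon, with the flanking exons shifted slightly in the opposite direction, is the signature of reduced usage of that exon rather than a change in overall gene expression.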
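Figure 1. Analysis pipeline. Flowchart of the steps used in the analysis. The steps include obtaining the data, performing quality checks and read alignment, counting reads using annotations for known exons, introns, and pA sites, filtering to remove low counts, and normalization. PolyA-seq data were analyzed for alternative pA sites using the diffSplice/DEXSeq methods, bulk RNA-Seq was analyzed for alternative splicing at the exon level with the diffSplice/DEXSeq methods, and AS events were analyzed with rMATS.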
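The RNA-seq data used in this survey were obtained from Gene Expression Omnibus (GEO) (GSE138691) [15]. We used mouse RNA-seq data from this study with two condition groups: wild-type (WT) and Muscleblind-like type 1 knockout (Mbnl1 KO), with three replicates each. To demonstrate differential polyadenylation site usage analysis, we obtained mouse embryonic fibroblast (MEF) PolyA-seq data (GEO accession GSE60487) [16]. The data have four condition groups: wild-type (WT), Muscleblind-like type 1/type 2 double knockout (Mbnl1/2 DKO), Mbnl1/2 DKO with Mbnl3 knockdown (KD), and Mbnl1/2 DKO with Mbnl3 control (Ctrl). Each condition group consists of two replicates.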
Experiment | GEO accession | SRA run number | Sample name | Condition | Replicate | Tissue | Sequencing | Read length |
RNA-Seq | GSM4116218 | SRR10261601 | Mbnl1KO_Thymus_1 | Mbnl1 knockout | Rep 1 | Thymus | Paired-end | 100 bp |
RNA-Seq | GSM4116219 | SRR10261602 | Mbnl1KO_Thymus_2 | Mbnl1 knockout | Rep 2 | Thymus | Paired-end | 100 bp |
RNA-Seq | GSM4116220 | SRR10261603 | Mbnl1KO_Thymus_3 | Mbnl1 knockout | Rep 3 | Thymus | Paired-end | 100 bp |
RNA-Seq | GSM4116221 | SRR10261604 | WT_Thymus_1 | Wild type | Rep 1 | Thymus | Paired-end | 100 bp |
RNA-Seq | GSM4116222 | SRR10261605 | WT_Thymus_2 | Wild type | Rep 2 | Thymus | Paired-end | 100 bp |
RNA-Seq | GSM4116223 | SRR10261606 | WT_Thymus_3 | Wild type | Rep 3 | Thymus | Paired-end | 100 bp |
3P-Seq | GSM1480973 | SRR1553129 | WT_1 | Wild type (WT) | Rep 1 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480974 | SRR1553130 | WT_2 | Wild type (WT) | Rep 2 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480975 | SRR1553131 | DKO_1 | Mbnl1/2 double knockout (DKO) | Rep 1 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480976 | SRR1553132 | DKO_2 | Mbnl1/2 double knockout (DKO) | Rep 2 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480977 | SRR1553133 | DKOsiRNA_1 | Mbnl1/2 double knockout with Mbnl3 siRNA (KD) | Rep 1 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480978 | SRR1553134 | DKOsiRNA_2 | Mbnl1/2 double knockout with Mbnl3 siRNA (KD) | Rep 2 | Mouse embryonic fibroblasts (MEFs) | Single-end | 36 bp |
3P-Seq | GSM1480979 | SRR1553135 | DKONTsiRNA_1 | Mbnl1/2 double knockout with non-targeting siRNA (Ctrl) | Rep 1 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
3P-Seq | GSM1480980 | SRR1553136 | DKONTsiRNA_2 | Mbnl1/2 double knockout with non-targeting siRNA (Ctrl) | Rep 2 | Mouse embryonic fibroblasts (MEFs) | Single-end | 40 bp |
Table 1. Summary of the RNA-Seq and PolyA-seq datasets used for the analysis.
1. Installation of tools and R packages used in the analysis
conda install -c daler sratoolkit
conda install -c conda-forge parallel
conda install -c bioconda star bowtie fastqc rmats rmats2sashimiplot samtools fasterq-dump cutadapt bedtools deeptools
bioc_packages<- c("DEXSeq", "Rsubread", "EnhancedVolcano", "edgeR", "limma", "maser","GenomicRanges")
packages<- c("magrittr", "rtracklayer", "tidyverse", "openxlsx", "BiocManager")
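# Install any CRAN packages that are not already installed (this includes BiocManager)
installed_packages <- packages %in% rownames(installed.packages())
if (any(!installed_packages)) {
install.packages(packages[!installed_packages], dependencies=TRUE)
}
# Install any missing Bioconductor packages, independently of the CRAN check
installed_bioc_packages <- bioc_packages %in% rownames(installed.packages())
if (any(!installed_bioc_packages)) {
BiocManager::install(bioc_packages[!installed_bioc_packages], dependencies=TRUE)
}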
2. Alternative splicing (AS) analysis using RNA-seq
seq 10261601 10261606 | parallel prefetch SRR{}
parallel -j 3 fastq-dump --gzip --skip-technical --read-filter pass --dumpbase --split-e --clip --origfmt {} :::
wget -nv -O annotation.gtf.gz http://ftp.ensembl.org/pub/release-103/gtf/mus_musculus/Mus_musculus.GRCm39.103.gtf.gz && gunzip -f annotation.gtf.gz
wget -nv -O genome.fa.gz http://ftp.ensembl.org/pub/release-103/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz && gunzip -f genome.fa.gz
GTF=$(readlink -f annotation.gtf)
GENOME=$(readlink -f genome.fa)
mkdir fastqc_out
parallel "fastqc {} -o fastqc_out" ::: $RAW_DATA/*.fastq.gz
#Build STAR index
GDIR=STAR_indices
mkdir $GDIR
STAR --runMode genomeGenerate --genomeFastaFiles $GENOME --sjdbGTFfile $GTF --runThreadN 8 --genomeDir $GDIR
ODIR=results/mapping
mkdir -p $ODIR
#Align reads to the genome
for fq1 in $RAW_DATA/*R1.fastq.gz;
do
# Derive the mate file and the per-sample output prefix from the R1 file name
fq2=$(echo $fq1| sed 's/1.fastq.gz/2.fastq.gz/g');
OUTPUT=$(basename ${fq1}| sed 's/R1.fastq.gz//g');
STAR --genomeDir $GDIR \
--runThreadN 12 \
--readFilesCommand zcat \
--readFilesIn ${fq1} ${fq2} \
--outFileNamePrefix $ODIR/${OUTPUT} \
--outSAMtype BAM SortedByCoordinate \
--outSAMunmapped Within \
--outSAMattributes Standard
done
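The file-name handling in the alignment loop (finding the mate of each R1 file and deriving the output prefix) can be sketched in Python as follows. The function name `mate_and_prefix`, the suffix arguments, and the example path are hypothetical; the sketch only anchors the suffix match at the end of the file name and does not run STAR.

```python
from pathlib import PurePosixPath


def mate_and_prefix(fq1, r1_suffix="R1.fastq.gz", r2_suffix="R2.fastq.gz"):
    """Return (path of the R2 mate, output prefix) for an R1 FASTQ path."""
    path = PurePosixPath(fq1)
    if not path.name.endswith(r1_suffix):
        raise ValueError(f"{fq1} does not end with {r1_suffix}")
    # Strip the R1 suffix once, at the end of the file name only.
    prefix = path.name[: -len(r1_suffix)]
    fq2 = path.with_name(prefix + r2_suffix)
    return str(fq2), prefix


# Hypothetical file name following the *R1.fastq.gz pattern used by the loop.
fq2, prefix = mate_and_prefix("raw_data/WT_Thymus_1_R1.fastq.gz")
print(fq2, prefix)
```

Raising an error for unexpected file names makes a mismatch between the download naming scheme and the `*R1.fastq.gz` pattern visible before any alignment is started.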
Rscript prepare_mm_exon_annotation.R annotation.gtf
packages<- c("Rsubread","tidyverse", "magrittr", "EnhancedVolcano", "edgeR","openxlsx")
invisible(lapply(packages, library, character.only=TRUE))
load("mm_exon_anno.RData")
countData <- dir("bams", pattern=".bam$", full.names=T) %>%
featureCounts(annot.ext=anno,
isGTFAnnotationFile=FALSE,
minMQS=0,useMetaFeatures=FALSE,
allowMultiOverlap=TRUE,
largestOverlap=TRUE,
countMultiMappingReads=FALSE,
primaryOnly=TRUE,
isPairedEnd=TRUE,
nthreads=12)
# Non-specific filtering: Remove the exons with low counts
isexpr<- rownames(countData$counts)[rowSums(cpm(countData$counts)>1) >=3]
countData$counts<-countData$counts[rownames(countData$counts) %in%isexpr, ]
anno<-anno%>% filter(GeneID%in% rownames(countData$counts))
# Remove genes with only 1 site and NA in geneIDs
dn<-anno%>%group_by(GeneID)%>%summarise(nsites=n())%>% filter(nsites>1&!is.na(GeneID))
anno<-anno%>% filter(GeneID%in%dn$GeneID)
countData$counts<-countData$counts[rownames(countData$counts) %in%anno$GeneID, ]
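The non-specific filter above keeps features with more than 1 count per million (CPM) in at least 3 samples. A minimal Python sketch of that rule, using only the standard library and a made-up count matrix, is shown here; the function names are hypothetical and library sizes are taken as plain column sums, without any normalization factors.

```python
def cpm(counts):
    """Convert a features x samples count matrix (list of rows) to counts per million."""
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [[row[j] * 1e6 / lib_sizes[j] for j in range(n_samples)] for row in counts]


def keep_expressed(counts, min_cpm=1.0, min_samples=3):
    """True for each row with CPM > min_cpm in at least min_samples samples."""
    return [sum(value > min_cpm for value in row) >= min_samples for row in cpm(counts)]


# Toy matrix with six samples; every column sums to exactly one million reads.
toy_counts = [
    [999985, 999985, 999990, 999990, 999990, 999990],  # abundant feature
    [9, 9, 9, 9, 9, 9],                                # CPM 9 in all samples
    [1, 1, 1, 1, 1, 1],                                # CPM exactly 1, not above it
    [5, 5, 0, 0, 0, 0],                                # above threshold in only 2 samples
]
kept = keep_expressed(toy_counts)
print(kept)
```

The threshold of 3 samples matches the size of the smallest condition group in the RNA-seq design, so a feature expressed in only one group can still pass the filter.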
library(DEXSeq)
sampleTable<-data.frame(row.names= c("Mbnl1KO_Thymus_1", "Mbnl1KO_Thymus_2", "Mbnl1KO_Thymus_3", "WT_Thymus_1", "WT_Thymus_2", "WT_Thymus_3"), condition= rep(c("Mbnl1_KO", "WT"),c(3,3)), libType= rep(c("paired-end")))
exoninfo<-anno[anno$GeneID%in% rownames(countData$counts),]
exoninfo<-GRanges(seqnames=anno$Chr,
ranges=IRanges(start=anno$Start, end=anno$End, width=anno$Width),strand=Rle(anno$Strand))
mcols(exoninfo)$TranscriptIDs<-anno$TranscriptIDs
mcols(exoninfo)$Ticker<-anno$Ticker
mcols(exoninfo)$ExonID<-anno$ExonID
mcols(exoninfo)$n<-anno$n
mcols(exoninfo)$GeneID<-anno$GeneID
transcripts_l= strsplit(exoninfo$TranscriptIDs, "\\,")
save(countData, sampleTable, exoninfo, transcripts_l, file="AS_countdata.RData")
dxd<-DEXSeqDataSet(countData$counts,sampleData=sampleTable, design=~sample+exon+condition:exon,featureID=exoninfo$ExonID,groupID=exoninfo$GeneID,featureRanges=exoninfo, transcripts=transcripts_l)
dxd %<>% estimateSizeFactors %>% estimateDispersions %T>% plotDispEsts
dxd %<>% testForDEU %>% estimateExonFoldChanges(fitExpToVar="condition") # Estimate fold changes
dxr = DEXSeqResults(dxd)
plotDEXSeq(dxr, "Wnk1", displayTranscripts=TRUE, splicing=TRUE, legend=TRUE, cex.axis=1.2, cex=1.3, lwd=2)
library(limma)
library(edgeR)
mycounts=countData$counts
#Change the rownames of the countdata to exon Ids instead of genes for unique rownames.
rownames(mycounts) = exoninfo$ExonID
dge<-DGEList(counts=mycounts)
#Filtering
isexpr<- rowSums(cpm(dge)>1) >=3
dge<-dge[isexpr,,keep.lib.sizes=FALSE]
#Extract the exon annotations for only transcripts meeting non-specific filter
exoninfo=anno%>% filter(ExonID%in% rownames(dge$counts))
#Convert the exoninfo into GRanges object
exoninfo1<-GRanges(seqnames=exoninfo$Chr,
ranges=IRanges(start=exoninfo$Start, end=exoninfo$End, width=exoninfo$Width),strand=Rle(exoninfo$Strand))
mcols(exoninfo1)$TranscriptIDs<-exoninfo$TranscriptIDs
mcols(exoninfo1)$Ticker<-exoninfo$Ticker
mcols(exoninfo1)$ExonID<-exoninfo$ExonID
mcols(exoninfo1)$n<-exoninfo$n
mcols(exoninfo1)$GeneID<-exoninfo$GeneID
transcripts_l= strsplit(exoninfo1$TranscriptIDs, "\\,")
dge<-calcNormFactors(dge)
Treat<- factor(sampleTable$condition)
design<- model.matrix(~0+Treat)
colnames(design) <- levels(Treat)
v<-voom(dge,design,plot=FALSE)
fit<-lmFit(v,design)
fit<-eBayes(fit)
colnames(fit)
cont.matrix<-makeContrasts(
Mbnl1_KO_WT=Mbnl1_KO-WT,
levels=design)
fit2<-contrasts.fit(fit,cont.matrix)
ex<-diffSplice(fit2,geneid=exoninfo$GeneID,exonid=exoninfo$ExonID)
ts<-topSplice(ex,n=Inf,FDR=0.1, test="t", sort.by="logFC")
tg<-topSplice(ex,n=Inf,FDR=0.1, test="simes")
plotSplice(ex,geneid="Wnk1", FDR=0.1)
#Volcano plot
EnhancedVolcano(ts,lab=ts$ExonID,selectLab= head((ts$ExonID),2000), xlab= bquote(~Log[2]~'fold change'), x='logFC', y='P.Value', title='Volcano Plot', subtitle='Mbnl1_KO vs WT (Limma_diffSplice)', FCcutoff=2, labSize=4,legendPosition="right", caption= bquote(~Log[2]~"Fold change cutoff, 2; FDR 10%"))
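The gene-level ranking requested with `test="simes"` combines the exon-level p-values of each gene into one p-value. As a hedged illustration of the Simes combination itself, the sketch below applies the textbook formula (the minimum over ranks of n times the i-th smallest p-value divided by i) to made-up p-values; it is not limma's implementation, and the function name is hypothetical.

```python
def simes_pvalue(pvalues):
    """Combine the exon-level p-values of one gene into a single p-value (Simes method)."""
    ordered = sorted(pvalues)
    n = len(ordered)
    return min(n * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical exon-level p-values for one gene with three exons.
gene_p = simes_pvalue([0.04, 0.01, 0.5])
print(gene_p)
```

Because the smallest p-value is multiplied by the number of exons, a gene with many tested exons needs a stronger single-exon signal to reach the same gene-level significance.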
mkdir rMATS_analysis
cd bams/
ls -pd "$PWD"/*| grep "WT"| tr '\n' ',' > Wt.txt
ls -pd "$PWD"/*| grep "Mb"| tr '\n' ',' > KO.txt
mv *.txt ../rMATS_analysis
python rmats-turbo/rmats.py --b1 KO.txt --b2 Wt.txt --gtf annotation.gtf -t paired --readLength 50 --nthread 8 --od rmats_out/ --tmp rmats_tmp --task pos
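The two text files passed to `--b1` and `--b2` each hold a comma-separated list of BAM paths. A small Python sketch of building such a file is given below; the function name `write_bam_list` and the placeholder file names are hypothetical, and unlike the `grep` commands the pattern is matched against the file name only, not the full path. Joining with commas also avoids a trailing comma.

```python
import tempfile
from pathlib import Path


def write_bam_list(bam_dir, pattern, out_file):
    """Write absolute paths of BAM files whose name contains pattern as one comma-separated line."""
    bams = sorted(str(p.resolve()) for p in Path(bam_dir).glob("*.bam") if pattern in p.name)
    if not bams:
        raise FileNotFoundError(f"no BAM files containing {pattern!r} in {bam_dir}")
    Path(out_file).write_text(",".join(bams))
    return bams


# Demonstration in a temporary directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["WT_Thymus_1.bam", "WT_Thymus_2.bam", "Mbnl1KO_Thymus_1.bam"]:
        (Path(tmp) / name).touch()
    wt_bams = write_bam_list(tmp, "WT", Path(tmp) / "Wt.txt")
    wt_line = (Path(tmp) / "Wt.txt").read_text()

print(wt_line)
```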
library(maser)
mbnl1<-maser("/rmats_out/", c("WT","Mbnl1_KO"), ftype="JCEC")
#Filtering out events by coverage
mbnl1_filt<-filterByCoverage(mbnl1,avg_reads=5)
#Top splicing events at 10% FDR
mbnl1_top<-topEvents(mbnl1_filt,fdr=0.1, deltaPSI=0.1)
mbnl1_top
#Check the gene events for a particular gene
mbnl1_wnk1<-geneEvents(mbnl1_filt,geneS="Wnk1", fdr=0.1, deltaPSI=0.1)
maser::display(mbnl1_wnk1,"SE")
plotGenePSI(mbnl1_wnk1, type="SE", show_replicates=TRUE)
# Keep each "+" at the end of a line so the ggplot2 layers are added to the volcano plot
volcano(mbnl1_filt, fdr=0.1, deltaPSI=0.1, type="SE") +
xlab("deltaPSI") + ylab("Log10 Adj. Pvalue") + ggtitle("Volcano Plot of exon skipping events")
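The `deltaPSI` threshold used above refers to the difference in percent spliced in (PSI) between the two conditions. The sketch below shows one common way to compute PSI for an exon-skipping event from inclusion and skipping junction reads, each normalized by an effective length, followed by the difference of the group means. All read counts, effective lengths, and function names are hypothetical, and the sketch omits the statistical model that rMATS fits.

```python
def psi(inclusion_reads, skipping_reads, inclusion_len, skipping_len):
    """Percent spliced in from junction read counts normalized by effective length."""
    inc = inclusion_reads / inclusion_len
    skp = skipping_reads / skipping_len
    if inc + skp == 0:
        return float("nan")  # no informative reads for this event
    return inc / (inc + skp)


def mean(values):
    return sum(values) / len(values)


def delta_psi(psi_group1, psi_group2):
    """Difference of mean PSI between two groups of replicates."""
    return mean(psi_group1) - mean(psi_group2)


# Inclusion reads can map to two junctions and skipping reads to one, hence lengths 2 and 1 here.
example_psi = psi(inclusion_reads=40, skipping_reads=20, inclusion_len=2, skipping_len=1)
example_delta = delta_psi([0.8, 0.7, 0.9], [0.5, 0.4, 0.6])
print(example_psi, round(example_delta, 3))
```

With a cutoff of 0.1, the toy event above (a difference of about 0.3) would pass the `deltaPSI` filter, provided it also met the FDR threshold.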
python ./src/rmats2sashimiplot/rmats2sashimiplot.py --b1 ../bams/WT_Thymus_1.bam,../bams/WT_Thymus_2.bam,../bams/WT_Thymus_3.bam --b2 ../bams/Mbnl1KO_Thymus_1.bam,../bams/Mbnl1KO_Thymus_2.bam,../bams/Mbnl1KO_Thymus_3.bam -t SE -e ../rMATS_analysis/rmats_out/SE.MATS.JC.txt --l1 WT --l2 Mbnl1_KO --exon_s 1 --intron_s 5 -o ../rMATS_analysis/rmats2shasmi_output
3. Alternative polyadenylation (APA) analysis using 3' end sequencing
anno<- read.table(file= "flanking60added.pA_annotation.bed",
stringsAsFactors=FALSE, check.names=FALSE, header=FALSE, sep="")
colnames(anno) <- c("chrom", "chromStart", "chromEnd", "name", "score", "strand", "rep", "annotation", "gene_name", "gene_id")
anno<- dplyr::select(anno,name,chrom, chromStart,chromEnd, strand,gene_id,gene_name,rep)
colnames(anno) <- c("GeneID", "Chr", "Start", "End", "Strand", "Ensembl", "Symbol", "repID")
countData<- dir("bamfiles", pattern="sorted.bam$", full.names=TRUE) %>%
# Read all bam files as input for featureCounts
featureCounts(annot.ext=anno, isGTFAnnotationFile= FALSE,minMQS=0,useMetaFeatures= TRUE,allowMultiOverlap=TRUE, largestOverlap= TRUE,strandSpecific=1, countMultiMappingReads =TRUE,primaryOnly= TRUE,isPairedEnd= FALSE,nthreads=12)%T>%
save(file="APA_countData.Rdata")
load(file= "APA_countData.Rdata")# Skip this step if already loaded
# Non-specific filtering: keep only pA sites with CPM > 1 in at least 2 samples (removes low-count sites)
countData<-countData$counts%>%as.data.frame%>% .[rowSums(edgeR::cpm(.)>1) >=2, ]
anno%<>% .[.$GeneID%in% rownames(countData), ]
# Remove genes with only 1 site and NA in geneIDs
dnsites<-anno%>%group_by(Symbol)%>%summarise(nsites=n())%>% filter(nsites>1&!is.na(Symbol))
anno<-anno%>% filter(Symbol%in%dnsites$Symbol)
countData<-countData[rownames(countData) %in%anno$GeneID, ]
c("DEXSeq", "GenomicRanges") %>% lapply(library, character.only=TRUE) %>%invisible
sampleTable1<- data.frame(row.names= c("WT_1","WT_2","DKO_1","DKO_2"),
condition= c(rep("WT", 2), rep("DKO", 2)),
libType= rep("single-end", 4))
# Prepare the GRanges object for DEXSeqDataSet object construction
PASinfo <- GRanges(seqnames = anno$Chr,
ranges = IRanges(start = anno$Start, end = anno$End),strand = Rle(anno$Strand))
mcols(PASinfo)$PASID<-anno$repID
mcols(PASinfo)$GeneEns<-anno$Ensembl
mcols(PASinfo)$GeneID<-anno$Symbol
# Prepare the new feature IDs, replace the strand information with letters to match the current pA site clusterID
new.featureID <- anno$Strand %>% as.character %>% replace(. %in% "+", "F") %>% replace(. %in% "-", "R") %>% paste0(as.character(anno$repID), .)
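The feature ID construction above appends a strand letter to each pA cluster ID. The same mapping can be sketched in Python as follows; the function name and the example cluster IDs are hypothetical, and strand values other than "+" and "-" are passed through unchanged, as in the R pipeline.

```python
def make_feature_ids(rep_ids, strands):
    """Append F (+ strand) or R (- strand) to each pA cluster ID."""
    suffix = {"+": "F", "-": "R"}
    return [f"{rep_id}{suffix.get(strand, strand)}" for rep_id, strand in zip(rep_ids, strands)]


# Hypothetical cluster IDs; real IDs come from the pA annotation file.
feature_ids = make_feature_ids(["chr1:1000", "chr2:500"], ["+", "-"])
print(feature_ids)
```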
# Select the read counts of the condition WT and DKO
countData1<- dplyr::select(countData, SRR1553129.sorted.bam, SRR1553130.sorted.bam, SRR1553131.sorted.bam, SRR1553132.sorted.bam)
# Rename the columns of countData using sample names in sampleTable
colnames(countData1) <- rownames(sampleTable1)
dxd1<-DEXSeqDataSet(countData=countData1,
sampleData=sampleTable1,
design=~sample+exon+condition:exon,
featureID=new.featureID,
groupID=anno$Symbol,
featureRanges=PASinfo)
dxd1$condition<- factor(dxd1$condition, levels= c("WT", "DKO"))
# The contrast pair will be "DKO - WT"
dxd1 %<>% estimateSizeFactors %>% estimateDispersions %T>% plotDispEsts
dxd1 %<>% testForDEU %>% estimateExonFoldChanges(fitExpToVar = "condition")
dxr1 <- DEXSeqResults(dxd1)
dxr1
mcols(dxr1)$description
table(dxr1$padj<0.1) # Check the number of differential pA sites (FDR < 0.1)
table(tapply(dxr1$padj<0.1, dxr1$groupID, any)) # Check the number of gene overlapped with differential pA site
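The second `table()` call counts genes that contain at least one significant pA site. A minimal Python sketch of that per-gene summary is shown below with made-up adjusted p-values. The function name is hypothetical, and missing values are treated as not significant here, which differs slightly from R, where a gene whose sites are only NA or non-significant yields NA and is dropped from the table.

```python
from collections import defaultdict


def genes_with_significant_site(group_ids, padj, fdr=0.1):
    """Map each gene to True if any of its pA sites has padj < fdr (None counts as not significant)."""
    flags = defaultdict(bool)
    for gene, p in zip(group_ids, padj):
        flags[gene] = flags[gene] or (p is not None and p < fdr)
    return dict(flags)


# Hypothetical adjusted p-values for two pA sites in each of two genes.
flags = genes_with_significant_site(
    ["Tpm1", "Tpm1", "Smc6", "Smc6"],
    [0.01, 0.5, 0.3, None],
)
n_significant_genes = sum(flags.values())
print(flags, n_significant_genes)
```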
# Select the top 100 significant differential pA sites ranked by FDR
topdiff.PAS<- dxr1%>%as.data.frame%>%rownames_to_column%>%arrange(padj)%$%groupID[1:100]
# Apply plotDEXSeq for the visualization of differential polyA usage
plotDEXSeq(dxr1,"S100a7a", legend=TRUE, expression=FALSE,splicing=TRUE, cex.axis=1.2, cex=1.3,lwd=2)
# Apply perGeneQValue to check the top genes with differential polyA site usage
dxr1%<>% .[!is.na(.$padj), ]
dgene<- data.frame(perGeneQValue= perGeneQValue (dxr1)) %>%rownames_to_column("groupID")
dePAS_sig1<-dxr1%>% data.frame() %>%
dplyr::select(-matches("dispersion|stat|countData|genomicData"))%>%
inner_join(dgene)%>%arrange(perGeneQValue)%>%distinct()%>%
filter(padj<0.1)
# Apply EnhancedVolcano package to visualise differential polyA site usage
"EnhancedVolcano"%>% lapply(library, character.only=TRUE) %>%invisible
EnhancedVolcano(dePAS_sig1, lab=dePAS_sig1$groupID, x='log2fold_DKO_WT',
y='pvalue',title='Volcano Plot',subtitle='DKO vs WT',
FCcutoff=1,labSize=4, legendPosition="right",
caption= bquote(~Log[2]~"Fold change cutoff, 1; FDR 10%"))
contrast.matrix <- makeContrasts(DKO_vs_WT=DKO-WT, Ctrl_vs_DKO=Ctrl-DKO,
KD_vs_Ctrl=KD-Ctrl, KD_vs_DKO=KD-DKO, levels=design)
fit2<-fit%>%contrasts.fit(contrast.matrix)%>%eBayes
summary(decideTests(fit2))
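Each named contrast above is a vector of weights over the design columns, with +1 for the first group and -1 for the second. The Python sketch below builds those vectors for the four comparisons and applies one of them to made-up group means. The order of `levels` and the numeric values are assumptions for illustration only; in the R session the order follows the columns of `design`.

```python
def make_contrast(levels, numerator, denominator):
    """Contrast vector over design columns: +1 for numerator, -1 for denominator, 0 elsewhere."""
    for name in (numerator, denominator):
        if name not in levels:
            raise ValueError(f"unknown group: {name}")
    return [1 if lv == numerator else -1 if lv == denominator else 0 for lv in levels]


def contrast_value(contrast, group_means):
    """Apply a contrast to per-group mean log2 expression values."""
    return sum(c * m for c, m in zip(contrast, group_means))


levels = ["WT", "DKO", "KD", "Ctrl"]  # assumed column order
contrasts = {
    "DKO_vs_WT": make_contrast(levels, "DKO", "WT"),
    "Ctrl_vs_DKO": make_contrast(levels, "Ctrl", "DKO"),
    "KD_vs_Ctrl": make_contrast(levels, "KD", "Ctrl"),
    "KD_vs_DKO": make_contrast(levels, "KD", "DKO"),
}
# Hypothetical mean log2 expression of one pA site in each group.
dko_vs_wt = contrast_value(contrasts["DKO_vs_WT"], [5.0, 6.5, 6.0, 6.4])
print(contrasts, dko_vs_wt)
```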
ex<-diffSplice(fit2,geneid=anno$Symbol,exonid=new.featureID)
topSplice(ex) #Check the top significant results with topSplice
sig1<-topSplice(ex,n=Inf,FDR=0.1,coef=1, test="t", sort.by="logFC")
sig1.genes<-topSplice(ex,n=Inf,FDR=0.1,coef=1, test="simes")
plotSplice(ex, coef=1,geneid="S100a7a", FDR = 0.1)
plotSplice(ex,coef=1,geneid="Tpm1", FDR = 0.1)
plotSplice(ex,coef=1,geneid="Smc6", FDR = 0.1)
EnhancedVolcano(sig1, lab=sig1$GeneID,xlab= bquote(~Log[2]~'fold change'),
x='logFC', y='P.Value', title='Volcano Plot', subtitle='DKO vs WT',
FCcutoff=1, labSize=6, legendPosition="right")
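After running the above step-by-step workflow, the AS and APA analysis outputs and representative results are generated in the form of tables and data plots, as follows.
AS:
The main output of the AS analysis (Supplementary Table 1 for diffSplice; Table 2 for DEXSeq) is a list of exons showing differential usage across conditions, and a list of genes showing significant overall splicing activity of one or more of their constituent exons, ranked by statistical significance. Supplementary Table 1, Tab 2 shows the significant exons...
In this study, we evaluated exon-based and event-based approaches to detect AS and APA in bulk RNA-Seq and 3' end sequencing data. The exon-based AS approaches produce both a list of differentially expressed exons and a gene-level ranking ordered by the statistical significance of the overall gene-level differential splicing activity (Tables 1-2, 4-5). For the diffSplice package, differential usage is determined by fitting weighted linear models at the exon level to estimate the differential log fold change of an exon against the average of the other exons within the same gene...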
The authors have nothing to disclose.
This study was supported by an Australian Research Council (ARC) Future Fellowship (FT16010043) and the ANU Futures Scheme.
Name | Company | Catalog Number | Comments |
Not relevant for computational study |