EdgeR - Differential Gene Expression Analysis using RNA-seq data - R Tutorial

#EdgeR​ #RNAseq​ #DEG #plot #scatterplot #LogFC​​ #R​ #Bioinformatics​ #Bigdata​ #Datascience #English #USA #England #UK

edgeR is a Bioconductor package for R that tests for differential gene expression in count data from high-throughput sequencing assays such as RNA-Seq.

Follow the steps below to reproduce the analysis shown in the video.

Installation

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(version = "3.12")

BiocManager::install("edgeR")
library(edgeR)

setwd("D:/Post_PhD_Research/Expression Analysis/EdgeR")

load("mobData.RData")
head(mobData)
mobDataGroups <- c("MM", "MM", "WM", "WM", "WW", "WW")

# MM="triple mutant shoot grafted onto triple mutant root"
# WM="wild-type shoot grafted onto triple mutant root"
# WW="wild-type shoot grafted onto wild-type root"

data
head(data)
d <- DGEList(counts=mobData, group=factor(mobDataGroups))
d

Filtering the data
dim(d)

head(d$counts)
head(cpm(d))
apply(d$counts, 2, sum) # total gene counts per sample

keep <- rowSums(cpm(d) > 100) >= 2
d <- d[keep,]
dim(d)
d$samples
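
What the filter above does can be checked by hand. The sketch below uses base R only and a small made-up count matrix (the gene names, sample names and counts are hypothetical, not taken from mobData). It assumes that, before calcNormFactors() is run, cpm(d) amounts to dividing each column by its library size and multiplying by one million.

```r
# Toy counts: rows are genes, columns are samples (hypothetical values).
counts <- matrix(
  c(500,    600,    0,
    50,     40,     30,
    200,    0,      0,
    999250, 999360, 999970),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"), c("S1", "S2", "S3"))
)

lib.size <- colSums(counts)                    # total reads per sample
cpm.manual <- t(t(counts) / lib.size) * 1e6    # counts per million

# Keep genes with CPM above 100 in at least 2 samples.
keep <- rowSums(cpm.manual > 100) >= 2
filtered <- counts[keep, , drop = FALSE]
print(keep)
```

Here geneC is dropped because it passes the CPM threshold in only one sample, which is the kind of gene the two-sample requirement is meant to remove.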

Normalizing the data
d <- calcNormFactors(d)
d

Data Exploration

Estimating the Dispersion
d1 <- estimateCommonDisp(d, verbose=TRUE)
names(d1)

d1 <- estimateTagwiseDisp(d1)
names(d1)
names(d1)
plotBCV(d1)
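
For reading the plotBCV() output: the biological coefficient of variation (BCV) is the square root of the negative binomial dispersion, and under that model a count with mean mu has variance mu + dispersion * mu^2. A small base R illustration with made-up numbers (the dispersion and mean below are hypothetical, not estimates from mobData):

```r
dispersion <- 0.16          # hypothetical dispersion estimate
bcv <- sqrt(dispersion)     # 0.4, i.e. about 40% variation between replicates

mu <- 100                                 # hypothetical mean count
nb.variance <- mu + dispersion * mu^2     # negative binomial variance
poisson.variance <- mu                    # variance with no biological variation

print(c(bcv = bcv, nb.variance = nb.variance, poisson.variance = poisson.variance))
```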

GLM estimates of dispersion

plotBCV(d2)
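
The plotBCV(d2) call above uses an object d2 that this description never creates. One common way to build such an object in edgeR is with the GLM dispersion functions and a design matrix. The sketch below is an assumption about the missing step, not the author's code: estimate_glm_disp is a hypothetical helper name, the design with one column per group is an assumed choice, and the demo runs on simulated counts rather than mobData.

```r
# Hypothetical helper: GLM-based dispersion estimates for a DGEList.
estimate_glm_disp <- function(y, design) {
  y <- edgeR::estimateGLMCommonDisp(y, design)
  y <- edgeR::estimateGLMTrendedDisp(y, design)
  edgeR::estimateGLMTagwiseDisp(y, design)
}

# Design matrix with one column per group (assumed layout).
groups <- factor(c("MM", "MM", "WM", "WM", "WW", "WW"))
design.mat <- model.matrix(~ 0 + groups)
colnames(design.mat) <- levels(groups)

# Demo on simulated counts; skipped if edgeR is not installed.
if (requireNamespace("edgeR", quietly = TRUE)) {
  set.seed(1)
  toy.counts <- matrix(rnbinom(6000, mu = 50, size = 5), ncol = 6)
  toy <- edgeR::calcNormFactors(edgeR::DGEList(counts = toy.counts, group = groups))
  toy2 <- estimate_glm_disp(toy, design.mat)
  edgeR::plotBCV(toy2)
}
```

With the tutorial's own objects the equivalent call would be d2 <- estimate_glm_disp(d, design.mat), run before plotBCV(d2).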

Differential Expression
et12 <- exactTest(d1, pair=c(1,2)) # compare groups 1 and 2
et13 <- exactTest(d1, pair=c(1,3)) # compare groups 1 and 3
et23 <- exactTest(d1, pair=c(2,3)) # compare groups 2 and 3
topTags(et12, n=10)
de1 <- decideTestsDGE(et12) # assumed missing step: de1 was never defined (defaults: BH adjustment, p.value = 0.05)
summary(de1)
plotSmear(et12, de.tags=rownames(d1)[as.logical(de1)]) # assumed missing step: abline() needs an open plot
abline(h = c(-2, 2), col = "blue") # mark log fold changes of -2 and 2
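
The FDR column that topTags() prints is, by default, the Benjamini-Hochberg adjustment of the exact-test p-values. The same adjustment is available in base R as p.adjust(). A sketch with made-up p-values (hypothetical, not results from this analysis):

```r
p.values <- c(0.001, 0.01, 0.02, 0.04, 0.5)   # hypothetical raw p-values
fdr <- p.adjust(p.values, method = "BH")      # Benjamini-Hochberg adjustment
print(data.frame(PValue = p.values, FDR = fdr))
```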

Comments

Seriously? Can no one give a normal explanation? Like talking out loud to explain things instead of showing text in a Word file? I really don't understand the R community... For any other language you can find plenty of good tutorials explaining everything in detail. What is the problem with R? Please, somebody do something about it. 😭

someone_there

I wish there was sound instead of the music... Could you explain those plots?

scichores

Ohh.... I am a complete beginner and still clueless 😭😭😭😭😭

rereh-c

Hi, can you say which dataset you used and where I can find the paper related to that data?

HopeOverDebt

Hi. Why this music instead of the sound?

hediatnani

The GEO datasets analyzed by high-throughput sequencing don't have three replicates. In that case, how can we calculate the p-value and logFC value? Thanks in advance for your cooperation.

waqarali

I have some other queries regarding DEG analysis. I want to compare the differentially expressed genes of two datasets; how can I do that? For example, one dataset contains 108 DEGs and the other contains 70, and I want to see the genes common to both. How can I do that, and how can I make a Venn diagram of them? Also, in some GEO datasets the files are in tsv and txt format. How can I analyse that kind of file? Please help me with these two problems.

johirislam

Hello sir, can we use only two samples, one treated and one control, for differential gene expression analysis in edgeR? I have no replicates. Any suggestions?

animeshpattnaik

Sir, can you please suggest algorithms that work for microarray analysis?

kusumahosalli