RNAseq tutorial – part 4 – Differential expression analysis with DESeq2

Here I use DESeq2 to perform differential gene expression analysis. I use a count table as input and output a table of significantly differentially expressed genes. I also show PCA and dispersion-plot QC of the RNA-seq data.

The output data can be further manipulated and explored in R, Python, or Excel. For example, you can extract positively enriched genes and sort them by log fold change. You can also use the Ensembl identifiers directly in gene ontology analysis. In future videos I will show how to convert Ensembl IDs to gene symbols and how to create heatmaps and other useful figures.
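As a rough sketch of that kind of downstream filtering in Python, assuming the results table was exported to CSV with DESeq2's standard column names (`log2FoldChange`, `padj`). The gene IDs, the numbers, and the 0.05 cutoff are invented for illustration, and `upregulated_genes` is a hypothetical helper name:

```python
import csv
import io

# Toy stand-in for an exported DESeq2 results table (all values are invented).
RESULTS_CSV = """gene_id,baseMean,log2FoldChange,lfcSE,stat,pvalue,padj
geneA,150.2,2.5,0.4,6.25,4.1e-10,1.2e-08
geneB,80.0,-1.8,0.5,-3.6,3.2e-04,2.0e-03
geneC,45.1,0.9,0.3,3.0,2.7e-03,0.04
geneD,300.7,3.1,1.9,1.63,0.10,0.25
geneE,12.3,1.2,0.6,2.0,0.045,NA
"""

def upregulated_genes(csv_text, padj_cutoff=0.05):
    """Return (gene_id, log2FoldChange) pairs with padj < cutoff and a positive
    log2 fold change, sorted from largest to smallest fold change."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["padj"] in ("NA", ""):  # padj can be NA for filtered genes
            continue
        lfc = float(row["log2FoldChange"])
        if float(row["padj"]) < padj_cutoff and lfc > 0:
            hits.append((row["gene_id"], lfc))
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

top_up = upregulated_genes(RESULTS_CSV)
print(top_up)
```

With a real export you would read the file from disk instead of the embedded string; the filtering and sorting logic stays the same.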

The samples are normal human control cells and replicative senescent cells from NCBI GEO accession GSE171663.

DESeq2 citation:
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):550. doi: 10.1186/s13059-014-0550-8. PMID: 25516281; PMCID: PMC4302049.

Sanbomics

Comments

Thank you!!! This is the best DESeq2 tutorial so far. It's easy to follow and every step makes sense. I am sure many others are benefiting and will benefit from you! I hope you are having a wonderful day or night wherever you are! Thanks a lot!!!

khawa

Thanks mate, all the tutorials in this series have been top notch. They go at a great pace and I appreciate that you explain pretty much everything you're doing

SlugmaB

I found your video as a diamond in the pile of superb bro... Thanks a lot.

joyhoskeri

Hey mate,
I am getting this error.
I followed all your steps except the filter one.
Please help.
I have 5 columns in my matrix (M10, M11, M12, M3 and M5) with Ensembl gene IDs.
The dds step is not working:
Error in checkForExperimentalReplicates(object, modelMatrix) :

The design matrix has the same number of samples and coefficients to fit,
so estimation of dispersion is not possible. Treating samples
as replicates was deprecated in v1.20 and no longer
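
The message says the design has as many coefficients as samples. One possible cause (the comment does not show the sample table, so this is only a guess) is a condition column in which every sample carries its own label. A minimal Python sketch of that sanity check for a `~condition` design; `check_replicates` and both labelings below are hypothetical:

```python
from collections import Counter

def check_replicates(conditions):
    """Report group sizes and whether a ~condition design leaves any
    residual degrees of freedom (samples minus condition levels)."""
    sizes = Counter(conditions)
    n_samples = len(conditions)
    n_coefficients = len(sizes)  # one coefficient per level for ~condition
    return {
        "group_sizes": dict(sizes),
        "residual_df": n_samples - n_coefficients,
        "dispersion_estimable": n_samples > n_coefficients,
    }

# Hypothetical labeling: every sample is its own condition, so nothing replicates.
bad = check_replicates(["M10", "M11", "M12", "M3", "M5"])
# Hypothetical labeling: the same five samples assigned to two groups.
good = check_replicates(["treated", "treated", "treated", "control", "control"])
print(bad["dispersion_estimable"], good["dispersion_estimable"])
```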

ParthShah-hcpw

Please tell me how to collect sample data for GSE99816 and how to know which samples are normal and which are diseased. (I used GEOquery, but it didn't contain this information.) Please help me, sir.

sanjaisrao

Hi, informative video. I want to know how to deal with non-integer data that contains decimals. The data matrix function doesn't work on such a data set.
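
DESeq2 expects raw integer counts. If the decimals are estimated counts from a quantification tool, rounding is a common workaround; if they are already normalized values such as FPKM or TPM, rounding does not turn them back into counts. A small Python sketch, assuming the first case, that locates the offending cells and rounds them (the matrix and both helper names are made up):

```python
def non_integer_cells(matrix):
    """Return (row, col, value) for every entry that is not a whole number."""
    return [
        (i, j, value)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if float(value) != int(value)
    ]

def round_counts(matrix):
    """Round every entry to an integer. Python's round() sends exact .5 ties
    to the nearest even integer."""
    return [[int(round(value)) for value in row] for row in matrix]

# Toy matrix: rows are genes, columns are samples (values are invented).
estimated = [[10.0, 3.7], [0.2, 25.0]]
offenders = non_integer_cells(estimated)
rounded = round_counts(estimated)
print(offenders)
print(rounded)
```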

ZahidHussain-xbit

Hi, Nice tutorial!!! Thank you so much. May I ask how to compare 3 or more groups with different sample sizes?

florawang

Thanks for this video. Please how can I reach out to you?

adekunleajiboye

Hello. I am getting an error. While running the DESeqDataSetFromMatrix function, an error pops up

Error in DESeqDataSet(se, design = design, ignoreRank) :
'design' should be a formula or a matrix

Can you tell me how to solve this issue? My dataset consists of 8 columns (4 cancer + 4 normal samples).

ragnulf_gamer

Thank you Mark for your all informative videos. Is there a way to produce RPKM/FPKM and TPM values from DESeq2 library and what’s the easiest way to obtain gene length?
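
DESeq2 does ship `fpm()` and `fpkm()` helpers, the latter needing gene lengths attached to the dataset. The arithmetic itself is short, so here is a hedged Python sketch of the usual definitions. Gene lengths normally come from the annotation (for instance the `Length` column that featureCounts writes, or summed exon widths from the GTF); the counts and lengths below are invented, and whether to use full or effective lengths is a choice this sketch does not make for you:

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of gene per million mapped fragments, one sample."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalize first, then scale to 1e6."""
    rpk = [c / (length / 1000) for c, length in zip(counts, lengths_bp)]
    scale = sum(rpk) / 1e6
    return [r / scale for r in rpk]

# One sample, three genes (toy numbers).
counts = [100, 200, 700]
lengths_bp = [1000, 2000, 7000]
sample_fpkm = fpkm(counts, lengths_bp)
sample_tpm = tpm(counts, lengths_bp)
print(sample_fpkm)
print(sample_tpm)
```

In this toy sample every gene has the same read density, so all three genes get the same FPKM and the same TPM, and the TPM values sum to one million.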

ashwaqkhaled

Hi, sir.
Thank you for your video.
Just a quick question. Once we run the DESeq(dds) function, are the generated results based on normalized data? After you run res = results(dds, contrast = c("condition", "S", "C")), you get 7 columns including log2FoldChange. Is this log2FoldChange calculated from normalized data? Of course, we can get normalized data using estimateSizeFactors(dds) followed by counts(dds, normalized=T). But before that code, we just run DESeq(dds) and then extract the results.
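
For context, DESeq() estimates size factors itself as its first step and the fitted model accounts for them, so the reported fold changes already reflect library-size normalization. The default median-of-ratios calculation can be sketched in plain Python as below. This is a simplified illustration on invented counts, not DESeq2's own code, and `size_factors` is a hypothetical name:

```python
import math
from statistics import median

def size_factors(counts):
    """counts: list of gene rows, each a list of per-sample raw counts.
    Returns one median-of-ratios size factor per sample."""
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        if any(c == 0 for c in row):
            continue  # a zero count has no finite log, so the gene is skipped
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

# Two samples; the second was sequenced twice as deeply (toy data).
raw = [[10, 20], [100, 200], [50, 100], [0, 7]]
sf = size_factors(raw)
normalized = [[c / s for c, s in zip(row, sf)] for row in raw]
print(sf)
print(normalized[0])
```

After dividing by the size factors, the first three genes have equal values in both samples, which is what counts(dds, normalized=T) aims for when the only difference is sequencing depth.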

freezingtolerance

Could you further clarify how you would pick a threshold for row sums? I'm not sure what you meant by "filter their end result by their mean".
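
The DESeq2 vignette suggests a minimal pre-filter that keeps genes with at least 10 reads in total, mainly to save memory and time; results() later applies its own independent filtering based on the mean of normalized counts, which may be what the quoted remark refers to. A Python sketch of the row-sum pre-filter, with the threshold exposed as a parameter (gene names and counts are invented, `prefilter` is a hypothetical helper):

```python
def prefilter(counts_by_gene, min_total=10):
    """Keep genes whose counts summed over all samples reach min_total."""
    return {
        gene: counts
        for gene, counts in counts_by_gene.items()
        if sum(counts) >= min_total
    }

# Toy count table: gene -> counts in four samples.
table = {
    "geneA": [0, 1, 0, 2],       # total 3, dropped
    "geneB": [5, 3, 1, 1],       # total 10, kept
    "geneC": [120, 98, 143, 110],
    "geneD": [0, 0, 0, 0],
}
kept = prefilter(table)
print(sorted(kept))
```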

adampassman

Informative video. Thanks.
I have a query regarding data analysis, if you could please help me with it. I downloaded a data set for two tumor types from a cancer data portal, so I now have gene expression data and clinical data for both tumors. I want to compare the gene expression of the two tumors, but I don't know where to start. How can I compare these tumors using DESeq2? Please guide me. Thank you.

munibabashir

Hi, could you please help me with how to filter out only the protein-coding genes? Thank you.
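
One way to do this is to look up each gene's biotype in the annotation (Ensembl GTF files carry it in the `gene_biotype` attribute) and keep only the rows labeled `protein_coding`. A Python sketch under that assumption; the gene IDs, the counts, and the lookup table are placeholders, not real annotation:

```python
def keep_biotype(counts_by_gene, biotype_by_gene, wanted="protein_coding"):
    """Keep genes whose annotated biotype equals `wanted`.
    Genes missing from the annotation are dropped."""
    return {
        gene: counts
        for gene, counts in counts_by_gene.items()
        if biotype_by_gene.get(gene) == wanted
    }

# Placeholder IDs and biotypes; a real lookup would be parsed from the GTF.
counts = {"gene1": [10, 12], "gene2": [300, 280], "gene3": [7, 9], "gene4": [1, 0]}
biotypes = {"gene1": "protein_coding", "gene2": "lncRNA", "gene3": "protein_coding"}
coding_only = keep_biotype(counts, biotypes)
print(sorted(coding_only))
```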

rushonline

Hi, thanks for the useful tutorial. How do we convert the results (differential expression table) back into a dds (DESeq output) object, so that we can apply the padj cut-off along res -> dds -> vsdata? Or is there another way to get a dds with the padj cut-off applied? Thank you.

anandhakumarchandran

Hi. I am really thankful for your videos. At the moment I am in a pickle. I looked up the results() function man page since I am a bit confused about the "contrast" argument. The confusion comes from the fact that I have 3 types of samples, not just "S" and "C". Either "contrast" is one vector with exactly 3 elements, as in the video, or (and here comes the confusion) it is 2 vectors, one with the names of the fold changes for the numerator and one with the names of the fold changes for the denominator. What are these? The third option is "a numeric contrast vector with one element for each element in resultsNames(object)" (the most general case). Should I use the 2nd or the 3rd option? And what do the numerator and denominator mean here? Thank you.
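
On the numerator and denominator: in the three-element form, contrast = c("condition", "B", "A") asks for the log2 fold change of B relative to A, roughly log2(B / A), and with three groups you would typically call results() once per pair of interest. The Python sketch below only illustrates the direction of the ratio using plain group means on invented values; DESeq2 itself estimates fold changes from a fitted model, not from raw means, and the group names here are hypothetical:

```python
import math

def naive_log2fc(values_by_group, numerator, denominator):
    """log2(mean of numerator group / mean of denominator group)."""
    def mean(values):
        return sum(values) / len(values)
    return math.log2(mean(values_by_group[numerator]) / mean(values_by_group[denominator]))

# Normalized expression of one gene in three hypothetical groups.
gene = {"A": [100, 100], "B": [200, 200], "C": [400, 400]}

b_vs_a = naive_log2fc(gene, "B", "A")   # B is the numerator
c_vs_a = naive_log2fc(gene, "C", "A")
a_vs_b = naive_log2fc(gene, "A", "B")   # swapping the levels flips the sign
print(b_vs_a, c_vs_a, a_vs_b)
```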

hatchet

What if I have 3 conditions instead of 2? When I try to run res <- results (and etc.) I get an error saying "Error in checkContrast(contrast, resNames) : 'contrast', as a character vector of length 3, should have the form: contrast = c('factorName', 'numeratorLevel', 'denominatorLevel'), see the manual page of ?results for more information"

mirij

Help me.
My code: dds <- DESeqDataSetFromMatrix(countData = counts, colData = coldata, design = ~condition)
And this error: Error in DESeqDataSet(se, design = design, ignoreRank) : some values in assay are not integers
Why?

MM-fjym

Hi there!
That's really the best Deseq2 tutorial I have seen so far, thank you very much!!
I have one question: I ran the first command that includes the header and row.names (row.names =1) but I get the following error message:
"Error in read.table(file = file, header = header, sep = sep, quote = quote, :
duplicate 'row.names' are not allowed"
I read a lot of sites that suggest to null the row.names but that is not a good idea for my data.
Have you ever encountered this error? Do you have any recommendations?
Thanks in advance!
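
A first step that usually helps is to find out which identifiers are repeated and why. The Python sketch below lists duplicated IDs and, as one possible resolution, sums their counts. Whether summing is appropriate depends on the cause of the duplication, which the comment does not state, so treat it as an assumption; the rows and helper names are made up:

```python
from collections import Counter

def duplicated_ids(rows):
    """rows: list of (gene_id, counts). Return the IDs that occur more than once."""
    tally = Counter(gene_id for gene_id, _ in rows)
    return sorted(gene_id for gene_id, n in tally.items() if n > 1)

def sum_duplicates(rows):
    """Collapse repeated IDs by adding their per-sample counts together."""
    merged = {}
    for gene_id, counts in rows:
        if gene_id in merged:
            merged[gene_id] = [a + b for a, b in zip(merged[gene_id], counts)]
        else:
            merged[gene_id] = list(counts)
    return merged

# Toy table with one repeated identifier.
rows = [("geneA", [5, 7]), ("geneB", [10, 0]), ("geneA", [1, 3])]
dups = duplicated_ids(rows)
collapsed = sum_duplicates(rows)
print(dups)
print(collapsed)
```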

julieapostolou

Hi, thanks for this informative video on DESeq2. I have been stuck for a while on the input data matrix before running DESeq2. I can see that my gene identifier column automatically becomes the first column when I arrange the condition and coldata (the column data for my htseq read counts) into matrix format. Can you suggest how to fix it? Thanks!

diyabhattacharya
