Using the vegan R package to generate ecological distances (CC188)

The vegan R package has a powerful set of functions for calculating the ecological distance between communities. In this episode, Pat shares how to get your data into the right format to use vegdist and avgdist before analyzing the distances with NMDS. He discusses using rarefaction with avgdist to control for uneven sampling effort, since the Bray-Curtis dissimilarity index is sensitive to it. We'll use the metaMDS function from vegan and tools from ggplot2 and the other tidyverse packages.

#vegdist #avgdist #vegan #ggplot2 #R #Rstudio #Rstats

You can also find complete tutorials for learning R with the tidyverse using...

0:00 Calculating ecological distances with the vegan R package
2:08 Preparing matrix of sample by taxa counts
10:23 Calculating distances using vegdist
12:27 Using community matrix directly in metaMDS
13:18 Rarefying distance calculations using avgdist
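
The chapters above can be sketched roughly as follows. This is a minimal sketch on simulated counts, not the data or exact code from the video; the object names (`counts`, `bray`, `rarefied_dist`) and the choice of the smallest sample total as the rarefaction depth are assumptions made for illustration.

```r
library(vegan)

set.seed(1)

# Simulated (hypothetical) community matrix: 12 samples x 20 taxa,
# with deliberately uneven sequencing depth across samples
n_taxa <- 20
depths <- seq(2000, 7500, by = 500)
counts <- t(sapply(depths, function(d) {
  as.vector(rmultinom(1, size = d, prob = runif(n_taxa)))
}))
rownames(counts) <- paste0("sample", seq_along(depths))
colnames(counts) <- paste0("Otu", seq_len(n_taxa))

# Bray-Curtis dissimilarity straight from the counts (sensitive to depth)
bray <- vegdist(counts, method = "bray")

# The community matrix can also go directly into metaMDS
nmds_direct <- metaMDS(counts, distance = "bray",
                       autotransform = FALSE, trace = 0)

# Rarefy to a common depth many times and average the distances
rarefied_dist <- avgdist(counts, sample = min(rowSums(counts)),
                         dmethod = "bray", iterations = 100)

# Ordinate the averaged distances
nmds <- metaMDS(rarefied_dist, trace = 0)
head(scores(nmds, display = "sites"))
```

In this sketch every sample survives rarefaction because the depth is set to the smallest sample total; with a larger `sample` value, `avgdist` drops samples that fall below it.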

Comments

Great channel! I've been trying for months to learn some of these techniques from scattered sources and you're really helping me make sense of the mess of lessons I've tried to wrap my head around.

overcup

Thank you so much for the great channel! 💙💙 Your videos are super helpful... simply awesome 😃

hebaahmed-tqqf

thanks for showing vegan <3 I love this channel
Learning every second day with you 100% guaranteed

Yesterday I learned this trick:
df %>%
  mutate(day = str_replace(Group, ".D*", ""), .before = 2)

which puts the mutated column in a designated position; in the example above, position 2, just in front of the "old" column 2,
so you don't need those select(1, 2, everything()) lines anymore

svenr
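
The `.before` trick described above can be tried on a toy table. This is a small sketch, assuming dplyr 1.0.0 or later; the sample names in the style of "F3D0" and the `".*D"` pattern are hypothetical stand-ins, not taken from the comment or the video.

```r
library(dplyr)
library(stringr)

# Hypothetical sample table: Group encodes an animal ID and a day, e.g. "F3D141"
df <- tibble(
  Group  = c("F3D0", "F3D1", "F3D141"),
  Otu001 = c(10, 4, 7),
  Otu002 = c(0, 3, 9)
)

# Strip everything up to and including the "D", then place the new column
# in front of the current second column instead of at the end
with_day <- df %>%
  mutate(day = as.numeric(str_replace(Group, ".*D", "")), .before = 2)

names(with_day)
```

Without `.before`, `mutate()` would append `day` after `Otu002`, and a follow-up `select()` would be needed to reorder the columns.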

Very informative!! Thank you!! I usually assign a name to the object at the end of the dplyr pipeline with `%>% as.data.frame(.) -> new_object`, but I know it is a little weird :)

igordemetriusalencar

I am so glad I found this video.. <3

unavaliableavaliable

For an alternative to the usual rarefaction method, take a look at the SRS function in the SRS package. 1. Beule L, Karlovsky P. Improved normalization of species count data in ecology by scaling with ranked subsampling (SRS): application to microbial communities. PeerJ. 2020;8:e9593.

johnquensen

This is a great overview of using vegan for calculating distances and plotting them. Some nice additions (if you don't already have them planned) would be to show how to pull out which variables (or species) are driving the spread on the plot and how to add that data to the plot. You mentioned that the different clouds pertained to different days, so I'm assuming you're going to discuss that in another video.

samprice

Just a note: we handle data frames of abundance data just fine in vegan's community ecology functions, including `vegdist()`. The only restriction is that you have to get rid of meta data (the `Group` column in Pat's data) from the data frame just like Pat showed in the video. You just don't need to do the last step of converting to a matrix.

ftboth
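
The point above about data frames can be checked with a small comparison. This is a sketch on a made-up three-sample table; the column names mimic a `Group` column followed by OTU counts and are assumptions for illustration.

```r
library(vegan)

# Hypothetical abundance table with a metadata column called Group
shared_df <- data.frame(
  Group  = c("A", "B", "C"),
  Otu001 = c(10, 0, 5),
  Otu002 = c(3, 8, 5),
  Otu003 = c(0, 2, 5)
)

# Drop the metadata column, keeping the sample names as row names
counts_df <- shared_df[, -1]
rownames(counts_df) <- shared_df$Group

# vegdist on the data frame and on the explicitly converted matrix
d_df  <- vegdist(counts_df, method = "bray")
d_mat <- vegdist(as.matrix(counts_df), method = "bray")

all.equal(as.vector(d_df), as.vector(d_mat))
```

Both calls give the same distances, so the `as.matrix()` step is optional as long as only numeric abundance columns remain.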

Hi Pat, thank you so much for your videos! They are always very complete and didactic.

I would like to ask a question, is it possible to calculate the Bray-Curtis similarity and then build a dendrogram using ggplot2? Could you make a video on how to build a BC similarity dendrogram?

viniciusestrella

Great explanation!! It will be awesome if you can reduce the talk speed a bit though...

samadhigunathunga

Hi Pat! This was super helpful. I've performed rarefaction on my data using rrarefy in vegan and looked at alpha diversity of particular samples, but I still want to calculate the distance between some samples. Should I run avgdist on my original data to calculate the distance between ALL samples, then run metaMDS on just the samples I'm interested in? Or should I run avgdist on just the samples I'm interested in? Also, is it improper to rarefy using rrarefy to look at alpha diversity and then rarefy again to look at beta diversity? Should I be using the same rarefied data for both analyses?! Sorry for all the questions! I'm new to microbiome analysis

bridget

Hi Pat, thank you for sharing! When analyzing for group differences in distances, do you always test for dispersion effects afterwards? will there be a video about this in the future?

Rydaholic

Very helpful demo.. just wanted to clarify something. Why did you take sample=1800 at 14:36??

vikashiremath

Hey Pat great video and thanks for all your work on this channel. I am having an issue once I arrive at the `scores( nmds )` line. I get an error that states the following: "Error in x$species[, choices, drop = FALSE] :
incorrect number of dimensions". Have you or anybody else encountered this?

chrismaino

Hi Pat! Thanks so much for the videos, I've just recently discovered your channel and it's been incredibly helpful for my learning process.

I'm wondering if you could clarify the need to calculate a distance matrix before running NMDS? I have a species assemblage dataset from an underwater visual census (UVC). My data has a ton of zeroes and, just like yours, a lot of columns (species). I've run NMDS both without calculating the vegdist (+ automatic transformations) and with vegdist. They look similar but not the same, so I'm not sure which one to use for my publication. Why would you advise me against using the plot without prior calculation of the distance matrix?

Also, seems like my data has a high stress (>0.2) when run with k=2. If I run it with k=3, should I be presenting the figure in 3D?

Thanks in advance!

Rinaldigotama

You really need to put `+ coord_equal()` or `+ coord_fixed()` on your ordination diagrams created by hand. The Euclidean distance on the plot is an approximation to some other distance (in NMDS, the rank order of the Euclidean distances on the plot is intended to closely approximate the rank order of the original distances between samples), and if you don't keep a fixed aspect ratio, this visual distance interpretation is broken

ftboth
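
The aspect-ratio advice above might look like this in a hand-built plot. This is a sketch only; the `ord` data frame holds made-up site scores and the `day` grouping is a hypothetical label.

```r
library(ggplot2)

# Hypothetical NMDS site scores for four samples
ord <- data.frame(
  NMDS1 = c(-0.4, -0.1, 0.2, 0.5),
  NMDS2 = c(0.1, -0.2, 0.3, -0.1),
  day   = c("early", "early", "late", "late")
)

# coord_fixed() keeps one unit on the x axis the same length as one unit
# on the y axis, so distances between points can be compared visually
p <- ggplot(ord, aes(x = NMDS1, y = NMDS2, color = day)) +
  geom_point() +
  coord_fixed()

print(p)
```

`coord_equal()` is an alias for `coord_fixed()`, so either works here.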

Dear professor Pat, I was just wondering if I can use a presence/absence data set for avgdist(). Wouldn't that be inappropriate as rarefaction is based on abundance data, not presence/absence?

wenyizhou

Hi Pat, thanks for the nice video! When I use nmds <- metaMDS(shared, autotransform = FALSE) and then scores(nmds), the output has both $sites (which is the Group here) and $species (OTUs). I cannot directly pipe it to ggplot. I wonder how you deal with it? Thanks!

guani
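
One possible way around the list output described above is to request only the site scores. This is a sketch on simulated counts, assuming the `display` argument of vegan's `scores()` method for metaMDS objects; the `counts` matrix and the `Group` column name are illustrative, not the data from the video.

```r
library(vegan)
library(tibble)
library(ggplot2)

set.seed(1)

# Simulated (hypothetical) community matrix: 12 samples x 15 taxa
counts <- matrix(rpois(12 * 15, lambda = 20), nrow = 12,
                 dimnames = list(paste0("S", 1:12), paste0("Otu", 1:15)))

nmds <- metaMDS(counts, autotransform = FALSE, trace = 0)

# Ask for the sample (site) scores only; this returns a plain matrix
# that converts cleanly to a tibble, with row names kept as a column
site_scores <- as_tibble(scores(nmds, display = "sites"), rownames = "Group")

p <- ggplot(site_scores, aes(x = NMDS1, y = NMDS2)) +
  geom_point() +
  coord_fixed()
```

Requesting `display = "sites"` also sidesteps the species scores entirely, which are not available when metaMDS is given a distance matrix rather than a community matrix.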

How can I build a dendrogram with bray curtis dissimilarity in R?

dr.ozgekahramanilkkan
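
One common route to the dendrogram asked about above is hierarchical clustering of the Bray-Curtis distances. This is a minimal sketch using base R's `hclust()` and base plotting rather than ggplot2; the simulated `counts` matrix and the choice of average linkage are assumptions for illustration.

```r
library(vegan)

set.seed(1)

# Simulated (hypothetical) community matrix: 8 samples x 10 taxa
counts <- matrix(rpois(8 * 10, lambda = 15), nrow = 8,
                 dimnames = list(paste0("S", 1:8), paste0("Otu", 1:10)))

# Bray-Curtis dissimilarities between samples
bray <- vegdist(counts, method = "bray")

# Average-linkage (UPGMA) clustering of the dissimilarities
hc <- hclust(bray, method = "average")

# Base R dendrogram; branch heights are Bray-Curtis dissimilarities
plot(hc, main = "Bray-Curtis dendrogram", xlab = "", sub = "",
     ylab = "Bray-Curtis dissimilarity")
```

The linkage method changes the shape of the tree, so it is worth stating which one was used when reporting the result.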
