R Tutorial: Introduction to Data Visualization with ggplot2

---

Hi and welcome to the first course in DataCamp's data visualization with ggplot2 series!

My name is Rick Scavetta and I'll be the instructor for this series.

I've been training scientists on how to better understand and visualize their data since 2012. I'm very excited to bring my experience to DataCamp.

So what is data viz?

Data visualization is an essential skill for data scientists. It combines statistics and design in meaningful and appropriate ways.

On the one hand, data viz is a form of graphical data analysis, emphasizing accurate representation and interpretation of data.

On the other hand, data viz relies on good design choices, not only to make our plots attractive, but also to aid both the understanding and communication of results.

On top of that, there is an element of creativity, since at its heart, data viz is a form of visual communication.

It's important to understand the distinction between exploratory and explanatory visualizations.

Exploratory visualizations are easily generated, data-heavy and intended for a small specialist audience, for example yourself and your colleagues. Their primary purpose is graphical data analysis.

Explanatory visualizations are labor-intensive, data-specific and intended for a broader audience, e.g. in publications or presentations - they are part of the communications process.

As a data scientist, it's essential that you can quickly explore data, but you'll also be tasked with explaining your results to stakeholders.

Good design begins with thinking about the audience - and sometimes that just means ourselves.

This data set contains the average brain and body weights of 62 land mammals. To understand the relationship here, the most obvious first step is to make a scatter plot, like this one.

Two mammals, the African and Asian elephants, both have very large brain and body weights, leading to a positive skew on both axes.
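To make this concrete, here is a minimal sketch of that first exploratory plot. It assumes the data set described is the `mammals` data frame that ships with the MASS package (which matches the description of 62 land mammals), and that MASS and ggplot2 are installed. The variable names `heaviest` and `p_raw` are just illustrative.

```r
# Assumption: the data set described is MASS::mammals
# (body weight in kg, brain weight in g, species as row names).
library(ggplot2)

mammals <- MASS::mammals
n_species <- nrow(mammals)

# The two heaviest species sit far away from the rest of the data.
heaviest <- rownames(mammals)[order(mammals$body, decreasing = TRUE)[1:2]]

# A quick exploratory scatter plot on the raw scale.
p_raw <- ggplot(mammals, aes(x = body, y = brain)) +
  geom_point() +
  labs(x = "Body weight (kg)", y = "Brain weight (g)")

# print(p_raw) draws the plot in an interactive session.
```

On the raw scale, most of the points end up squeezed into one corner of the plot, which is the skew described above.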

Here, applying a linear model is a poor choice since a few extreme values have a large influence.

A log transformation of both variables allows for a better fit.

So, although we began with a rough exploratory plot, it informed us about our data and led us to a meaningful result.

In the end, we'd probably want a cleaned-up explanatory plot.
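As a rough sketch of that workflow, the code below fits a linear model to the raw values and to the log-transformed values, then builds a labelled plot on log scales. It again assumes the data are `MASS::mammals` and that ggplot2 is installed; the object names are illustrative, and this is one possible way to do it rather than the exact code behind the plots in this course.

```r
# Assumption: the data set described is MASS::mammals.
library(ggplot2)

mammals <- MASS::mammals

# Linear model on the raw scale versus the log10-log10 scale.
fit_raw <- lm(brain ~ body, data = mammals)
fit_log <- lm(log10(brain) ~ log10(body), data = mammals)

# R-squared values for the two fits. The responses are on different
# scales, so treat the comparison as a rough indication only.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared

# Slope of the log-log fit.
slope_log <- coef(fit_log)[["log10(body)"]]

# A cleaner plot: log scales on both axes, a fitted line, readable labels.
# With log scales, geom_smooth() fits the line to the transformed values.
p_log <- ggplot(mammals, aes(x = body, y = brain)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_x_log10() +
  scale_y_log10() +
  labs(
    title = "Brain weight versus body weight in 62 land mammals",
    x = "Body weight (kg, log scale)",
    y = "Brain weight (g, log scale)"
  )

# print(p_log) draws the plot in an interactive session.
```

Comparing `r2_raw` with `r2_log`, and looking at the residuals of each fit, is a quick way to check that the transformed model describes the bulk of the data better.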

Here's a classic example from Francis Anscombe, first published in 1973.

When we imagine a linear model, as presented on this anonymous plot, we imagine that we are describing data that looks something like this. But this same model could be describing a very different set of data, such as a parabolic relationship, which calls for a different model.

Or it could be describing data in which an extreme value has a large effect, which becomes clear when the outlier is removed.

And sometimes the model may be describing a relationship where in fact there is none at all, because some extreme values may be incorrect.

If we relied solely on the numerical output without plotting our data, we'd have missed distinct and interesting underlying trends.
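You can check this yourself. R ships with Anscombe's four data sets as the built-in `anscombe` data frame (columns `x1` to `x4` and `y1` to `y4`). The sketch below fits the same linear model to each set and then plots all four side by side; it assumes ggplot2 is installed, and the names `fits`, `coefs`, `long` and `p_quartet` are illustrative.

```r
library(ggplot2)

# Fit y_i ~ x_i for each of the four Anscombe data sets.
fits <- lapply(1:4, function(i) {
  lm(reformulate(paste0("x", i), response = paste0("y", i)), data = anscombe)
})

# Collect intercepts and slopes: one row per data set.
coefs <- t(sapply(fits, coef))
colnames(coefs) <- c("intercept", "slope")
rownames(coefs) <- paste("Set", 1:4)

# Reshape to long format so each set gets its own panel.
long <- do.call(rbind, lapply(1:4, function(i) {
  data.frame(
    set = paste("Set", i),
    x = anscombe[[paste0("x", i)]],
    y = anscombe[[paste0("y", i)]]
  )
}))

# Same fitted line in every panel, very different data underneath.
p_quartet <- ggplot(long, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ set)

# print(coefs) shows the near-identical coefficients;
# print(p_quartet) draws the four panels in an interactive session.
```

The coefficient table looks almost the same for all four sets, and only the plot reveals how different they are.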

We can see that data viz is rooted in statistics and graphical data analysis, but it's also a creative process that involves some amount of trial and error.

Alright, enough examples, let's get our fingers moving with some exercises.
