R Tutorial: Proportions of a single population

---

Now that we're familiar with our dataset, let's explore a common type of data: the proportion, and specifically the proportions of a single population.

Whenever you have parts that add up to a whole, you have proportions.

Think about the surface of the earth. It is roughly 70% water and 30% land. When you add up all the land and water together, you get the whole of the planet's surface.
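As a rough sketch of that definition in base R: a set of values only qualifies as proportions of a single whole if none are negative and they sum to 1. The helper `is_proportion()` and the `counts` vector are hypothetical names and numbers made up for this illustration; they are not part of the course material.

```r
# Share of the Earth's surface, using the rough figures from the text
surface <- c(water = 0.70, land = 0.30)

# Hypothetical helper: parts are proportions of one whole only if they are
# non-negative and sum to 1 (within a small numerical tolerance)
is_proportion <- function(parts, tol = 1e-8) {
  all(parts >= 0) && abs(sum(parts) - 1) < tol
}

is_proportion(surface)       # TRUE: water and land make up the whole surface
is_proportion(c(0.5, 0.7))   # FALSE: these parts overshoot the whole

# Invented raw counts can be turned into proportions with prop.table()
counts <- c(a = 12, b = 30, c = 18)
shares <- prop.table(counts)
is_proportion(shares)        # TRUE
```

A check like this is a cheap guard before reaching for a proportion chart.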

This may seem like a trivial concept, but some of the most common mistakes in visualization come from using proportion visualization techniques on non-proportion data.

Now that we've clearly delineated what a proportion is, let's cover some of those techniques to visualize them!

The pie chart is often the first visualization people learn about. It is an intuitive way of representing proportions of a whole. Also, it gives teachers an excuse to bring pie to class.

Perhaps due to being so well known, pie charts have been abused. Examples with too many slices, parts that don't add to a whole, and even baffling use of 3D have given pie charts a bad rap.

I believe that pie charts have a place in your visualization toolkit, as long as they are used appropriately. But first, let's take a look at their flaws.

A major fault of pie charts is that they encode data in angles. If a class represents 25% of a whole, this is encoded by an angle of 90 degrees. Humans are much better at comparing lengths than angles, so this can be confusing.
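The angle encoding is simple arithmetic: a slice's central angle is its proportion times the 360 degrees of the full circle. A minimal sketch, where `slice_angle()` is a hypothetical helper written for this example:

```r
# Convert proportions into the central angle (in degrees) of each pie slice
slice_angle <- function(p) p * 360

slice_angle(0.25)            # 90: a quarter of the whole is a right angle

# Water and land shares from the Earth example
earth_angles <- slice_angle(c(water = 0.70, land = 0.30))
round(earth_angles)
```

Whatever the individual shares are, the angles of a valid proportion always add up to the full circle.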

Another issue is displaying a lot of classes. The first slice has a meaningful anchoring point, usually the vertical line at the top of the pie, but every slice after it starts at an arbitrary position. Viewers have to contend not only with angles, but with offset angles. More than 3 slices and things get dicey.

While the complaints against pie charts are largely fair, they do have their pros! As mentioned before they are very intuitive to people as they convey the concept of slices of a whole very well.

In addition, they are compact. This makes them easy to fit in crowded figures or reports.

In this pie chart of observations by region, it is easy to see that a bit more than half come from the African and European regions. The precise proportions might not be clear, but the general picture is painted well.
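A chart along these lines can be sketched with base R's `pie()`. The counts below are invented for illustration and are not the WHO data from the course; the course itself may well build its chart with a different package.

```r
# Hypothetical observation counts per region (invented for illustration,
# NOT the actual WHO data used in the course)
obs <- c(African = 30, European = 25, `Southeast Asian` = 6, Other = 39)

# Proportion of the whole for each region
shares <- prop.table(obs)

# Label each slice with its name and rounded percentage
slice_labels <- sprintf("%s (%.0f%%)", names(shares), 100 * shares)

# clockwise = TRUE starts the first slice at 12 o'clock, which gives it the
# vertical anchoring line discussed earlier
pie(obs, labels = slice_labels, clockwise = TRUE,
    main = "Observations by region (hypothetical data)")
```

Keeping the number of slices small, as here, is what keeps the chart readable.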

Say you need the precision that a pie chart can't give you, or you have more than three categories you want to show, or perhaps you simply have enough space and want to try something different.

In these scenarios, a waffle chart may be appropriate. A waffle chart is similar to a pie chart, but instead of encoding the proportions in angles, it encodes them as area, using small squares that often represent one percent each.

The squares afford high precision readings because they can be easily counted. If you need more precision, you can just add more squares.

The lack-of-baseline problem that plagues pie charts with more than three classes is also solved, as no baseline is needed to count squares. Of course, we should still be careful not to fit too many classes into our waffle charts, or the visualization becomes cluttered.

Here we can easily see that the Southeast Asian region accounts for roughly 6% of our observations.
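To make the one-square-per-percent idea concrete, here is a minimal base R sketch that builds a 10 x 10 waffle grid by hand. The shares are invented for illustration and are not the WHO data; dedicated waffle chart packages exist, and this is only one assumed way of drawing such a grid.

```r
# Hypothetical shares (invented for illustration, NOT the WHO data)
shares <- c(African = 0.30, European = 0.25,
            `Southeast Asian` = 0.06, Other = 0.39)

# One square per percent: round each share to a whole number of squares
n_squares <- round(100 * shares)

# Plain rounding happens to total 100 here; other data may need adjusting
stopifnot(sum(n_squares) == 100)

# Assign a class index to each of the 100 cells, then fill a 10 x 10 grid
cells <- rep(seq_along(n_squares), times = n_squares)
grid  <- matrix(cells, nrow = 10, ncol = 10)

# Draw the grid with one colour per class and white lines between squares
cols <- c("steelblue", "orange", "forestgreen", "grey70")
image(1:10, 1:10, grid, col = cols, axes = FALSE,
      xlab = "", ylab = "", asp = 1,
      main = "Observations by region (hypothetical data)")
abline(h = 0.5 + 0:10, v = 0.5 + 0:10, col = "white", lwd = 2)
legend("bottom", legend = names(n_squares), fill = cols, horiz = TRUE,
       inset = -0.15, xpd = TRUE, bty = "n", cex = 0.8)
```

Reading a share off the chart is then just a matter of counting the squares of one colour.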

Now that we've discussed the theory of plotting proportions, let's make some actual plots with our WHO data!
