Python Tutorial: Cohort analysis visualization
---
Welcome to the final lesson on cohort analysis. In this section, we will learn how to build powerful cohort analysis visualizations.
The most effective way to visualize and analyze cohort analysis data is through a heatmap. It provides both the actual metric values and the color-coding to see the differences in the numbers visually.
One of the best things about heatmaps is that they are very easy to build with Python's seaborn package.
First, we will load the retention table in a pivot format which we built in the previous lesson.
It has the same format of data, with the lower right triangle filled with Not-A-Number (NaN) values. This is expected, as the more recent cohorts have had less time to be active. You will see on the next page that having NaN values helps to produce a clean heatmap.
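To make the shape of that table concrete, here is a minimal sketch of a retention table in pivot format. The counts, the cohort labels, and the names `cohort_counts`, `CohortMonth`, and `CohortIndex` are invented for illustration and are not taken from the lesson's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical active-customer counts. Rows are cohorts (month of first
# purchase), columns are the number of months since that first purchase.
# Recent cohorts have not lived long enough to fill the later columns,
# so those cells are NaN and form the lower right triangle.
cohort_counts = pd.DataFrame(
    [
        [100, 40, 30, 25],
        [120, 50, 35, np.nan],
        [90, 30, np.nan, np.nan],
        [110, np.nan, np.nan, np.nan],
    ],
    index=pd.Index(["2011-01", "2011-02", "2011-03", "2011-04"], name="CohortMonth"),
    columns=pd.Index([1, 2, 3, 4], name="CohortIndex"),
)

# Retention rate = active customers in a period / size of the cohort,
# where the cohort size is the first column.
cohort_sizes = cohort_counts.iloc[:, 0]
retention = cohort_counts.divide(cohort_sizes, axis=0)

print(retention.round(3))
```

Dividing row-wise by the first column leaves the NaN cells untouched, so the triangle of missing values carries over into the retention table.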
Let's get down to visualizing the retention rates as a heatmap.
The first step is to import seaborn and the pyplot module from matplotlib.
Next, we create an empty figure with pre-defined width and height in inches. We can customize this view depending on the format of the data.
Then, we add a title to the plot.
And then we call the seaborn heatmap function. We pass the retention table to the `data` parameter and make sure the numbers are also printed by passing `True` to the `annot` argument. Then we define the format as a percentage with one decimal place. The `vmin` and `vmax` parameters anchor the colormap so that outliers don't distort the visualization. Finally, we pass a green palette to the `cmap` argument. You can find many other color palettes in the seaborn documentation.
We can now run the plot show function to bring the heatmap to life.
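The steps described above can be sketched roughly as follows. The retention values, the figure size, the title text, the `vmin`/`vmax` bounds, and the `Greens` colormap name are all assumptions chosen for illustration, since the narration does not give the exact values used in the lesson.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Hypothetical retention table (fractions between 0 and 1) with the
# lower right triangle left as NaN, standing in for the lesson's table.
retention = pd.DataFrame(
    [
        [1.0, 0.40, 0.30, 0.25],
        [1.0, 0.42, 0.29, np.nan],
        [1.0, 0.33, np.nan, np.nan],
        [1.0, np.nan, np.nan, np.nan],
    ],
    index=pd.Index(["2011-01", "2011-02", "2011-03", "2011-04"], name="CohortMonth"),
    columns=pd.Index([1, 2, 3, 4], name="CohortIndex"),
)

# Empty figure with width and height given in inches (assumed size).
plt.figure(figsize=(10, 8))

# Title for the plot (assumed wording).
plt.title("Retention rates")

# annot=True prints the values, fmt=".1%" formats them as percentages
# with one decimal place, vmin/vmax anchor the color scale (assumed
# bounds), and cmap selects a green palette (assumed name).
ax = sns.heatmap(
    data=retention,
    annot=True,
    fmt=".1%",
    vmin=0.0,
    vmax=0.5,
    cmap="Greens",
)

plt.show()
```

seaborn leaves cells holding NaN blank, which is why the missing triangle produces a clean staircase shape rather than a block of zeros.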
Here we go: the retention rate heatmap that we've seen throughout this chapter is finally built with just a few lines of code. This visualization can be used as a standalone representation of the company's retention rate, or as an analytical tool to get insights.
Great job! Now it's your turn to visualize customer cohort metrics in a heatmap format!