Full Python Tutorial: Customer Lifetime Value & RFM Analysis using Machine Learning

This is a full Python tutorial where we analyze customer purchase behavior to predict purchases over the next 90 days. This allows us to target customers to prevent churn and increase profitability. We use:

- Pandas to create Recency-Frequency-Monetary (RFM) Features
- Scikit-Learn and XGBoost to create 2 predictive models (90-day spend amount and 90-day spend probability)
- Plotly Dash to productionize a solution for Marketing Teams
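As a rough illustration of the first two steps, here is a minimal sketch of a time split followed by RFM feature engineering in pandas. It is not the code from the video: the toy data, the column names (`customer_id`, `date`, `price`), the helper `make_rfm_dataset`, and the definition of recency as days between the last purchase and the cutoff are all assumptions made for this example. The two target columns correspond to the two models mentioned above: `spend_90_total` could feed a regression model (spend amount) and `spend_90_flag` a classification model (spend probability).

```python
import pandas as pd

# Toy transactions (hypothetical data; column names are assumptions).
transactions = pd.DataFrame(
    {
        "customer_id": [1, 1, 1, 2, 2, 3],
        "date": pd.to_datetime(
            ["2023-01-05", "2023-02-10", "2023-05-20",
             "2023-01-15", "2023-03-01", "2023-06-01"]
        ),
        "price": [20.0, 35.0, 50.0, 15.0, 25.0, 40.0],
    }
)


def make_rfm_dataset(df, n_days=90):
    """Split by time, then build RFM features and two 90-day targets."""
    # The cutoff sits n_days before the last observed transaction.
    cutoff = df["date"].max() - pd.Timedelta(days=n_days)

    history = df[df["date"] <= cutoff]  # used for features only
    future = df[df["date"] > cutoff]    # used for targets only

    # One row per customer, built from the history window.
    features = history.groupby("customer_id").agg(
        last_purchase=("date", "max"),
        frequency=("date", "count"),
        price_sum=("price", "sum"),
        price_mean=("price", "mean"),
    )
    # Recency: days between the last purchase and the cutoff (assumed definition).
    features["recency"] = (cutoff - features["last_purchase"]).dt.days
    features = features.drop(columns="last_purchase")

    # Target 1: total spend in the held-out window (0 if no purchase).
    targets = future.groupby("customer_id")["price"].sum().rename("spend_90_total")
    out = features.join(targets, how="left").fillna({"spend_90_total": 0.0})

    # Target 2: did the customer buy anything in the held-out window?
    out["spend_90_flag"] = (out["spend_90_total"] > 0).astype(int)
    return out


rfm = make_rfm_dataset(transactions)
print(rfm)
```

Note that a customer whose only purchases fall inside the held-out window (customer 3 in the toy data) has no history and therefore no feature row, so this sketch does not cover brand-new customers.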

Table of Contents
00:00 Customer Lifetime Value with Machine Learning
01:06 Project Workflow: Pandas, Scikit Learn, Plotly Dash
04:47 Plotly Dash Demo: Customer Spend Prediction App
08:08 Business Problem: Non-Contractual Purchase Relationship | CDNOW Customer Transactions
10:47 The 3 Questions We'll Answer Today
14:07 Customer Lifetime Value Modeling: Econometric Approach (Cashflow)
16:44 Customer Lifetime Value: Machine Learning Approach
20:47 Full Code Tutorial | Customer Lifetime Value with ML in Python
20:57 Project Setup: VSCode
25:44 Customer Lifetime Value Analysis
27:46 Data Preparation
30:08 Cohort Analysis
36:53 Machine Learning Feature Engineering
37:55 Time Splitting
40:03 Feature Engineering (RFM)
41:29 Recency (R) Features
44:29 Frequency (F) Features
45:31 Monetary (M) Features
46:21 Combine RFM Features
48:51 Machine Learning CLV Analysis: Spend Amount Model
54:09 Machine Learning CLV Analysis: Spend Probability Model
57:28 Feature Importance: Spend Amount Model
59:21 Feature Importance: Spend Probability Model
1:00:52 Save Work
1:01:42 Dash App
1:04:26 How Can We Use this Information? (3 Questions)
1:06:51 Next Steps: Learning More
1:08:50 Dash App Files
1:10:00 Python Track Roadmap
1:22:12 Q&A
Comments
Author

I love R. But at my new job I "have to" work with Python, and I'm quite happy that your courses use Python as well. Really nicely done.

Scootenfruity
Author

43:53, for the monetary value, can someone explain why we use price alone instead of revenue (price * quantity) for each customer? Am I missing something?

puppyfindchloe
Author

54:00 So what you did was predict on data we already knew. What about the true future: fitting the model on all known data and predicting the next unknown 90 days? BG/NBD and Gamma-Gamma models can do that.

RidingWithGerdas
Author

As a data scientist who's been working on customer analytics at startups, I feel like I've just discovered a gold mine.

brothermalcolm
Author

This is very insightful and rich. How can I apply this to determine CLV for retail customers of a bank?

AiykRichie
Author

Hi! I think there is an issue with the logic. Correct me if I'm wrong. You are splitting the data into the most recent and the oldest, and then you are training a model using the same old and new data. Your y in the model is what you are trying to predict later (the 90 most recent days). I don't see the point of creating the model here.

nicolascortinas
Author

Thank you for such a nice use case of CLV.

dsdjiitian
Author

53:16 Actually, a deviation of $10 is pretty bad.
For the 25th percentile it's off by 50%,
for the 50th percentile it's off by 25%,
for the 75th percentile it's off by 10%.

PratapOO
Author

Hi Matt, I'm planning to add this to my portfolio. Is it possible to follow along with Jupyter Notebook + pip, or is having the same setup as you crucially important?

travelsizearchitect
Author

Hi, I think I'm a bit confused about the data splitting and CV process.
When you build the model, you're using "unseen" data as input, and then you try to predict again on seen and unseen data.
From a business and algorithmic perspective, isn't it supposed to be tested only on the targets df?

cevikyi
Author

Hi! Thanks for these amazing videos! I just started in Marketing Analysis and these labs help me a lot to understand the calculations and possibilities!
One question: why do you sum the 'price' column at 31:23? Shouldn't it be price * quantity?

Btw, does the labs pro include all the labs?

darnelb
Author

The video is good, but it really lacks proper cross-validation and overfitting handling.

LostMakaveli
Author

Is the GitHub repo available for this project?

sohambasu
Author

Thank you very much. This is very useful. I was wondering if we can use XAI to explain the XGBoost model?

mishralucky
Author

Amazing explanation. Definitely gonna subscribe to the pro service. What is your opinion of the lifetimes library? How reliable is the CLV calculation made by the lifetimes library using the Gamma-Gamma and BG/NBD models?

Musaibaziz
Author

May I know if there are tutorial videos for CLV modeling using R?

chenjxing
Author

Thanks for the tutorial, but I think there is a problem with the logic. You trained the model using seen target values. Then you tried to predict those target values. You should have predicted the following 90-day period.

senolkurt
Author

Can you add a Shapley model for interpretability?

gauravmodi
Author

Thank you! How can I predict LTV for new customers using this technique?

mamadoucamara
Author

What is quantity? Is it the number of invoices on the same date, or is it the quantity of products in the same purchase?

desarrolloroghur