Explaining Anomalies with Isolation Forest and SHAP | Python Tutorial

In this video, we dive deep into the world of anomaly detection with a focus on the Isolation Forest algorithm. Isolation Forest is a powerful machine learning model for identifying outliers in high-dimensional data, but understanding why an anomaly is detected can be a challenge. That's where SHAP (SHapley Additive exPlanations) comes in.
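To make this concrete, here is a minimal sketch (not the exact code from the video) of fitting scikit-learn's IsolationForest on synthetic data and flagging outliers; the data, parameters, and variable names are illustrative assumptions:

```python
# Minimal Isolation Forest sketch with scikit-learn (illustrative, not the video's code).
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)
X_normal = rng.normal(loc=0, scale=1, size=(300, 4))    # tight inlier cluster
X_outliers = rng.uniform(low=-6, high=6, size=(10, 4))  # scattered anomalies
X = np.vstack([X_normal, X_outliers])

model = IsolationForest(n_estimators=100, contamination=0.03, random_state=42)
model.fit(X)

labels = model.predict(X)        # +1 = inlier, -1 = anomaly
scores = model.score_samples(X)  # lower (more negative) = more anomalous
print(f"Flagged {np.sum(labels == -1)} points as anomalies")
```

Points isolated in few random splits get short average path lengths across the trees, which is what drives the anomaly score above.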

We'll explore how to use both KernelSHAP and TreeSHAP to interpret the contributions of individual features to anomaly scores. You'll learn how to visualize and break down these contributions, making it easier to understand and explain the decisions made by Isolation Forest. This is particularly valuable in real-world applications like fraud detection, where knowing the 'why' behind an anomaly is just as important as identifying it.
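As a rough illustration of the two routes, the sketch below reuses the fitted `model` and data `X` from the previous snippet: TreeSHAP via shap's TreeExplainer, which supports scikit-learn's IsolationForest directly, and model-agnostic KernelSHAP wrapped around `score_samples`. The background sample size and placeholder feature names are assumptions, not the video's code:

```python
# Two ways to explain Isolation Forest outputs with SHAP (illustrative sketch).
import shap

# TreeSHAP: fast and exact for tree ensembles; for Isolation Forest the
# explained output is derived from the average path lengths in the trees.
tree_explainer = shap.TreeExplainer(model)
tree_shap_values = tree_explainer.shap_values(X)

# KernelSHAP: model-agnostic; here it explains the anomaly score directly.
# A small background sample keeps the computation tractable.
background = shap.sample(X, 50, random_state=0)
kernel_explainer = shap.KernelExplainer(model.score_samples, background)
kernel_shap_values = kernel_explainer.shap_values(X[:5])  # explain a few points

# Visualize one explanation as a waterfall plot (hypothetical feature names).
shap.waterfall_plot(shap.Explanation(
    values=tree_shap_values[0],
    base_values=tree_explainer.expected_value,
    data=X[0],
    feature_names=[f"feature_{i}" for i in range(X.shape[1])],
))
```

TreeSHAP scales to explaining every point, while KernelSHAP is slower but lets you attribute the anomaly score itself, which is why the video covers both.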

🚀 SHAP Course 🚀
The first 20 people to use the coupon code "IFSHAP24" will get 100% off!

🚀 Free XAI Courses 🚀

🚀 Companion article with link to code (no-paywall link): 🚀

🚀 Learn about Isolation Forests 🚀

🚀 Useful playlists 🚀

🚀 Get in touch 🚀

🚀 Sections 🚀
00:00 Introduction
01:35 What is Anomaly Detection?
02:28 What is Isolation Forest?
05:57 Interpreting SHAP Values for Isolation Forest
07:44 Model Training
15:28 KernelSHAP with Anomaly Score
21:17 TreeSHAP with Average Path Length
Comments

Great explanation, you are the best!

duongkstn

Thanks for the detailed video. It is really helpful. Can I get the codebase that was used in your demo?

damodarperumalla

Hey, great explanation! I have a question: say I have time series of how many items I sold over 3 years, for different items. The items can be sold in multiple stores across the world. My task is to detect anomalies at the item level (not at the aggregate level). Do I run this Isolation Forest on each individual time series and add the store (as a one-hot encoded variable) to the feature matrix? Running it individually for each item seems to lose potential information that could be extracted by looking at global patterns across different items. What would you advise in this case? It seems to be a hierarchical time series anomaly detection problem.

majamuster

Is it possible to get a link for this notebook?

ravitejaneravati

Why does the SHAP waterfall plot show different feature names every time we execute it for the same shap_values?

kidsslockdownhobbies