Shapley Values : Data Science Concepts

Interpret ANY machine learning model using this awesome method!

Comments

No fancy tools, yet you are so effective!!
You should know that you provide deeper insights than even the standard books do.

adityanjsg

Great explanation!! Love how you managed to explain the concept so simply! ❤️

whoopeedoopee

Thank you, it is because of nice teachers like you that I can learn pretty much everything from YouTube.

tonywang

Thank you for the drawing and the intuitive explanation, which really helped me understand Shapley values.

reginaphalange

I prefer the marker pen style. Here, my complete focus is on the paper itself and not the surrounding region.

rbpict

One of the easiest yet most thorough explanations, thank you!

niksu

Hats off to you. I've now understood most of the explainability techniques.

amrittiwary

Great video. The whiteboard is the better format because of all the non-verbal communication: facial expressions, gestures, ...

MatiasRojas-xcol

This is a very clear explanation, better than most of the articles I could find online, thanks! I have one question though: when getting the global Shapley value (the average across all instances), why do we sum up the absolute values of the Shapley values of all the instances? Is that how we keep the desirable properties of the Shapley value? Is there any meaning to summing up the plain values (where positive and negative would now cancel each other out)?

Another question: when you said the expected value of the difference, is it just an arithmetic average of all the differences across those permutations? I remember reading that the Shapley value is actually a "weighted" average of the differences, where the weights are related to the ordering of the features. Does step 1 already take this into consideration, so that we only need the arithmetic average to get the final Shapley value for that instance?
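[Editor's note] A minimal Monte Carlo sketch addressing both questions above (the data and the linear model are invented for illustration, not from the video). Sampling a uniformly random feature ordering on every iteration supplies the combinatorial "weighted average" implicitly, so a plain arithmetic mean of the differences suffices; and the global importance averages the *absolute* local values precisely so that positive and negative effects don't cancel.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative toy setup: 4 features, known linear model (made up).
X = rng.normal(size=(300, 4))
beta = np.array([3.0, 2.0, -1.0, 0.0])

def model(x):
    return float(x @ beta)

def shapley_one(x, j, n_iter=1000):
    """Monte Carlo Shapley value of feature j for instance x. A plain
    arithmetic mean of the differences is enough, because drawing a
    uniformly random feature ordering each iteration reproduces the
    Shapley weights implicitly."""
    d = X.shape[1]
    total = 0.0
    for _ in range(n_iter):
        z = X[rng.integers(len(X))]        # random background instance
        order = rng.permutation(d)         # random feature ordering
        pos = int(np.argmax(order == j))   # where j falls in the ordering
        x_with = z.copy()
        x_with[order[:pos + 1]] = x[order[:pos + 1]]  # x's values up to and incl. j
        x_without = x_with.copy()
        x_without[j] = z[j]                # revert only j to the background value
        total += model(x_with) - model(x_without)
    return total / n_iter

# Local explanation for one instance (signed; the values approximately
# sum to model(x) minus the average prediction):
phi = np.array([shapley_one(X[0], j) for j in range(4)])

# Global importance: mean of the ABSOLUTE local values across instances,
# so positive and negative contributions don't cancel each other out.
local = np.array([[shapley_one(x, j, n_iter=200) for j in range(4)] for x in X[:20]])
global_importance = np.abs(local).mean(axis=0)
```

Averaging the signed (plain) values instead would measure the average direction of a feature's effect, which can be near zero even for a highly influential feature.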

oliverlee

How well do Shapley values align with the composition of various Principal Components? Is there a mathematical relationship between the two, or is it just wholly dependent on the features of the dataset?

koftu

Wonderful explanation! You explained a very difficult concept simply and concisely! Thanks

kokkoplamo

Great video, simple and easily comprehensible

Aditya_Pareek

Thank you for the video, really appreciate it!
I have a question about step 3:
Is it necessary to 'undo' the permutation after creating the Frankenstein samples and before feeding them into the model, since the model expects Temp to be in the first position, as in training?
Thank you very much for the clarification.
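[Editor's note] A hedged sketch of the column-order point (the feature values below are invented). If the Frankenstein sample is assembled by assigning values at their original column indices, the permutation only decides *which* features come from the instance of interest versus the background instance; the assembled sample is already in the order the model was trained on, so no explicit 'undo' step is needed:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented 4-feature instances; suppose Temp is column 0, as in training.
x = np.array([20.0, 5.0, 1.0, 0.3])   # instance being explained
z = np.array([10.0, 8.0, 0.0, 0.9])   # random background instance

order = rng.permutation(4)            # random feature ordering
cut = rng.integers(1, 4)
from_x = order[:cut]                  # features taken from x ("before the cut")

# Assign by ORIGINAL column index: the result keeps the training column
# order (Temp still in position 0), so no un-permuting is required.
frankenstein = z.copy()
frankenstein[from_x] = x[from_x]
```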

florianhetzel

Awesome video! I don't have a preference between paper and whiteboard, just keep the vids coming! This is the first time I've learned about Shapley values, thank you for that.

xxshogunflames

What a great video! Such a simple and effective explanation. Thank you very much for that.

djonatandranka

Thank you for the great explanation, but I have one doubt: how do we get 200 there for temperature? You said it is the expected difference, so say we run the sample 100 times and each time we get some difference; how did that 200 come out of those 100 differences? Did we take the average, or what math was applied there?
Any response would be highly appreciated.
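[Editor's note] Under the sampling approach, the single number reported per feature is just the arithmetic mean of the per-run differences; a tiny sketch with made-up numbers (neither the differences nor the count below are from the video):

```python
import numpy as np

# Made-up per-run differences model(with Temp) - model(without Temp);
# the Shapley estimate is their plain arithmetic mean.
diffs = np.array([180.0, 230.0, 195.0, 215.0])
shapley_estimate = diffs.mean()   # (180 + 230 + 195 + 215) / 4 = 205.0
```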

sachinrathi

Thank you for a very well-explained video on Shapley values :D. It helped me.

lythien

Hello.
In a linear regression model, are SHAP values equivalent to the partial R^2 for a given variable?
Don't they take the variance into account, as p-values do?

juanete

Thank you for your explanation, but with the SHAP library one only provides the trained model, not the training set. How can the sampling from the original dataset be done with only the trained model?

chakib

So, if you had x features, say 50 instead of 4, would you randomly subset 25 (half) of them and create x1...x25? And in each of these x1...x25, the difference would be that features 1:i are conditioned on the random vector whereas features [i+n] are not? I'm trying to visualize what happens when more than 4 features are available.

jacobmoore