Causal Effects via Regression w/ Python Code

This is the 5th video in a series on causal effects. In the previous videos, we discussed different ways to compute treatment effects from data.

More resources:

--

Intro - 0:00
What is regression? - 0:25
3 Regression-based Techniques - 2:26
1) Linear Regression - 2:47
2) Double Machine Learning - 5:26
3) Metalearners - 9:02
3.1) T-learner - 9:29
3.2) S-learner - 11:24
3.3) X-learner - 12:56
Example Code - 15:12
Comments
Author

More on Causal Effects 👇

Some resources I found helpful:

ShawhinTalebi
Author

Love all this info. Great resources and superb explanations. :)

ketalesto
Author

Thanks for the superb explanations. I have two questions about my understanding, assuming a different DAG. Could you comment?
1: In your notebook, linear regression, double machine learning, and the X-learner all use method_name="backdoor.****". Can I assume that the analysis won't capture any other variable in the DAG, such as W, whose interaction with X affects the outcome? That is, the outcome Y is a collider of the treatment X and this independent variable W, and there is no arrow between X and W in the DAG because there is no causal relation between the two. In other words, this causal analysis will never capture the impact on Y of interactions with X. If that is the case, how can we trust or interpret the resulting ACT, given that it reflects only part of the picture?

2: For linear regression and double machine learning, as both are specified either with or 'model_y':LinearRegression(), can I assume that they won't capture a quadratic term for X, so the underlying models are not the best fit for X and Y? If that is the case, how can we trust or interpret the resulting ACT, given that the backend models are not the best fit?

donlee
Author

Hi Shawhin, thank you for this amazing playlist! I have a question:

1. Before we get to estimation, we evaluate whether the causal effect is identifiable (through identify_effect()). This step gives us a sufficient set of variables we need to observe in order to compute our causal effect. I believe this is captured in the estimand.

2. So, are these regression models built on only the variables from the sufficient set? Or do they use all the variables provided? We pass the estimand to model.estimate_effect(), so I'm wondering how the estimand connects to the estimation step.

It would be super helpful if you could shed some light on what happens behind the scenes here and point out any mistakes in my understanding above. Thanks!

sateeshsivakoti
Author

Hi, many thanks for this video. I hope you can do a video about 2ML.

tariqahassan
Author

Another question:
In the S-learner you comment that we can use a multilevel variable.
But how would the ITE and ATE be calculated in that case?
Do we need to calculate ITE and ATE for each pair of values in the treatment variable?

ketalesto
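
The multilevel S-learner question above can be made concrete with a small sketch. One common convention, assumed here and not taken from the video, is to define one effect per pair of treatment levels: fit a single model with the treatment as an input, predict each row's outcome under every level, and take differences. The data, the three levels, and every helper name below are hypothetical.

```python
from itertools import combinations

LEVELS = [0, 1, 2]  # hypothetical treatment levels; 0 acts as the baseline


def featurize(z, t):
    """Intercept, covariate, and one indicator per non-baseline level."""
    return [1.0, z] + [1.0 if t == level else 0.0 for level in LEVELS[1:]]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(features, ys):
    """Least squares via the normal equations (X'X) beta = X'y."""
    k = len(features[0])
    xtx = [[sum(f[i] * f[j] for f in features) for j in range(k)] for i in range(k)]
    xty = [sum(f[i] * y for f, y in zip(features, ys)) for i in range(k)]
    return solve(xtx, xty)


def predict(beta, z, t):
    return sum(w * f for w, f in zip(beta, featurize(z, t)))


# Assumed data-generating process: y = 1 + 2*z + shift[t], no noise.
shift = {0: 0.0, 1: 1.5, 2: 4.0}
rows = [(i * 0.5, i % 3) for i in range(30)]
ys = [1 + 2 * z + shift[t] for z, t in rows]

# S-learner: a single model that takes the treatment as a feature.
beta = fit_ols([featurize(z, t) for z, t in rows], ys)

# One ITE list and one ATE per pair of treatment levels (a -> b).
ate = {}
for a, b in combinations(LEVELS, 2):
    ite = [predict(beta, z, b) - predict(beta, z, a) for z, _ in rows]
    ate[(a, b)] = sum(ite) / len(ite)
```

Under this convention there is no single ITE or ATE for a multilevel treatment. Each contrast, such as level 0 versus level 2, gets its own, and comparing every level against one baseline is often enough in practice.
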
Author

When we use regression for inference, how can we evaluate the performance? Can even a model with a poor R² provide valid causal conclusions?

roopalilalwani
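
As a rough illustration of the R² question above, the simulation below assumes a randomized binary treatment and a very noisy outcome. It is only a sketch under those assumptions: the true effect, the noise level, and all names are made up for the example.

```python
import random

random.seed(0)

TRUE_EFFECT = 2.0   # assumed effect of the treatment on the outcome
NOISE_SD = 10.0     # large outcome noise unrelated to the treatment
N = 20000

# Randomized treatment, so there is no confounding by construction.
ts = [random.randint(0, 1) for _ in range(N)]
ys = [TRUE_EFFECT * t + random.gauss(0, NOISE_SD) for t in ts]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(ts, ys)

# R^2 = 1 - residual sum of squares / total sum of squares.
mean_y = sum(ys) / N
residual_ss = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(ts, ys))
total_ss = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - residual_ss / total_ss

print(f"estimated effect: {slope:.2f}, R^2: {r2:.3f}")
```

In this setup R² is low because most of the outcome variance is noise, yet the treatment coefficient can still sit near the assumed effect. The reverse also holds: a high R² does not protect against bias from an omitted confounder, so R² by itself says little about causal validity.
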
Author

So basically, for the T-learner we fit one model to the untreated data and another to the treated data, each against the target variable and each excluding the treatment variable itself. We then use the treated model to predict the treated outcome and the untreated model to predict the control outcome for every row in the dataset? But that means each model predicts on (at least some of) the data it was trained on, right? Is this really good practice?

-o-
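
The T-learner procedure described in the comment above can be sketched as follows. This is a minimal illustration on noise-free synthetic data with a single covariate; the data-generating process and all names are assumptions made for the example, not code from the video's notebook.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    intercept, slope = model
    return intercept + slope * x


# Synthetic rows: (covariate z, treatment t, outcome y).
# Assumed data-generating process: y = 1 + 2*z + t*(3 + z), no noise.
rows = []
for i in range(20):
    z = i / 2
    t = i % 2
    y = 1 + 2 * z + t * (3 + z)
    rows.append((z, t, y))

# Split by treatment status; the treatment itself is not a model input.
control = [(z, y) for z, t, y in rows if t == 0]
treated = [(z, y) for z, t, y in rows if t == 1]

model_0 = fit_line(*zip(*control))  # outcome model for untreated units
model_1 = fit_line(*zip(*treated))  # outcome model for treated units

# Predict both potential outcomes for every row, then take the difference.
ite = [predict(model_1, z) - predict(model_0, z) for z, _, _ in rows]
ate = sum(ite) / len(ite)
```

As the comment notes, each model here predicts on some rows it was trained on. One way to avoid that, not shown in this sketch, is to fit on one fold of the data and predict on another.
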
Author

One question:
Before estimating an effect, you have to construct the DAG, right?
Which variables/nodes from the DAG are the covariates (Z)? Or do you just include every other variable available?

ketalesto
Author

Double Machine Learning sounds like something you say as a joke to someone when they're using machine learning

King_Konglish