14 Overview of doubly robust estimators: Example with the ATE
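Recall our motivation for doing mediation analysis: we would like to decompose the total effect of a treatment $A$ on an outcome $Y$ into an effect that operates through a mediator $M$ and an effect that operates independently of $M$.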
Recall that we define the average treatment effect (ATE) as $\mathrm{E}(Y_1 - Y_0)$, where $Y_a$ denotes the potential outcome under treatment level $a$.
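To introduce some of the ideas that we will use for estimation of the NDE, let us first briefly discuss estimation of $\mathrm{E}(Y_1)$, the mean outcome had everyone been treated. Estimation of $\mathrm{E}(Y_0)$ is analogous.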
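First, notice that under the assumption of no unmeasured confounders ($Y_1 \perp A \mid W$, with $W$ the measured covariates), we have

$$\mathrm{E}(Y_1) = \mathrm{E}[\mathrm{E}(Y_1 \mid W)] = \mathrm{E}[\mathrm{E}(Y_1 \mid A = 1, W)] = \mathrm{E}[\mathrm{E}(Y \mid A = 1, W)],$$

where the first step adds an expectation over $W$ (the law of iterated expectations), the second step uses the assumption of no unmeasured confounders, and the third step uses consistency ($Y = Y_1$ whenever $A = 1$).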
14.1 Plug-in (G-computation) estimator
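The first estimator of $\mathrm{E}[\mathrm{E}(Y \mid A = 1, W)]$ can be obtained in three steps:
- Fit a regression of the outcome $Y$ on the treatment $A$ and the covariates $W$, then
- use the above regression to predict the outcome mean if everyone's $A$ is set to $A = 1$, and then
- average these predictions.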
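The resultant estimator can be expressed as

$$\frac{1}{n} \sum_{i=1}^{n} \hat{\mathrm{E}}(Y \mid A = 1, W_i),$$

where $\hat{\mathrm{E}}(Y \mid A = 1, W_i)$ is the fitted outcome regression evaluated at $A = 1$ and the covariates $W_i$ of unit $i$.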
- Note that this plug-in estimator directly uses the identification formula (called a g-formula, arrived at via g-computation): $\mathrm{E}(Y_1) = \mathrm{E}[\mathrm{E}(Y \mid A = 1, W)]$.
- This estimator requires that the (outcome) regression model for $\mathrm{E}(Y \mid A, W)$ is correctly specified.
- Downside: If we use arbitrary machine learning for this model, general theory for computing standard errors and confidence intervals (i.e., statistical inference) is not available.
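The three plug-in steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are simulated from a hypothetical data-generating process (the `simulate` helper and all variable names are invented here, not taken from the text), and the outcome regression is a simple linear model fit by least squares, which happens to be correctly specified for these simulated data.

```python
import numpy as np


def simulate(n, seed=0):
    """Hypothetical data-generating process (an assumption); true E(Y_1) = 3."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                       # measured confounder W
    p = 1.0 / (1.0 + np.exp(-0.5 * w))           # true P(A = 1 | W)
    a = rng.binomial(1, p)                       # treatment A
    y = 1.0 + 2.0 * a + w + rng.normal(size=n)   # outcome Y
    return w, a, y


w, a, y = simulate(n=10_000)
n = len(y)

# Step 1: regress Y on (A, W) by ordinary least squares.
X = np.column_stack([np.ones(n), a, w])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Step 2: predict the outcome mean with everyone's A set to 1.
X1 = np.column_stack([np.ones(n), np.ones(n), w])
pred1 = X1 @ beta

# Step 3: average the predictions.
gcomp_estimate = pred1.mean()
print(f"G-computation estimate of E(Y_1): {gcomp_estimate:.3f}")
```

In this simulation the estimate should land close to the true value of 3. If the outcome model were misspecified (for example, if `w` entered the true outcome nonlinearly), the same code would generally be biased.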
14.2 Inverse probability weighted (IPW) estimator
An alternative method of estimation can be constructed after noticing the following equivalence:

$$\mathrm{E}[\mathrm{E}(Y \mid A = 1, W)] = \mathrm{E}\left[\frac{A}{\mathrm{P}(A = 1 \mid W)}\, Y\right],$$

which suggests the following procedure:
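- Fit a regression of the treatment $A$ on the covariates $W$, then
- use the above regression to predict the probability of treatment $\hat{\mathrm{P}}(A = 1 \mid W_i)$ for each unit $i$, then
- compute the inverse probability weights $A_i / \hat{\mathrm{P}}(A = 1 \mid W_i)$. This weight will be zero for untreated units, and the inverse of the probability of treatment for treated units.
- Finally, compute the weighted average of the outcome:

$$\frac{1}{n} \sum_{i=1}^{n} \frac{A_i}{\hat{\mathrm{P}}(A = 1 \mid W_i)}\, Y_i.$$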
- This estimator requires that the (treatment) regression model for $\mathrm{P}(A = 1 \mid W)$ is correctly specified.
- Downside: If we use arbitrary machine learning for this model, general theory for computing standard errors and confidence intervals (i.e., statistical inference) is not available.
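The IPW procedure can be sketched as follows. As before, this is an illustration under assumptions rather than a reference implementation: the data come from a hypothetical simulation, and the treatment regression is a logistic model fit with a small hand-written Newton-Raphson routine (`fit_logistic` is a helper defined here for self-containment, not a library function).

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, seed=0):
    """Hypothetical data-generating process (an assumption); true E(Y_1) = 3."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                       # measured confounder W
    a = rng.binomial(1, expit(0.5 * w))          # treatment A
    y = 1.0 + 2.0 * a + w + rng.normal(size=n)   # outcome Y
    return w, a, y


def fit_logistic(X, a, n_iter=25):
    """Logistic regression coefficients via Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (a - p)                         # score vector
        hess = (X * (p * (1.0 - p))[:, None]).T @ X  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


w, a, y = simulate(n=10_000)
n = len(y)

# Steps 1 and 2: regress A on W and predict P(A = 1 | W) for every unit.
X = np.column_stack([np.ones(n), w])
p_hat = expit(X @ fit_logistic(X, a))

# Step 3: inverse probability weights (zero for untreated units).
weights = a / p_hat

# Step 4: weighted average of the outcome.
ipw_estimate = np.mean(weights * y)
print(f"IPW estimate of E(Y_1): {ipw_estimate:.3f}")
```

In this simulation the propensity model is correctly specified, so the estimate should be close to 3. Estimated probabilities near zero would produce very large weights and an unstable estimate, which is one practical reason to inspect `p_hat` before averaging.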
14.3 Augmented inverse probability weighted (AIPW) estimator
Fortunately, we can combine these two estimators to get an estimator with enhanced properties.
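The improved estimator can be seen either as a corrected (or augmented) IPW estimator:

$$\underbrace{\frac{1}{n} \sum_{i=1}^{n} \frac{A_i}{\hat{\mathrm{P}}(A = 1 \mid W_i)}\, Y_i}_{\text{IPW estimator}} - \frac{1}{n} \sum_{i=1}^{n} \frac{A_i - \hat{\mathrm{P}}(A = 1 \mid W_i)}{\hat{\mathrm{P}}(A = 1 \mid W_i)}\, \hat{\mathrm{E}}(Y \mid A = 1, W_i),$$

or as a corrected plug-in (G-computation) estimator:

$$\underbrace{\frac{1}{n} \sum_{i=1}^{n} \hat{\mathrm{E}}(Y \mid A = 1, W_i)}_{\text{plug-in estimator}} + \frac{1}{n} \sum_{i=1}^{n} \frac{A_i}{\hat{\mathrm{P}}(A = 1 \mid W_i)} \left[ Y_i - \hat{\mathrm{E}}(Y \mid A = 1, W_i) \right].$$

The two expressions are algebraically identical.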
This estimator has some desirable properties:
- It is robust to misspecification of at most one of the two models (outcome or treatment): it remains consistent as long as either model is correctly specified. (Can you see why?)
- It is approximately normally distributed as the sample size grows. This allows us to easily compute confidence intervals and perform hypothesis tests.
- It allows us to use machine learning to estimate the treatment and outcome regressions to alleviate model misspecification bias.
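A minimal sketch of the AIPW estimator, written as the plug-in prediction plus a weighted residual correction for each unit. The simulated data, the `simulate` and `fit_logistic` helpers, and the simple parametric models are all assumptions made for illustration. The standard error shown is the sample standard deviation of the per-unit terms divided by $\sqrt{n}$, one common way to form a Wald-type interval that relies on the asymptotic normality noted above.

```python
import numpy as np


def expit(x):
    return 1.0 / (1.0 + np.exp(-x))


def simulate(n, seed=0):
    """Hypothetical data-generating process (an assumption); true E(Y_1) = 3."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=n)                       # measured confounder W
    a = rng.binomial(1, expit(0.5 * w))          # treatment A
    y = 1.0 + 2.0 * a + w + rng.normal(size=n)   # outcome Y
    return w, a, y


def fit_logistic(X, a, n_iter=25):
    """Logistic regression coefficients via Newton-Raphson (illustrative helper)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (a - p)
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        beta = beta + np.linalg.solve(hess, grad)
    return beta


w, a, y = simulate(n=10_000)
n = len(y)

# Outcome regression: least squares of Y on (1, A, W), predicted at A = 1.
X_out = np.column_stack([np.ones(n), a, w])
beta_out, *_ = np.linalg.lstsq(X_out, y, rcond=None)
m1_hat = np.column_stack([np.ones(n), np.ones(n), w]) @ beta_out

# Treatment regression: logistic model for P(A = 1 | W).
X_trt = np.column_stack([np.ones(n), w])
p_hat = expit(X_trt @ fit_logistic(X_trt, a))

# AIPW: plug-in prediction plus inverse-probability-weighted residual.
phi = m1_hat + a / p_hat * (y - m1_hat)
aipw_estimate = phi.mean()

# Wald-type 95% interval from the spread of the per-unit terms.
se = phi.std(ddof=1) / np.sqrt(n)
ci = (aipw_estimate - 1.96 * se, aipw_estimate + 1.96 * se)
print(f"AIPW estimate of E(Y_1): {aipw_estimate:.3f}, SE: {se:.3f}")
```

Replacing either `m1_hat` or `p_hat` with a deliberately wrong model (but not both) and rerunning is a quick way to see the robustness property in action.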
Next, we will work towards constructing estimators with these same properties for the mediation parameters that we have introduced.