Effect Generalization

\(\DeclareMathOperator*{\argmin}{argmin}\) \(\newcommand{\abs}[1]{\left\vert {#1} \right\vert}\) \(\newcommand{\indep}{\perp\!\!\!\perp}\) \(\newcommand{\var}{\mathrm{Var}}\) \(\newcommand{\ephi}{\varephilon}\) \(\newcommand{\phii}{\varphi}\) \(\newcommand{\tra}{^{\top}}\) \(\newcommand{\sumin}{\sum_{i=1}^n}\) \(\newcommand{\sumiN}{\sum_{i=1}^n}\) \(\newcommand{\norm}[1]{\left\Vert{#1} \right\Vert}\) \(\newcommand\Bigpar[1]{\left( #1 \right )}\) \(\newcommand\Bigbr[1]{\left[ #1 \right ]}\) \(\newcommand\Bigcr[1]{\left\{ #1 \right \}}\) \(\newcommand\SetB[1]{\left\{ #1 \right\}}\) \(\newcommand\Sett[1]{\mathcal{#1}}\) \(\newcommand{\Data}{\mathcal{D}}\) \(\newcommand{\Ubr}[2]{\underbrace{#1}_{\text{#2}}}\) \(\newcommand{\Obr}[2]{ \overbrace{#1}^{\text{#2}}}\) \(\newcommand{\sumiN}{\sum_{i=1}^N}\) \(\newcommand{\dydx}[2]{\frac{\partial #1}{\partial #2}}\) \(\newcommand\Indic[1]{\mathbb{1}_{#1}}\) \(\newcommand{\Realm}[1]{\mathbb{R}^{#1}}\) \(\newcommand{\Exp}[1]{\mathbb{E}\left[#1\right]}\) \(\newcommand{\Expt}[2]{\mathbb{E}_{#1}\left[#2\right]}\) \(\newcommand{\Var}[1]{\mathbb{V}\left[#1\right]}\) \(\newcommand{\Covar}[1]{\text{Cov}\left[#1\right]}\) \(\newcommand{\Prob}[1]{\mathbf{Pr}\left(#1\right)}\) \(\newcommand{\Supp}[1]{\text{Supp}\left[#1\right]}\) \(\newcommand{\doyx}{\Prob{Y \, |\,\mathsf{do} (X = x)}}\) \(\newcommand{\doo}[1]{\Prob{Y |\,\mathsf{do} (#1) }}\) \(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\Z}{\mathbb{Z}}\) \(\newcommand{\wh}[1]{\widehat{#1}}\) \(\newcommand{\wt}[1]{\widetilde{#1}}\) \(\newcommand{\wb}[1]{\overline{#1}}\) \(\newcommand\Ol[1]{\overline{#1}}\) \(\newcommand\Ul[1]{\underline{#1}}\) \(\newcommand\Str[1]{#1^{*}}\) \(\newcommand{\F}{\mathsf{F}}\) \(\newcommand{\ff}{\mathsf{f}}\) \(\newcommand{\Cdf}[1]{\mathbb{F}\left(#1\right)}\) \(\newcommand{\Cdff}[2]{\mathbb{F}_{#1}\left(#2\right)}\) \(\newcommand{\Pdf}[1]{\mathsf{f}\left(#1\right)}\) \(\newcommand{\Pdff}[2]{\mathsf{f}_{#1}\left(#2\right)}\) \(\newcommand{\dd}{\mathsf{d}}\) \(\newcommand\Normal[1]{\mathcal{N} \left( #1 \right )}\) \(\newcommand\Unif[1]{\mathsf{U} \left[ #1 \right ]}\) \(\newcommand\Bern[1]{\mathsf{Bernoulli} \left( #1 \right )}\) \(\newcommand\Binom[1]{\mathsf{Bin} \left( #1 \right )}\) \(\newcommand\Pois[1]{\mathsf{Poi} \left( #1 \right )}\) \(\newcommand\BetaD[1]{\mathsf{Beta} \left( #1 \right )}\) \(\newcommand\Diri[1]{\mathsf{Dir} \left( #1 \right )}\) \(\newcommand\Gdist[1]{\mathsf{Gamma} \left( #1 \right )}\) \(\def\mbf#1{\mathbf{#1}}\) \(\def\mrm#1{\mathrm{#1}}\) \(\def\mbi#1{\boldsymbol{#1}}\) \(\def\ve#1{\mbi{#1}}\) \(\def\vee#1{\mathbf{#1}}\) \(\newcommand{\Mat}[1]{\mathbf{#1}}\) \(\newcommand{\eucN}[1]{\norm{#1}}\) \(\newcommand{\lzero}[1]{\norm{#1}_0}\) \(\newcommand{\lone}[1]{\norm{#1}_1}\) \(\newcommand{\ltwo}[1]{\norm{#1}_2}\) \(\newcommand{\pnorm}[1]{\norm{#1}_p}\)

We simulate some experimental data with a true treatment effect, introduce selection bias using the selF selection function, and attempt to recover the true treatment effect using the observed data.

suppressPackageStartupMessages(library(causalTransportR))
set.seed(42)

# simulate RCT
treatprob <- 0.5
# workhorse
dgp <- function(n = 10000, p = 10, treat.prob = treatprob,
                # bounds of X
                Xbounds = c(-1, 1),
                # nonlinear heterogeneity
                tauF = function(x) 1 / exp(-x[3]),
                # nonlinear y0
                y0F = function(x) pmax(x[1] + x[2], 0) + sin(x[5]) * pmax(x[7], 0.5),
                # nonlinear selection
                selF = function(x) x[1] - 3 * x[3] + pmax(x[4], 0)) {
        X <<- matrix(runif(n * p, Xbounds[1], Xbounds[2]), n, p)
        a <<- rbinom(n, 1, treat.prob)
        # generate outcomes using supplied functions
        TAU <<- apply(X, 1, tauF)
        Y0 <- apply(X, 1, y0F)
        selscore <- apply(X, 1, selF)
        # outcome
        y <<- (a * TAU + Y0 + rnorm(n))
        # selection
        s <<- rbinom(n, 1, plogis(selscore)) |> as.logical()
        # set outcomes for s = 0 as missing
        y[s == 0] <<- NA
        cat("True effect\n")
        cat(mean(TAU))
}
# this function populates the global namespace using <<- ; not recommended for real use
dgp()

## True effect
## 1.173934

We’ve introduced systematic selection bias in the missigness of y. This is plausible in settings where the experimental sample is systematically different from the overall population, or if the missing ‘cohort’ was assigned later and we want to call the experiment earlier based on complete data.

naive

Using the AIPW estimator naively on complete data is ignoring the missingness problem, and gives us the wrong answer (we know the true effect above is 1.1739339).

cat("\n Naive \n")

## 
##  Naive

naive_fit <- ateGT(y[s == 1], a[s == 1], X[s == 1, ],
        nuisMod = "rlm", target = "insample", noi = F
)
naive_fit %>% summary()

##      parameter       est         se     ci.ll     ci.ul pval
## 1      E{Y(0)} 0.3518456 0.02111955 0.3104513 0.3932399    0
## 2      E{Y(1)} 1.2192154 0.02207557 1.1759473 1.2624836    0
## 3 E{Y(1)-Y(0)} 0.8673699 0.02864512 0.8112254 0.9235143    0

corrected estimators

Next we demonstrate the use of each of the estimators implemented in ateGT on the simulated data above.

AISW

cat("\nAISW \n")

## 
## AISW

aipwfit <- ateGT(y, a, X, s,
        treatProb = treatprob, nuisMod = "rlm",
        estimator = "AISW", target = "generalize", noi = F
)
aipwfit %>% summary

##      parameter       est         se     ci.ll     ci.ul pval
## 1      E{Y(0)} 0.2790733 0.03104175 0.2182314 0.3399151    0
## 2      E{Y(1)} 1.5089650 0.02994849 1.4502659 1.5676640    0
## 3 E{Y(1)-Y(0)} 1.2298917 0.04250149 1.1465888 1.3131946    0

As with any reweighting estimator, it is a good idea to study balance. The plot method for the ateGT function produces the following plot, which suggests substantial improvements in balance thanks to selection weights.

aipwfit %>% plot(X)

cat("\nISW \n")

## 
## ISW

ipwfit <- ateGT(y, a, X, s,
        treatProb = treatprob, nuisMod = "rlm",
        estimator = "ISW", target = "generalize", noi = F
)

## Warning in ateGT(y, a, X, s, treatProb = treatprob, nuisMod = "rlm", estimator = "ISW", : Analytic SEs may be inacurrate; use the bootstrap

ipwfit %>% summary

##      parameter       est         se     ci.ll     ci.ul pval
## 1      E{Y(0)} 0.2816738 0.03347853 0.2160558 0.3472917    0
## 2      E{Y(1)} 1.4553181 0.06064194 1.3364599 1.5741763    0
## 3 E{Y(1)-Y(0)} 1.1736443 0.06985879 1.0367211 1.3105675    0

cat("\nOM \n")

## 
## OM

omfit <- ateGT(y, a, X, s,
        treatProb = treatprob, nuisMod = "rlm",
        estimator = "OM", target = "generalize", noi = F
)

## Warning in ateGT(y, a, X, s, treatProb = treatprob, nuisMod = "rlm", estimator = "OM", : Analytic SEs may be inacurrate; use the bootstrap

omfit %>% summary

##      parameter       est          se     ci.ll     ci.ul pval
## 1      E{Y(0)} 0.3034576 0.005016487 0.2936253 0.3132899    0
## 2      E{Y(1)} 1.4439125 0.007368723 1.4294698 1.4583552    0
## 3 E{Y(1)-Y(0)} 1.1404549 0.005082201 1.1304938 1.1504160    0

cat("\nACW \n")

## 
## ACW

acwfit <- ateGT(y, a, polySieveM(X), s,
        treatProb = treatprob, nuisMod = "rlm",
        estimator = "ACW", target = "generalize", noi = F
)

## Warning in ateGT(y, a, polySieveM(X), s, treatProb = treatprob, nuisMod = "rlm", : Analytic SEs may be inacurrate; use the bootstrap

acwfit %>% summary

##      parameter      est         se     ci.ll     ci.ul pval
## 1      E{Y(0)} 0.293218 0.02009422 0.2538333 0.3326026    0
## 2      E{Y(1)} 1.490452 0.01966189 1.4519143 1.5289889    0
## 3 E{Y(1)-Y(0)} 1.197234 0.02699272 1.1443279 1.2501394    0

We do better than the naive approach in all these cases.

Apoorva Lal

2022-10-23

naive

corrected estimators

AISW