
Augmented IPW, generalization, or transportation estimator with ML and cross-fitting for nuisance functions. Imputes the counterfactual outcome for each observation i under each treatment a as $$Y^a = \omega \frac{\mathbf{1}\{A = a\}}{\pi^a(X)} (Y - \mu^a(X)) + \mu^a(X)$$ where \(\pi^a(X)\) is the propensity score for treatment a and \(\mu^a(X)\) is the outcome model. Under no sample selection, \(\omega\) is 1 for all observations, and this is the doubly robust Augmented Inverse Propensity Weighting (AIPW) estimator. When the selection indicator S (argument s) is supplied, the argument in 'target' is used to fit either the generalization or the transportation estimator, corresponding to \(\omega = S/\rho(X)\) and \(\omega = S(1-\rho(X))/\rho(X)\) respectively, where \(\rho(X)\) is the selection score. When surrogates \(Z\) are supplied, an additional residual piece \(a /\pi(X)(\hat{\nu}(X, Z) - \hat{\mu}(X))\) is added to the influence function. Average treatment effects are defined as averages of differences between counterfactual outcomes \(Y^a - Y^{a'}\).
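The imputation formula for the no-selection case (\(\omega = 1\)) can be illustrated with a minimal base-R sketch. It does not call ateGT: the data are simulated, the true nuisance functions are plugged in instead of being cross-fit, and the helper names (`mu`, `pi_a`, `pseudo_outcome`) are hypothetical.

```r
set.seed(1)
n   <- 5000
X   <- rnorm(n)
pi1 <- plogis(0.5 * X)              # true propensity P(A = 1 | X)
A   <- rbinom(n, 1, pi1)
Y   <- 1 + 2 * A + X + rnorm(n)     # simulated outcome; true ATE is 2

# Assumption: the true nuisance functions are used here for clarity.
# In practice they would be estimated and cross-fit.
mu   <- function(a) 1 + 2 * a + X                  # outcome model mu^a(X)
pi_a <- function(a) if (a == 1) pi1 else 1 - pi1   # propensity pi^a(X)

# Imputed counterfactual outcome under treatment level a
pseudo_outcome <- function(a, omega = 1) {
  omega * (A == a) / pi_a(a) * (Y - mu(a)) + mu(a)
}

# Average treatment effect: mean difference of imputed outcomes
psi     <- pseudo_outcome(1) - pseudo_outcome(0)
ate_hat <- mean(psi)
se_hat  <- sd(psi) / sqrt(n)        # standard error from the per-unit values
```

Averaging the per-observation differences gives the point estimate, and their standard deviation divided by the square root of n gives a plug-in standard error.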

Usage

ateGT(
  y,
  a,
  X,
  s = NULL,
  treatProb = NULL,
  Z = NULL,
  nuisMod = c("rlm", "rf"),
  target = c("generalize", "transport", "insample"),
  estimator = c("AISW", "ISW", "OM", "CW", "ACW"),
  hajekize = FALSE,
  separateMus = TRUE,
  glmnet_lamchoice = "lambda.min",
  glmnet_alpha = 1,
  glmnet_rho_family = "binomial",
  glmnet_pi_family = "binomial",
  glmnet_mu_family = "gaussian",
  glmnet_parl = FALSE,
  grf_tuneRf = "none",
  noi = FALSE
)

Arguments

y

outcome vector (may contain missing values; missing values must correspond to s = 0)

a

treatment vector (no missings; can be relaxed with some tinkering)

X

covariate matrix (no missings)

s

selection vector, NULL by default (no missings; 1 corresponds to nonmissing y, 0 corresponds to missing y). May be omitted when the target is "insample".

treatProb

propensity score vector (of length n_treatment) or matrix (n_treatment x n_obs); the latter is for covariate-adaptive designs. Probabilities across treatment levels must sum to 1. NULL by default, in which case the propensity score is fitted; when provided, no propensity score is fit. With discrete covariates, an estimated propensity score is advisable even if treatment was randomized.

Z

surrogate matrix (no missings), NULL by default. When supplied, the surrogate influence function (Kallus and Mao 2020) is used to compute treatment effects.

nuisMod

one of c("rlm", "rf"): how to fit the nuisance functions, which are cross-fit. "rlm" fits regularized linear models with glmnet; "rf" fits random forests with grf.
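Cross-fitting means each observation's nuisance prediction comes from a model trained on the other folds. The sketch below illustrates the idea for a propensity score in base R. It is not ateGT's internal code: the five folds, the plain logistic regression (standing in for "rlm" or "rf"), and all variable names are assumptions.

```r
set.seed(2)
n   <- 1000
X   <- matrix(rnorm(n * 2), n, 2)
A   <- rbinom(n, 1, plogis(X[, 1]))
dat <- data.frame(A = A, x1 = X[, 1], x2 = X[, 2])

K      <- 5                                        # assumed number of folds
fold   <- sample(rep(seq_len(K), length.out = n))  # random fold labels
pi_hat <- numeric(n)

for (k in seq_len(K)) {
  train <- fold != k
  # Fit the propensity model on all folds except k
  fit <- glm(A ~ x1 + x2, family = binomial(), data = dat[train, ])
  # Predict only for the held-out fold k
  pi_hat[!train] <- predict(fit, newdata = dat[!train, ], type = "response")
}
```

Every entry of `pi_hat` is an out-of-fold prediction, which is what allows flexible learners to be used without overfitting bias in the final estimate.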

target

one of c("generalize", "transport", "insample"): the estimand to target. "generalize" generalizes (quasi)experimental estimates from the complete data (S == 1) to the overall sample (S == 0 or S == 1). "transport" transports estimates from the S == 1 sample to the S == 0 sample. "insample" estimates causal effects in the S == 1 sample (i.e. conventional quasi/experimental estimation).

estimator

one of c("AISW", "ISW", "OM", "CW", "ACW"). The default is the augmented inverse selection weighting estimator, which augments the inverse selection weighting estimator (ISW) with an outcome model (OM). ACW does the same with calibration weights (CW), a set of entropy balancing weights that reweight the sample to match target sample moments.
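To make the calibration-weights idea concrete, here is a minimal base-R sketch of entropy balancing solved through its dual problem. It is an illustration under assumptions, not the routine ateGT uses: the target moments are made up, only first moments are matched, and the function names are hypothetical.

```r
set.seed(4)
n      <- 500
X      <- cbind(x1 = rnorm(n), x2 = rbinom(n, 1, 0.4))
target <- c(x1 = 0.3, x2 = 0.5)      # hypothetical target-sample means

Xc <- sweep(X, 2, target)            # center covariates at the target means

# Dual objective: log-sum-exp of the linear index in centered covariates
dual <- function(lambda) log(sum(exp(Xc %*% lambda)))

# Its gradient is the weighted mean of the centered covariates
dual_grad <- function(lambda) {
  w <- exp(Xc %*% lambda)
  w <- w / sum(w)
  as.vector(crossprod(Xc, w))
}

opt <- optim(c(0, 0), dual, dual_grad, method = "BFGS",
             control = list(reltol = 1e-12))

# Entropy balancing weights, normalized to sum to 1
w <- as.vector(exp(Xc %*% opt$par))
w <- w / sum(w)

balanced_means <- colSums(w * X)     # weighted means, close to `target`
```

At the optimum the gradient is zero, so the weighted covariate means equal the target moments while the weights stay positive.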

hajekize

boolean for whether to divide the inverse probability weights term for each treatment level by the sum of weights in that treatment level. This guards against instability from very large weights from extremely small selection or propensity scores.
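The normalization can be sketched for a simple weighted mean in base R. This is an illustration with simulated data and hypothetical names. In ateGT the same division is described as applying to the weighting term of each treatment level.

```r
set.seed(3)
n   <- 2000
X   <- rnorm(n)
pi1 <- plogis(2 * X)                 # some propensities are close to 0
A   <- rbinom(n, 1, pi1)
Y   <- 1 + X + rnorm(n)

w1 <- A / pi1                        # inverse probability weights for a = 1

# Unnormalized (Horvitz-Thompson) form: divides by n
ht_mean <- mean(w1 * Y)

# Hajek form: divides by the sum of the weights in the treatment level
w1_norm    <- w1 / sum(w1)
hajek_mean <- sum(w1_norm * Y)
```

Because the normalized weights sum to 1, the Hajek estimate is a weighted average of observed outcomes and cannot leave their range, which is what limits the damage from a few very large weights.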

separateMus

boolean for whether to fit separate outcome models for each treatment group or a single pooled model. The former is recommended and is the default, but a pooled model may be fit when data are scarce or computation is burdensome.

glmnet_lamchoice

choice of lambda (shrinkage parameter) for regularized linear regressions. Only relevant when nuisMod == "rlm".

glmnet_alpha

in [0, 1], choice of alpha in glmnet. 1 (default) corresponds with L1 regularization (LASSO) and 0 corresponds with L2 regularization (ridge), while intermediate values correspond with a mix of the two (elastic net)

glmnet_rho_family

GLM family for selection model. "binomial" by default but can be safely switched to "gaussian" for linear probability models with discrete covariates for faster compute

glmnet_pi_family

GLM family for propensity model. "binomial" by default but can be safely switched to "gaussian" for linear probability models with discrete covariates for faster compute

glmnet_mu_family

GLM family for outcome model. "gaussian" by default.

glmnet_parl

boolean for parallelization in glmnet. A parallel backend must be registered beforehand.

grf_tuneRf

Tune rf hyperparameters? Passed to grf's regression forest. Use 'all' for hyperparameter tuning.

noi

boolean for printing marginal means and causal contrasts table (it gets returned anyway). Off by default.

Value

list containing treatment effects table, nuisance function estimates, and influence function values

References

Bia, M., M. Huber, and L. Lafférs. (2020): “Double Machine Learning for Sample Selection Models,” arXiv [econ.EM].

Dahabreh, I. J., S. E. Robertson, E. J. Tchetgen, E. A. Stuart, and M. A. Hernán. (2019): “Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals,” Biometrics, 75, 685–94.

Hirshberg, D. A., A. Maleki, and J. R. Zubizarreta. (2019): “Minimax Linear Estimation of the Retargeted Mean,” arXiv [math.ST].

Kallus, N., and X. Mao. (2020): “On the Role of Surrogates in the Efficient Estimation of Treatment Effects with Limited Outcome Data,” arXiv [stat.ML].

Examples
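A possible usage sketch with simulated data follows. It uses only arguments listed under Usage, and the data-generating process and variable values are made up for illustration. The call is guarded so that the script also runs when the package providing ateGT is not loaded, and the structure of the returned list is inspected, not assumed.

```r
set.seed(42)
n <- 1000
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
s <- rbinom(n, 1, plogis(0.5 + X[, 1]))   # 1 = outcome observed
a <- rbinom(n, 1, 0.5)                    # randomized binary treatment
y <- 1 + a + X[, 1] + rnorm(n)
y[s == 0] <- NA                           # missing outcomes correspond to s = 0

# Assumption: the package that provides ateGT has already been loaded
if (exists("ateGT", mode = "function")) {
  fit <- ateGT(y = y, a = a, X = X, s = s,
               nuisMod = "rlm", target = "generalize", estimator = "AISW")
  str(fit, max.level = 1)                 # inspect the returned list
}
```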