Omnibus function for ATE estimation for generalization and transportation

Augmented IPW, generalization, or transportation estimator with ML and cross-fitting for nuisance functions. Imputes the counterfactual outcome for each observation i under each treatment a as $$Y^a = \omega \frac{\mathbf{1}\{A = a\}}{\pi^a (X)} (Y - \mu^a(X)) + \mu^a(X) $$ where $\pi^a(X)$ is the propensity score, $\mu^a(X)$ is the outcome model, and $\omega$ is 1 for all observations under no sample selection, in which case this is the doubly robust Augmented Inverse Propensity Weighting (AIPW) estimator. When s is supplied, the 'target' argument determines whether the generalization or the transportation estimator is fit; these correspond to $\omega = S/\rho(X)$ and $\omega = S (1-\rho(X)) /\rho(X)$ respectively, where $\rho(X)$ is the selection model. When a surrogate vector $Z$ is supplied, an additional residual piece $a /\pi(X)(\hat{\nu}(X, Z) - \hat{\mu}(X))$ is added to the influence function. Average treatment effects are defined as averages of differences between counterfactual outcomes $Y^a - Y^{a'}$.
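The imputation formula above can be illustrated in base R for the simplest case, $\omega = 1$ (no sample selection). This is only a sketch under stated assumptions: the data are simulated, and the true propensity score and outcome model are plugged in directly, whereas the function itself estimates these nuisance functions with cross-fitting. The names `pseudo`, `pi1`, and `mu` are hypothetical and not part of the package.

```r
# Sketch of Y^a = omega * 1{A = a} / pi^a(X) * (Y - mu^a(X)) + mu^a(X)
# with omega = 1. Simulated data; true nuisance functions used for illustration.
set.seed(1)
n   <- 5000
X   <- rnorm(n)
pi1 <- plogis(0.5 * X)               # P(A = 1 | X), known by construction
A   <- rbinom(n, 1, pi1)
mu  <- function(a, x) 1 + 2 * a + x  # E[Y | A = a, X = x]; true ATE is 2
Y   <- mu(A, X) + rnorm(n)

# Imputed counterfactual outcome for treatment level a
pseudo <- function(a, omega = 1) {
  pia <- if (a == 1) pi1 else 1 - pi1
  omega * (A == a) / pia * (Y - mu(a, X)) + mu(a, X)
}

psi     <- pseudo(1) - pseudo(0)     # per-observation contrast Y^1 - Y^0
ate_hat <- mean(psi)                 # average treatment effect estimate
se_hat  <- sd(psi) / sqrt(n)         # standard error from the contrast values
```

Passing a vector such as `S / rho` as `omega` would give the generalization weighting described above, assuming `S` and an estimate `rho` of the selection model are available.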

Usage

ateGT(
  y,
  a,
  X,
  s = NULL,
  treatProb = NULL,
  Z = NULL,
  nuisMod = c("rlm", "rf"),
  target = c("generalize", "transport", "insample"),
  estimator = c("AISW", "ISW", "OM", "CW", "ACW"),
  hajekize = FALSE,
  separateMus = TRUE,
  glmnet_lamchoice = "lambda.min",
  glmnet_alpha = 1,
  glmnet_rho_family = "binomial",
  glmnet_pi_family = "binomial",
  glmnet_mu_family = "gaussian",
  glmnet_parl = FALSE,
  grf_tuneRf = "none",
  noi = FALSE
)

Arguments

y: outcome vector (may contain missing values; missing values must correspond to s = 0)
a: treatment vector (no missing values; can be relaxed with some tinkering)
X: covariate matrix (no missing values)
s: selection vector, NULL by default (no missing values; 1 corresponds to nonmissing y, 0 corresponds to missing y). May be omitted when the target is "insample".
treatProb: propensity score vector (of length n_treatment) or matrix (n_treatment x n_obs), where the latter is for covariate-adaptive designs; entries must sum to 1 across treatment levels. NULL by default, in which case the propensity score is fitted. When provided, no propensity score is fit. With discrete covariates, an estimated propensity score is advisable even if treatment was randomized.
Z: surrogate matrix, NULL by default (no missing values). When supplied, the surrogate influence function (Kallus and Mao 2020) is used to compute treatment effects.
nuisMod: one of c("rlm", "rf"): how to fit the (cross-fit) nuisance functions, either with regularized linear models via glmnet ("rlm") or with grf regression forests ("rf").
target: one of c("generalize", "transport", "insample") estimand to target. "generalize" generalizes (quasi)experimental estimates from the complete data (S == 1) to the overall sample (S == 0 or S == 1). "transport" transports estimates from the S == 1 sample to the S == 0 sample. "insample" estimates causal effects in the S == 1 sample (i.e. conventional quasi/experimental estimation).
estimator: one of c("AISW", "ISW", "OM", "CW", "ACW"). The default is the augmented inverse selection weighting estimator, which augments the inverse selection weighting estimator (ISW) with an outcome model (OM). ACW does the same with calibration weights (CW), which fit a set of entropy balancing weights that reweights the sample to match target sample moments.
hajekize: boolean for whether to divide the inverse probability weights term for each treatment level by the sum of weights in that treatment level. This guards against instability from very large weights from extremely small selection or propensity scores.
separateMus: boolean for whether to fit separate outcome models for each treatment group or a single pooled model. The former is recommended and is the default, but a pooled model may be fit when data are scarce or computation is burdensome.
glmnet_lamchoice: choice of lambda (shrinkage parameter) for regularized linear regressions. Only relevant when nuisMod == "rlm".
glmnet_alpha: in [0, 1], choice of alpha in glmnet. 1 (default) corresponds with L1 regularization (LASSO) and 0 corresponds with L2 regularization (ridge), while intermediate values correspond with a mix of the two (elastic net)
glmnet_rho_family: GLM family for selection model. "binomial" by default but can be safely switched to "gaussian" for linear probability models with discrete covariates for faster compute
glmnet_pi_family: GLM family for propensity model. "binomial" by default but can be safely switched to "gaussian" for linear probability models with discrete covariates for faster compute
glmnet_mu_family: GLM family for outcome model. "gaussian" by default.
glmnet_parl: boolean for parallelization in glmnet. A parallel backend must be registered beforehand.
grf_tuneRf: Tune rf hyperparameters? Passed to grf's regression forest. Use 'all' for hyperparameter tuning.
noi: boolean for printing marginal means and causal contrasts table (it gets returned anyway). Off by default.
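
The effect of the hajekize option can be sketched in base R for a plain inverse probability weighted mean at a single treatment level. This is an illustration under assumptions, not the package's internal code: the data are simulated, the true propensity score is used, and the augmentation term is left out. The names `ht_mean` and `hajek_mean` are hypothetical.

```r
# Sketch of Hajek (self-normalized) weighting for treatment level a = 1.
# Simulated data with some propensity scores very close to 0.
set.seed(2)
n  <- 2000
X  <- rnorm(n)
ps <- plogis(2 * X)                 # P(A = 1 | X), known by construction
A  <- rbinom(n, 1, ps)
Y  <- 1 + X + rnorm(n)              # no treatment effect; E[Y^1] = 1

w <- A / ps                         # inverse probability weights for a = 1

ht_mean    <- sum(w * Y) / n        # unnormalized: divide by sample size
hajek_mean <- sum(w * Y) / sum(w)   # Hajek: divide by the sum of weights
```

Because the normalized weights sum to one, `hajek_mean` is a weighted average of the observed treated outcomes and cannot leave their range, which is what limits the damage from a few very large weights.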

Value

list containing the treatment effects table, nuisance function estimates, and influence function values

References

Bia, M., M. Huber, and L. Lafférs. (2020): “Double Machine Learning for Sample Selection Models,” arXiv [econ.EM].

Dahabreh, I. J., S. E. Robertson, E. J. Tchetgen, E. A. Stuart, and M. A. Hernán. (2019): “Generalizing causal inferences from individuals in randomized trials to all trial-eligible individuals,” Biometrics, 75, 685–94.

Hirshberg, D. A., A. Maleki, and J. R. Zubizarreta. (2019): “Minimax Linear Estimation of the Retargeted Mean,” arXiv [math.ST].

Kallus, N., and X. Mao. (2020): “On the Role of Surrogates in the Efficient Estimation of Treatment Effects with Limited Outcome Data,” arXiv [stat.ML].
