
TwoSLS Example

This page shows the linear IV / 2SLS path used by crabbymetrics.TwoSLS, first with one endogenous regressor and then with several.

1 Imports

import numpy as np
from pprint import pprint

from crabbymetrics import TwoSLS

np.set_printoptions(precision=4, suppress=True)

2 Algebra

Write the structural equation as

\[ y = X \beta + u, \]

where

\[ X = \begin{bmatrix} \mathbf{1} & X_{\mathrm{endog}} & X_{\mathrm{exog}} \end{bmatrix}, \qquad Z_{\mathrm{full}} = \begin{bmatrix} \mathbf{1} & X_{\mathrm{exog}} & Z_{\mathrm{excluded}} \end{bmatrix}. \]

The textbook 2SLS estimator is

\[ \hat{\beta}_{\mathrm{2SLS}} = \left(X^\top P_Z X\right)^{-1} X^\top P_Z y, \]

with

\[ P_Z = Z_{\mathrm{full}} \left(Z_{\mathrm{full}}^\top Z_{\mathrm{full}}\right)^{-1} Z_{\mathrm{full}}^\top. \]
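As a quick numerical check of the projector's defining properties (the shapes here are arbitrary illustration, not tied to the examples below), \(P_Z\) is symmetric, idempotent, and has trace equal to the number of instrument columns:

```python
import numpy as np

rng = np.random.default_rng(0)
z_full = rng.normal(size=(50, 5))  # illustrative 50 x 5 instrument matrix

# P_Z = Z (Z'Z)^{-1} Z' -- the orthogonal projector onto the column space of Z
pz = z_full @ np.linalg.inv(z_full.T @ z_full) @ z_full.T

print(np.allclose(pz, pz.T))                       # True: symmetric
print(np.allclose(pz @ pz, pz))                    # True: idempotent
print(np.isclose(np.trace(pz), z_full.shape[1]))   # True: trace equals rank
```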

That formula is correct, but the implementation does not form the projector explicitly for the point estimate. Instead it uses the equivalent two-stage least-squares solve:

  1. solve the first-stage least-squares problem

    \[ \hat{\Pi} = \arg\min_{\Pi} \left\lVert Z_{\mathrm{full}} \Pi - X_{\mathrm{endog}} \right\rVert_F^2 \]

  2. build fitted endogenous regressors

    \[ \hat{X}_{\mathrm{endog}} = Z_{\mathrm{full}} \hat{\Pi} \]

  3. form the stage-two design

    \[ \hat{X} = \begin{bmatrix} \mathbf{1} & \hat{X}_{\mathrm{endog}} & X_{\mathrm{exog}} \end{bmatrix} \]

  4. solve the second-stage least-squares problem

    \[ \hat{\theta} = \arg\min_{\theta} \left\lVert \hat{X} \theta - y \right\rVert_2^2 \]

In the Rust code, both least-squares problems are solved with a QR-based routine. The Python reference code below uses np.linalg.lstsq in the same role; NumPy's routine is SVD-based under the hood, but on full-rank problems the two solves agree up to numerical precision.
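For intuition about what "QR-based" means here, this is a minimal sketch of the primitive (it illustrates the idea only, not the package's Rust internals): factor the design as \(A = QR\) with \(R\) upper triangular, so the least-squares solution reduces to the triangular system \(R x = Q^\top b\).

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=(30, 4))  # illustrative full-rank design
b = rng.normal(size=30)

# Thin QR: A = QR, Q orthonormal (30 x 4), R upper triangular (4 x 4).
# The least-squares problem min ||A x - b|| then reduces to R x = Q'b.
q, r = np.linalg.qr(a)
x_qr = np.linalg.solve(r, q.T @ b)

# Same solution from NumPy's general least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(a, b, rcond=None)
print(np.allclose(x_qr, x_lstsq))  # True: the two paths agree
```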

3 Helpers

def add_intercept(x):
    return np.column_stack([np.ones(x.shape[0]), x])


def build_designs(x_endog, x_exog, z_excluded):
    x_rhs = np.column_stack([x_endog, x_exog]) if x_exog.shape[1] > 0 else x_endog
    z_rhs = (
        np.column_stack([x_exog, z_excluded]) if x_exog.shape[1] > 0 else z_excluded
    )
    return add_intercept(x_rhs), add_intercept(z_rhs)


def twosls_via_lstsq(x_endog, x_exog, z_excluded, y):
    # Only the instrument-side design is needed here; the second-stage design
    # is rebuilt below from the fitted endogenous regressors.
    _, z_design = build_designs(x_endog, x_exog, z_excluded)
    pi_hat, *_ = np.linalg.lstsq(z_design, x_endog, rcond=None)
    x_endog_hat = z_design @ pi_hat
    x_hat_rhs = (
        np.column_stack([x_endog_hat, x_exog]) if x_exog.shape[1] > 0 else x_endog_hat
    )
    x_hat_design = add_intercept(x_hat_rhs)
    theta_hat, *_ = np.linalg.lstsq(x_hat_design, y, rcond=None)
    return theta_hat


def twosls_via_projection_formula(x_endog, x_exog, z_excluded, y):
    x_design, z_design = build_designs(x_endog, x_exog, z_excluded)
    ztz_inv = np.linalg.inv(z_design.T @ z_design)
    pz = z_design @ ztz_inv @ z_design.T
    return np.linalg.solve(x_design.T @ pz @ x_design, x_design.T @ pz @ y)

4 Single Endogenous Regressor

This is the usual overidentified IV setup: one endogenous regressor, one exogenous regressor, and three excluded instruments.

rng = np.random.default_rng(5)
n = 1200

intercept = 0.5
beta_endog = np.array([1.4])
beta_exog = np.array([-0.8])

z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
eps = rng.normal(size=n)

x_endog = z @ np.array([[0.9], [0.45], [-0.3]]) + 0.3 * x_exog + v
u = 0.65 * v[:, 0] + 0.25 * eps
y = intercept + x_endog[:, 0] * beta_endog[0] + x_exog[:, 0] * beta_exog[0] + u

model = TwoSLS()
model.fit(x_endog, x_exog, z, y)
summary = model.summary()

theta_lstsq = twosls_via_lstsq(x_endog, x_exog, z, y)
theta_formula = twosls_via_projection_formula(x_endog, x_exog, z, y)

print("true parameters:", np.concatenate([[intercept], beta_endog, beta_exog]))
print("manual QR/lstsq path:", np.round(theta_lstsq, 4))
print("projection formula:", np.round(theta_formula, 4))
pprint(summary)

np.testing.assert_allclose(theta_lstsq, theta_formula, atol=1e-8, rtol=1e-8)
np.testing.assert_allclose(
    theta_lstsq,
    np.concatenate([[summary["intercept"]], np.asarray(summary["coef"])]),
    atol=1e-8,
    rtol=1e-8,
)
true parameters: [ 0.5  1.4 -0.8]
manual QR/lstsq path: [ 0.486   1.3787 -0.7785]
projection formula: [ 0.486   1.3787 -0.7785]
{'coef': array([ 1.3787, -0.7785]),
 'coef_se': array([0.0184, 0.0203]),
 'intercept': 0.48595950998190096,
 'intercept_se': 0.020869105726918507,
 'vcov_type': 'hc1'}

5 Robust Covariance Options

Point estimation stays closed form, but the inference surface is shared with OLS, Ridge, and FixedEffectsOLS.

clusters = np.repeat(np.arange(40, dtype=np.int64), n // 40)

vanilla = model.summary(vcov="vanilla")
hac = model.summary(vcov="newey_west", lags=4)
cluster = model.summary(vcov="cluster", clusters=clusters)

print("vanilla SE:", np.round(vanilla["coef_se"], 4))
print("Newey-West SE:", np.round(hac["coef_se"], 4))
print("cluster SE:", np.round(cluster["coef_se"], 4))
vanilla SE: [0.02   0.0215]
Newey-West SE: [0.0186 0.0196]
cluster SE: [0.0173 0.0206]
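The standard errors above come from the package. As a hedged sketch of what a textbook HC1 sandwich for 2SLS looks like (this reproduces the standard formula on fresh synthetic data; it is not a dump of the package's internals, and all names are local to the example): the bread uses the projected design \(\hat{X}\), while the residuals are structural, \(\hat{u} = y - X\hat{\theta}\), computed from the original regressors.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
z = rng.normal(size=(n, 2))
v = rng.normal(size=n)
x_endog = z @ np.array([0.8, 0.5]) + v          # endogenous via shared v
y = 1.0 + 1.5 * x_endog + 0.7 * v + rng.normal(size=n)

z_design = np.column_stack([np.ones(n), z])
x_design = np.column_stack([np.ones(n), x_endog])

# Stage one: project the endogenous regressor onto the instruments.
pi_hat, *_ = np.linalg.lstsq(z_design, x_endog, rcond=None)
x_hat = np.column_stack([np.ones(n), z_design @ pi_hat])

# Stage two: point estimate from the projected design.
theta, *_ = np.linalg.lstsq(x_hat, y, rcond=None)

# HC1 sandwich: bread from X_hat, meat from the STRUCTURAL residuals
# y - X theta (original regressors, not the projected ones).
resid = y - x_design @ theta
bread = np.linalg.inv(x_hat.T @ x_hat)
meat = x_hat.T @ (x_hat * (resid**2)[:, None])
k = x_hat.shape[1]
vcov = n / (n - k) * bread @ meat @ bread
se = np.sqrt(np.diag(vcov))
print(se.shape)  # one standard error per parameter, intercept first
```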

6 Multiple Endogenous Regressors

Now the first stage solves for a whole coefficient matrix \(\hat{\Pi}\), one column for each endogenous regressor. The second stage is still just least squares on

\[ \hat{X} = \begin{bmatrix} \mathbf{1} & \hat{X}_{\mathrm{endog}} & X_{\mathrm{exog}} \end{bmatrix}. \]

rng = np.random.default_rng(11)
n = 1400

intercept = -0.2
beta_endog = np.array([1.15, -0.9])
beta_exog = np.array([0.55])

z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 2))
eps = rng.normal(size=n)
pi = np.array(
    [
        [0.9, 0.15],
        [0.45, -0.2],
        [-0.3, 0.85],
    ]
)

x_endog = z @ pi + 0.25 * x_exog + v
u = 0.6 * v[:, 0] - 0.45 * v[:, 1] + 0.2 * eps
y = intercept + x_endog @ beta_endog + x_exog[:, 0] * beta_exog[0] + u

model = TwoSLS()
model.fit(x_endog, x_exog, z, y)
summary = model.summary()

theta_lstsq = twosls_via_lstsq(x_endog, x_exog, z, y)
theta_formula = twosls_via_projection_formula(x_endog, x_exog, z, y)

print("true parameters:", np.concatenate([[intercept], beta_endog, beta_exog]))
print("manual QR/lstsq path:", np.round(theta_lstsq, 4))
print("projection formula:", np.round(theta_formula, 4))
pprint(summary)

np.testing.assert_allclose(theta_lstsq, theta_formula, atol=1e-8, rtol=1e-8)
np.testing.assert_allclose(
    theta_lstsq,
    np.concatenate([[summary["intercept"]], np.asarray(summary["coef"])]),
    atol=1e-8,
    rtol=1e-8,
)
true parameters: [-0.2   1.15 -0.9   0.55]
manual QR/lstsq path: [-0.1713  1.138  -0.9109  0.5623]
projection formula: [-0.1713  1.138  -0.9109  0.5623]
{'coef': array([ 1.138 , -0.9109,  0.5623]),
 'coef_se': array([0.0208, 0.0227, 0.0214]),
 'intercept': -0.1712948159197605,
 'intercept_se': 0.02092673254202305,
 'vcov_type': 'hc1'}

7 Takeaway

  • The estimator in the package is still the standard 2SLS estimator.
  • The projector formula and the staged least-squares solve are algebraically equivalent.
  • The implementation uses the staged solve because a QR-based least-squares routine is better conditioned than explicitly forming and inverting \(Z_{\mathrm{full}}^\top Z_{\mathrm{full}}\) and \(X^\top P_Z X\): forming the normal equations squares the condition number of the problem.
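The conditioning point can be illustrated on a hypothetical near-collinear design (the names here are purely illustrative): since the singular values of \(A^\top A\) are the squares of those of \(A\), the normal equations have condition number \(\kappa(A)^2\).

```python
import numpy as np

rng = np.random.default_rng(2)
# An ill-conditioned design: two nearly collinear columns.
a = rng.normal(size=(200, 2))
a[:, 1] = a[:, 0] + 1e-6 * rng.normal(size=200)

cond_a = np.linalg.cond(a)
cond_normal = np.linalg.cond(a.T @ a)
# Forming A'A squares the condition number, so this ratio is close to 1.
print(cond_normal / cond_a**2)
```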