crabbymetrics

OLS Example

This page mirrors examples/ols_example.py.

1 Fit A Basic Linear Model

import numpy as np
from pprint import pprint

from crabbymetrics import OLS

np.set_printoptions(precision=4, suppress=True)
rng = np.random.default_rng(0)
n = 500
k = 3
beta = np.array([1.5, -2.0, 0.5])
intercept = 0.7

x = rng.normal(size=(n, k))
y = intercept + x @ beta + rng.normal(scale=0.5, size=n)

model = OLS()
model.fit(x, y)

print("true intercept:", intercept)
print("true coef:", beta)
pprint(model.summary())

Output:

true intercept: 0.7
true coef: [ 1.5 -2.   0.5]
{'coef': array([ 1.5206, -1.9933,  0.5314]),
 'coef_se': array([0.0224, 0.0249, 0.0223]),
 'intercept': 0.668688281762462,
 'intercept_se': 0.023367266952062406,
 'vcov_type': 'hc1'}
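As a sanity check, the point estimates above can be reproduced with NumPy alone. This sketch regenerates the same simulated data (same seed and draw order) and solves ordinary least squares directly via np.linalg.lstsq, independent of crabbymetrics:

```python
import numpy as np

# Reproduce the simulated data from above (same seed, so identical draws).
rng = np.random.default_rng(0)
n, k = 500, 3
beta = np.array([1.5, -2.0, 0.5])
intercept = 0.7
x = rng.normal(size=(n, k))
y = intercept + x @ beta + rng.normal(scale=0.5, size=n)

# OLS with an explicit intercept column, solved by least squares.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("intercept:", np.round(coef[0], 4))
print("slopes:", np.round(coef[1:], 4))
```

The printed values should line up with the 'intercept' and 'coef' entries reported by model.summary() above, since both solve the same least-squares problem on the same data.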

2 Robust Covariance Options

OLS.summary() shares the same covariance interface as the other linear estimators:

  • vcov="vanilla" for homoskedastic standard errors
  • vcov="hc1" for heteroskedasticity-robust Eicker-Huber-White standard errors
  • vcov="newey_west" with a lag choice (the lags keyword) for HAC inference
  • vcov="cluster" with one-way cluster labels

clusters = np.repeat(np.arange(25, dtype=np.int64), n // 25)

vanilla = model.summary(vcov="vanilla")
hac = model.summary(vcov="newey_west", lags=4)
cluster = model.summary(vcov="cluster", clusters=clusters)

print("vanilla SE:", np.round(vanilla["coef_se"], 4))
print("Newey-West SE:", np.round(hac["coef_se"], 4))
print("cluster SE:", np.round(cluster["coef_se"], 4))

Output:

vanilla SE: [0.0236 0.0242 0.0227]
Newey-West SE: [0.0229 0.0245 0.02  ]
cluster SE: [0.0266 0.0214 0.0177]
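For intuition about what vcov="hc1" reports, here is a minimal NumPy sketch of the textbook HC1 sandwich estimator, (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1} scaled by n/(n - p). This illustrates the standard formula under the same simulated data as above; it is not taken from crabbymetrics internals:

```python
import numpy as np

# Reproduce the simulated data from above (same seed, so identical draws).
rng = np.random.default_rng(0)
n, k = 500, 3
beta = np.array([1.5, -2.0, 0.5])
x = rng.normal(size=(n, k))
y = 0.7 + x @ beta + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])
p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
coef = XtX_inv @ X.T @ y
resid = y - X @ coef

# "Meat" of the sandwich: sum_i e_i^2 x_i x_i', via broadcasting.
meat = X.T @ (X * resid[:, None] ** 2)
# HC1 applies the small-sample correction n / (n - p).
vcov_hc1 = n / (n - p) * XtX_inv @ meat @ XtX_inv
se_hc1 = np.sqrt(np.diag(vcov_hc1))

print("HC1 SE (intercept first):", np.round(se_hc1, 4))
```

The vanilla (homoskedastic) errors replace the meat with sigma_hat^2 X'X; Newey-West and cluster covariances change only how the meat is assembled, which is why they share one interface.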