crabbymetrics

OLS Example

This page mirrors examples/ols_example.py.

1 Fit A Basic Linear Model

import numpy as np
from pprint import pprint

from crabbymetrics import OLS

np.set_printoptions(precision=4, suppress=True)
rng = np.random.default_rng(0)
n = 500
k = 3
beta = np.array([1.5, -2.0, 0.5])
intercept = 0.7

x = rng.normal(size=(n, k))
y = intercept + x @ beta + rng.normal(scale=0.5, size=n)

model = OLS()
model.fit(x, y)

print("true intercept:", intercept)
print("true coef:", beta)
pprint(model.summary())

Output:

true intercept: 0.7
true coef: [ 1.5 -2.   0.5]
{'coef': array([ 1.5206, -1.9933,  0.5314]),
 'coef_se': array([0.0224, 0.0249, 0.0223]),
 'intercept': 0.668688281762462,
 'intercept_se': 0.023367266952062406,
 'vcov_type': 'hc1'}
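As a sanity check, the point estimates above can be reproduced with NumPy alone. This sketch regenerates the same simulated data (same seed and draw order) and solves ordinary least squares directly via np.linalg.lstsq, independent of crabbymetrics:

```python
import numpy as np

# Reproduce the simulated data from above (same seed, so identical draws).
rng = np.random.default_rng(0)
n, k = 500, 3
beta = np.array([1.5, -2.0, 0.5])
intercept = 0.7
x = rng.normal(size=(n, k))
y = intercept + x @ beta + rng.normal(scale=0.5, size=n)

# OLS with an explicit intercept column, solved by least squares.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

print("intercept:", np.round(coef[0], 4))
print("slopes:", np.round(coef[1:], 4))
```

The printed values should line up with the 'intercept' and 'coef' entries reported by model.summary() above, since both solve the same least-squares problem on the same data.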

2 Robust Covariance Options

OLS.summary() shares the same covariance interface as the other linear estimators:

  • vcov="vanilla" for homoskedastic standard errors
  • vcov="hc1" for heteroskedasticity-robust Eicker-Huber-White standard errors
  • vcov="newey_west" with a lag choice (the lags keyword) for HAC inference
  • vcov="cluster" with one-way cluster labels

clusters = np.repeat(np.arange(25, dtype=np.int64), n // 25)

vanilla = model.summary(vcov="vanilla")
hac = model.summary(vcov="newey_west", lags=4)
cluster = model.summary(vcov="cluster", clusters=clusters)

print("vanilla SE:", np.round(vanilla["coef_se"], 4))
print("Newey-West SE:", np.round(hac["coef_se"], 4))
print("cluster SE:", np.round(cluster["coef_se"], 4))

Output:

vanilla SE: [0.0236 0.0242 0.0227]
Newey-West SE: [0.0229 0.0245 0.02  ]
cluster SE: [0.0266 0.0214 0.0177]
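For intuition about what vcov="hc1" reports, here is a minimal NumPy sketch of the textbook HC1 sandwich estimator, (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1} scaled by n/(n - p). This illustrates the standard formula under the same simulated data as above; it is not taken from crabbymetrics internals:

```python
import numpy as np

# Reproduce the simulated data from above (same seed, so identical draws).
rng = np.random.default_rng(0)
n, k = 500, 3
beta = np.array([1.5, -2.0, 0.5])
x = rng.normal(size=(n, k))
y = 0.7 + x @ beta + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])
p = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
coef = XtX_inv @ X.T @ y
resid = y - X @ coef

# "Meat" of the sandwich: sum_i e_i^2 x_i x_i', via broadcasting.
meat = X.T @ (X * resid[:, None] ** 2)
# HC1 applies the small-sample correction n / (n - p).
vcov_hc1 = n / (n - p) * XtX_inv @ meat @ XtX_inv
se_hc1 = np.sqrt(np.diag(vcov_hc1))

print("HC1 SE (intercept first):", np.round(se_hc1, 4))
```

The vanilla (homoskedastic) errors replace the meat with sigma_hat^2 X'X; Newey-West and cluster covariances change only how the meat is assembled, which is why they share one interface.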