crabbymetrics API Reference

Library sitrep and verified Python surface

1 Overview

crabbymetrics is a compact Rust-backed econometrics library exposed to Python through pyo3 and packaged with maturin. The public API is intentionally small: estimator classes, transformer classes, an Optimizers helper namespace, NumPy as the only runtime dependency, and a scikit-adjacent fit / predict / summary pattern.

The page below combines two things:

  • A sitrep on the current state of the library.
  • A verified API reference rendered against the live Python module in this repository.
import inspect
from html import escape

import numpy as np
import crabbymetrics as cm
from IPython.display import HTML, display


def html_table(headers, rows, table_attrs=""):
    parts = [
        f"<table {table_attrs}".strip() + ">",
        "<thead>",
        "<tr>",
        *[f"<th>{escape(str(header))}</th>" for header in headers],
        "</tr>",
        "</thead>",
        "<tbody>",
    ]
    for row in rows:
        parts.append("<tr>")
        for cell in row:
            parts.append(f"<td>{cell}</td>")
        parts.append("</tr>")
    parts.extend(["</tbody>", "</table>"])
    return "".join(parts)

2 Sitrep

  • The implementation is concentrated in two Rust files: src/lib.rs registers the Python classes and src/estimators.rs holds nearly all estimator logic.
  • Shared linear algebra, covariance, bootstrap, and NumPy conversion helpers live in src/utils.rs.
  • Packaging is lean. The Python package declares only numpy at runtime, while native builds are handled by maturin.
  • Release automation already exists in .github/workflows/wheels.yml for Linux and macOS wheels across Python 3.10 to 3.14.
  • Coverage is broad for a small codebase: OLS, Ridge, fixed-effects OLS, synthetic control, balancing weights, EPLM, average-derivative estimators, partially linear DML, AIPW, ElasticNet, binary and multinomial logit, Poisson, TwoSLS, FTRL, a first-class callback-driven GMM, a lower-level MEstimator, plus PCA and KernelBasis as design-matrix preprocessors.
  • The module also exposes an Optimizers class with static methods for LBFGS, BFGS, nonlinear conjugate gradient, Gauss-Newton least squares, and simulated annealing.
  • The main maturity gaps are still API breadth and test depth. The public contract is now documented, but the estimator surface is intentionally narrow.
  • The public API is consistent, but not fully uniform. summary() always returns a plain Python dict, yet the exact keys vary by estimator.
  • TwoSLS now uses the closed-form linear IV / 2SLS estimator and accepts matrix-valued endogenous regressors and excluded instruments.
  • GMM now sits between the built-in estimators and MEstimator: users supply per-observation moments, optionally a sample Jacobian, and the class handles just-identified versus two-step fitting plus summary-time covariance choices.
  • MEstimator is still the lowest-level likelihood-style interface. Users are responsible for gradients, per-observation score matrices, and numerical safeguards inside their objective functions.
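The closed-form linear IV / 2SLS estimator mentioned above amounts to projecting the structural design onto the instrument space and running OLS on the projection. A minimal NumPy sketch of that computation (illustrative only, with hypothetical variable names; not the library's Rust internals):

```python
import numpy as np

def two_sls(x_endog, x_exog, z, y):
    """Closed-form 2SLS: regress y on [1, x_endog, x_exog] using
    instruments [1, z, x_exog]."""
    n = y.shape[0]
    ones = np.ones((n, 1))
    X = np.column_stack([ones, x_endog, x_exog])  # structural design
    Z = np.column_stack([ones, z, x_exog])        # instrument matrix
    # First stage: project X onto the column space of Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: OLS of y on the projected design.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=(n, 1))
x_exog = rng.normal(size=(n, 1))
u = rng.normal(size=n)                       # unobserved confounder
x_endog = z[:, 0] + u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x_endog - 1.0 * x_exog[:, 0] + u + rng.normal(size=n)
beta = two_sls(x_endog.reshape(-1, 1), x_exog, z, y)
# beta stacks [intercept, endogenous coef, exogenous coef], near [1, 2, -1]
```

OLS on this data would be biased through the confounder u; the instrument recovers the structural coefficient.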

3 Public Surface

The table below is generated from the live module so the constructor and method signatures match the current build.

classes = [
    cm.OLS,
    cm.Ridge,
    cm.FixedEffectsOLS,
    cm.SyntheticControl,
    cm.BalancingWeights,
    cm.EPLM,
    cm.AverageDerivative,
    cm.PartiallyLinearDML,
    cm.AIPW,
    cm.ElasticNet,
    cm.Logit,
    cm.MultinomialLogit,
    cm.Poisson,
    cm.TwoSLS,
    cm.GMM,
    cm.FTRL,
    cm.MEstimator,
    cm.PCA,
    cm.KernelBasis,
    cm.Optimizers,
]

rows = []
for cls in classes:
    methods = []
    for method_name in sorted(name for name in dir(cls) if not name.startswith("_")):
        method = getattr(cls, method_name)
        if not callable(method):
            continue
        methods.append(
            f"<code>{escape(method_name)}{escape(str(inspect.signature(method)))}</code>"
        )
    rows.append(
        [
            f"<code>{escape(cls.__name__)}{escape(str(inspect.signature(cls)))}</code>",
            "<br>".join(methods),
        ]
    )

display(HTML(html_table(["Constructor", "Methods"], rows)))
Constructor Methods
OLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
fit_weighted(self, /, x, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
Ridge(penalty=None, cv=5) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
fit_weighted(self, /, x, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
FixedEffectsOLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, fe, y)
fit_weighted(self, /, x, fe, y, sample_weight)
summary(self, /, vcov='hc1', lags=None, clusters=None)
SyntheticControl(max_iterations=500) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, donors, treated)
predict(self, /, donors)
summary(self, /)
BalancingWeights(objective='quadratic', solver='auto', autoscale=False, min_weight=0.0, max_weight=1.0, l2_norm=0.0, max_iterations=200, tolerance=1e-08) fit(self, /, covariates, target_covariates, baseline_weights=None, target_weights=None)
get_weights(self, /)
summary(self, /)
EPLM(fd_eps=1e-06) fit(self, /, y, d, w)
summary(self, /, vcov=None, lags=None, clusters=None)
AverageDerivative(method='dr', fd_eps=1e-06) fit(self, /, y, d, w)
summary(self, /, vcov=None, lags=None, clusters=None)
PartiallyLinearDML(penalty=None, cv=5, n_folds=5, seed=42) fit(self, /, y, d, x)
summary(self, /, vcov=None, lags=None, clusters=None)
AIPW(penalty=None, cv=5, n_folds=5, propensity_clip=0.02, seed=42) fit(self, /, y, d, x)
summary(self, /, vcov=None, lags=None, clusters=None)
ElasticNet(penalty=1.0, l1_ratio=0.5, tolerance=0.0001, max_iterations=1000) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
Logit(alpha=1.0, max_iterations=100, gradient_tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
MultinomialLogit(alpha=1.0, max_iterations=100, gradient_tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
Poisson(alpha=0.0, max_iterations=100, tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /, vcov='vanilla')
TwoSLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x_endog, x_exog, z, y)
fit_weighted(self, /, x_endog, x_exog, z, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
GMM(moment_fn, jacobian_fn=None, max_iterations=100, tolerance=1e-06, ridge=1e-08, fd_eps=1e-06) fit(self, /, data, theta0, weighting='auto')
summary(self, /, vcov='sandwich', omega='iid', lags=None, clusters=None)
FTRL(alpha=0.1, beta=1.0, l1_ratio=1.0, l2_ratio=1.0) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
MEstimator(objective_fn, score_fn, max_iterations=100, tolerance=1e-06) bootstrap(self, /, n_bootstrap, seed=None)
compute_vcov(self, /)
fit(self, /, data, theta0)
summary(self, /)
PCA(n_components, whiten=False) fit(self, /, x)
fit_transform(self, /, x)
inverse_transform(self, /, scores)
summary(self, /)
transform(self, /, x)
KernelBasis(kernel='gaussian', bandwidth=0.5, coef0=1.0, degree=2.0) fit(self, /, x)
fit_transform(self, /, x)
summary(self, /)
transform(self, /, x)
Optimizers() minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)

4 Optimizer Surface

The optimizer helpers live under cm.Optimizers as static methods. They mirror the familiar SciPy pattern: pass a Python callback, a starting value, and any derivative information the method needs, and receive a plain dictionary with x, fun, nit, success, message, and method.

optimizer_rows = []
for name in sorted(n for n in dir(cm.Optimizers) if n.startswith("minimize_")):
    fn = getattr(cm.Optimizers, name)
    optimizer_rows.append(
        [f"<code>Optimizers.{escape(name)}{escape(str(inspect.signature(fn)))}</code>"]
    )

display(HTML(html_table(["Function"], optimizer_rows)))
Function
Optimizers.minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
Optimizers.minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)
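The residual_fn / jacobian_fn contract used by minimize_gauss_newton_ls corresponds to the textbook Gauss-Newton update, solving the linearized least-squares problem at each step. A self-contained sketch of that loop (an illustration of the algorithm, not the Rust implementation):

```python
import numpy as np

def gauss_newton(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-6):
    """Minimize 0.5 * ||r(theta)||^2 by repeated linearization."""
    theta = np.asarray(x0, dtype=float).copy()
    nit = 0
    for nit in range(1, max_iterations + 1):
        r = residual_fn(theta)   # (n,) residual vector
        J = jacobian_fn(theta)   # (n, p) Jacobian of r
        # Solve the linearized problem J @ step = -r by least squares.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        theta = theta + step
        if np.linalg.norm(step) < tolerance:
            break
    r = residual_fn(theta)
    return {"x": theta, "fun": 0.5 * float(r @ r), "nit": nit}

# Exponential-decay curve fit: r_i = y_i - a * exp(-b * t_i).
t = np.linspace(0.0, 2.0, 50)
y = 3.0 * np.exp(-1.5 * t)  # noiseless data with a=3, b=1.5
res = lambda th: y - th[0] * np.exp(-th[1] * t)
jac = lambda th: np.column_stack([
    -np.exp(-th[1] * t),                # dr/da
    th[0] * t * np.exp(-th[1] * t),     # dr/db
])
out = gauss_newton(res, np.array([2.5, 1.2]), jac)
```

On this zero-residual problem Gauss-Newton converges rapidly to (a, b) = (3, 1.5).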

5 Conventions

  • Input features are always 2D NumPy arrays.
  • Regression targets use float64 arrays.
  • PCA and KernelBasis fit on x alone and return transformed feature matrices that can be passed into the regression estimators.
  • The Optimizers methods return plain dictionaries rather than SciPy result objects, but the schema is intentionally similar.
  • FixedEffectsOLS.fit(x, fe, y) expects fe to be a 2D uint32 matrix of zero-based category codes, one column per factor.
  • SyntheticControl.fit(donors, treated) expects donor outcomes in an (n_periods, n_donors) matrix and a treated pre-period outcome vector of length n_periods.
  • BalancingWeights.fit(covariates, target_covariates, baseline_weights=None, target_weights=None) estimates calibration weights on the first matrix so its weighted covariate mean matches the target mean implied by the second matrix.
  • EPLM.fit(y, d, w) and AverageDerivative.fit(y, d, w) currently target a scalar continuous treatment and a 2D control matrix.
  • PartiallyLinearDML.fit(y, d, x) targets a scalar continuous treatment with cross-fit ridge nuisances.
  • AIPW.fit(y, d, x) targets a binary treatment coded as 0/1 and estimates the ATE with cross-fit ridge nuisance models.
  • Logit, MultinomialLogit, and FTRL expect integer class labels at fit time.
  • Most estimators store their training data internally so summary() and bootstrap() can be called after fit().
  • OLS, Ridge, ElasticNet, Logit, MultinomialLogit, Poisson, and TwoSLS now always fit an intercept.
  • FixedEffectsOLS, EPLM, AverageDerivative, PartiallyLinearDML, AIPW, GMM, and MEstimator deliberately have no predict() method. They are estimation-first interfaces.
  • When an estimator has an intercept, bootstrap draws place it in the first column.
  • summary() returns dictionaries, not rich result objects. Downstream code should index by key.
  • bootstrap(n_bootstrap, seed=None) now accepts an omitted seed across the full public surface.
  • OLS, Ridge, FixedEffectsOLS, and TwoSLS now share the same robust-inference surface: summary(vcov="vanilla" | "hc1" | "newey_west" | "cluster", lags=None, clusters=None).
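The shared vcov options follow the standard sandwich construction; HC1, for example, is the Eicker-Huber-White estimator with an n / (n - k) small-sample scaling. A NumPy sketch of vanilla versus HC1 covariance for an OLS fit (illustrative of the formulas, not the library's Rust code):

```python
import numpy as np

def ols_vcov(X, y, kind="hc1"):
    """OLS coefficient covariance: 'vanilla' homoskedastic vs 'hc1' robust."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    if kind == "vanilla":
        sigma2 = resid @ resid / (n - k)
        return sigma2 * bread
    # HC1: meat = sum_i e_i^2 x_i x_i', scaled by n / (n - k).
    meat = (X * resid[:, None] ** 2).T @ X
    return n / (n - k) * bread @ meat @ bread

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(scale=0.3, size=500)
se_hc1 = np.sqrt(np.diag(ols_vcov(X, y, "hc1")))
se_vanilla = np.sqrt(np.diag(ols_vcov(X, y, "vanilla")))
```

Under homoskedastic errors, as here, the two sets of standard errors agree closely; HC1 diverges from vanilla when error variance depends on the regressors.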

6 Estimator Reference

reference_rows = [
    [
        "<code>OLS</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Uses Linfa linear regression; <code>summary(vcov='hc1')</code> is default, and the linear-summary surface also accepts <code>'vanilla'</code>, <code>'newey_west'</code>, and <code>'cluster'</code>.",
    ],
    [
        "<code>Ridge</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>, <code>penalty</code>, <code>penalties</code>, <code>best_penalty_index</code>, <code>cv_mse</code>, <code>intercept_path</code>, <code>coef_path</code>",
        "<code>(B, p + 1)</code> with intercept",
        "Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS.",
    ],
    [
        "<code>FixedEffectsOLS</code>",
        "<code>fit(x, fe_uint32, y)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>coef_se</code>",
        "<code>(B, p)</code>",
        "Uses <code>within</code> to partial out fixed effects, then Linfa OLS with <code>fit_intercept=False</code> on the residualized design. The summary covariance options match the other linear estimators.",
    ],
    [
        "<code>SyntheticControl</code>",
        "<code>fit(donors, treated)</code>",
        "1D float synthetic path from donor outcomes",
        "<code>weights</code>, <code>pre_rmse</code>",
        "<code>(B, n_donors)</code>",
        "Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint.",
    ],
    [
        "<code>BalancingWeights</code>",
        "<code>fit(covariates, target_covariates, baseline_weights=None, target_weights=None)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>weights</code>, <code>beta</code>, <code>weighted_mean</code>, <code>target_mean</code>, <code>mean_diff</code>, <code>l2_diff</code>, <code>effective_sample_size</code>, <code>success</code>",
        "No bootstrap method",
        "Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean.",
    ],
    [
        "<code>EPLM</code>",
        "<code>fit(y, d, w)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>nuisance_coef</code>",
        "No bootstrap method",
        "Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for <code>E[D|W]</code>.",
    ],
    [
        "<code>AverageDerivative</code>",
        "<code>fit(y, d, w)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>method</code>, <code>coef</code>, <code>se</code>, <code>vcov</code>",
        "No bootstrap method",
        "Graham-Pinto style average-derivative estimator with <code>method='ob' | 'ipw' | 'dr'</code> under a scalar continuous-treatment working model.",
    ],
    [
        "<code>PartiallyLinearDML</code>",
        "<code>fit(y, d, x)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>outcome_penalties</code>, <code>treatment_penalties</code>",
        "No bootstrap method",
        "Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid.",
    ],
    [
        "<code>AIPW</code>",
        "<code>fit(y, d, x)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>ate</code>, <code>se</code>, <code>vcov</code>, <code>outcome0_penalties</code>, <code>outcome1_penalties</code>, <code>propensity_penalties</code>",
        "No bootstrap method",
        "Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1.",
    ],
    [
        "<code>ElasticNet</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Regularized point estimates with HC1-style covariance built from residuals.",
    ],
    [
        "<code>Logit</code>",
        "<code>fit(x, y_int32)</code>",
        "1D <code>int32</code> class labels",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Binary logistic regression. Standard errors come from the Fisher information.",
    ],
    [
        "<code>MultinomialLogit</code>",
        "<code>fit(x, y_int32)</code>",
        "1D <code>int32</code> class labels",
        "<code>coef</code> and <code>se</code> matrices of shape <code>(classes, features_with_intercept)</code>",
        "<code>(B, classes * features_with_intercept)</code>",
        "Intercept and slopes are packed together by class in the summary matrix.",
    ],
    [
        "<code>Poisson</code>",
        "<code>fit(x, y)</code>",
        "1D float mean counts",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Custom Newton-CG Poisson MLE with optional L2 penalty via <code>alpha</code>; <code>summary(vcov='vanilla')</code> gives Fisher SEs and <code>summary(vcov='sandwich')</code> gives QMLE sandwich SEs.",
    ],
    [
        "<code>TwoSLS</code>",
        "<code>fit(x_endog, x_exog, z, y)</code>",
        "1D float predictions from the stage-two design matrix",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, k + 1)</code> with intercept, else <code>(B, k)</code>",
        "Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same <code>vanilla</code>, <code>hc1</code>, <code>newey_west</code>, and <code>cluster</code> covariance interface as the other linear estimators.",
    ],
    [
        "<code>FTRL</code>",
        "<code>fit(x, y_int32)</code>",
        "1D float scores in <code>[0, 1]</code>",
        "<code>coef</code>, <code>coef_se</code>",
        "<code>(B, p)</code>",
        "Online-style classification model with a different summary schema and no explicit intercept output.",
    ],
    [
        "<code>GMM</code>",
        "<code>fit(data_dict, theta0, weighting='auto')</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>criterion</code>, <code>nit</code>, <code>weighting</code>, <code>vcov_type</code>, <code>omega_type</code>, <code>weight_matrix</code>, <code>nobs</code>, <code>n_moments</code>, <code>j_stat</code>, <code>j_df</code>",
        "No bootstrap method",
        "Requires <code>moment_fn(theta, data)</code> returning an <code>(n, m)</code> matrix of per-observation moments and optionally <code>jacobian_fn(theta, data)</code> returning the sample Jacobian of shape <code>(m, p)</code>.",
    ],
    [
        "<code>MEstimator</code>",
        "<code>fit(data_dict, theta0)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>",
        "<code>(B, p)</code>",
        "Requires <code>objective_fn(theta, data) -&gt; (obj, grad)</code> and <code>score_fn(theta, data)</code> returning an <code>(n, p)</code> matrix.",
    ],
]

display(
    HTML(
        "<div style='width: 100%; overflow-x: auto;'>"
        + html_table(
            ["Estimator", "Fit", "Predict", "Summary Keys", "Bootstrap", "Notes"],
            reference_rows,
            table_attrs="style='width: 100%; min-width: 100%; table-layout: auto;'",
        )
        + "</div>"
    )
)
Estimator Fit Predict Summary Keys Bootstrap Notes
OLS fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Uses Linfa linear regression; summary(vcov='hc1') is default, and the linear-summary surface also accepts 'vanilla', 'newey_west', and 'cluster'.
Ridge fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se, penalty, penalties, best_penalty_index, cv_mse, intercept_path, coef_path (B, p + 1) with intercept Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS.
FixedEffectsOLS fit(x, fe_uint32, y) No dedicated predict() method coef, coef_se (B, p) Uses within to partial out fixed effects, then Linfa OLS with fit_intercept=False on the residualized design. The summary covariance options match the other linear estimators.
SyntheticControl fit(donors, treated) 1D float synthetic path from donor outcomes weights, pre_rmse (B, n_donors) Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint.
BalancingWeights fit(covariates, target_covariates, baseline_weights=None, target_weights=None) No dedicated predict() method weights, beta, weighted_mean, target_mean, mean_diff, l2_diff, effective_sample_size, success No bootstrap method Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean.
EPLM fit(y, d, w) No dedicated predict() method coef, se, vcov, nuisance_coef No bootstrap method Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for E[D|W].
AverageDerivative fit(y, d, w) No dedicated predict() method method, coef, se, vcov No bootstrap method Graham-Pinto style average-derivative estimator with method='ob' | 'ipw' | 'dr' under a scalar continuous-treatment working model.
PartiallyLinearDML fit(y, d, x) No dedicated predict() method coef, se, vcov, outcome_penalties, treatment_penalties No bootstrap method Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid.
AIPW fit(y, d, x) No dedicated predict() method ate, se, vcov, outcome0_penalties, outcome1_penalties, propensity_penalties No bootstrap method Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1.
ElasticNet fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Regularized point estimates with HC1-style covariance built from residuals.
Logit fit(x, y_int32) 1D int32 class labels intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Binary logistic regression. Standard errors come from the Fisher information.
MultinomialLogit fit(x, y_int32) 1D int32 class labels coef and se matrices of shape (classes, features_with_intercept) (B, classes * features_with_intercept) Intercept and slopes are packed together by class in the summary matrix.
Poisson fit(x, y) 1D float mean counts intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Custom Newton-CG Poisson MLE with optional L2 penalty via alpha; summary(vcov='vanilla') gives Fisher SEs and summary(vcov='sandwich') gives QMLE sandwich SEs.
TwoSLS fit(x_endog, x_exog, z, y) 1D float predictions from the stage-two design matrix intercept, coef, intercept_se, coef_se (B, k + 1) with intercept, else (B, k) Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same vanilla, hc1, newey_west, and cluster covariance interface as the other linear estimators.
FTRL fit(x, y_int32) 1D float scores in [0, 1] coef, coef_se (B, p) Online-style classification model with a different summary schema and no explicit intercept output.
GMM fit(data_dict, theta0, weighting='auto') No dedicated predict() method coef, se, vcov, criterion, nit, weighting, vcov_type, omega_type, weight_matrix, nobs, n_moments, j_stat, j_df No bootstrap method Requires moment_fn(theta, data) returning an (n, m) matrix of per-observation moments and optionally jacobian_fn(theta, data) returning the sample Jacobian of shape (m, p).
MEstimator fit(data_dict, theta0) No dedicated predict() method coef, se (B, p) Requires objective_fn(theta, data) -> (obj, grad) and score_fn(theta, data) returning an (n, p) matrix.
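The "augmented QR least-squares system" in the Ridge notes refers to stacking sqrt(lambda) * I beneath the design and solving an ordinary least-squares problem, which is algebraically identical to the closed form (X'X + lambda I)^-1 X'y but avoids forming X'X explicitly. A sketch of the trick, without an intercept for brevity (illustrative only, not the library's solver):

```python
import numpy as np

def ridge_augmented(X, y, lam):
    """Solve min ||y - X b||^2 + lam ||b||^2 via an augmented
    least-squares system, avoiding explicit formation of X'X."""
    n, p = X.shape
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    beta, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -0.5, 0.0, 2.0]) + rng.normal(size=100)
lam = 3.0
b_aug = ridge_augmented(X, y, lam)
# Closed form for comparison: (X'X + lam I)^{-1} X'y
b_closed = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
```

The augmented route is numerically preferable because it works with a matrix whose condition number is the square root of that of the normal equations.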

7 Transformer Reference

transformer_rows = [
    [
        "<code>PCA</code>",
        "<code>fit(x)</code>",
        "<code>transform(x)</code>, <code>fit_transform(x)</code>, <code>inverse_transform(scores)</code>",
        "<code>n_components</code>, <code>n_features</code>, <code>whiten</code>, <code>components</code>, <code>mean</code>, <code>explained_variance</code>, <code>explained_variance_ratio</code>, <code>singular_values</code>",
        "Principal-components basis estimated from <code>x</code> alone. Useful for principal-components regression and other low-rank pipelines.",
    ],
    [
        "<code>KernelBasis</code>",
        "<code>fit(x)</code>",
        "<code>transform(x)</code>, <code>fit_transform(x)</code>",
        "<code>kernel</code>, <code>n_train</code>, <code>n_features</code>, <code>bandwidth</code>, <code>coef0</code>, <code>degree</code>, <code>diagonal</code>",
        "Stores the fitted training design and returns kernel features against that basis, so downstream regressions can work on nonlinear right-hand sides.",
    ],
]

display(
    HTML(
        html_table(
            ["Transformer", "Fit", "Transform", "Summary Keys", "Notes"],
            transformer_rows,
        )
    )
)
Transformer Fit Transform Summary Keys Notes
PCA fit(x) transform(x), fit_transform(x), inverse_transform(scores) n_components, n_features, whiten, components, mean, explained_variance, explained_variance_ratio, singular_values Principal-components basis estimated from x alone. Useful for principal-components regression and other low-rank pipelines.
KernelBasis fit(x) transform(x), fit_transform(x) kernel, n_train, n_features, bandwidth, coef0, degree, diagonal Stores the fitted training design and returns kernel features against that basis, so downstream regressions can work on nonlinear right-hand sides.
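The KernelBasis pattern of storing the fitted training design and evaluating kernels against it can be sketched directly. The class below is a toy stand-in restricted to the Gaussian kernel, and its bandwidth parameterization is an assumption for illustration; the library's conventions may differ:

```python
import numpy as np

class GaussianBasis:
    """Toy kernel-feature transformer: stores the training rows and returns
    k(x, X_train) columns for downstream linear regression."""

    def fit(self, x):
        self.x_train_ = np.asarray(x, dtype=float)
        return self

    def transform(self, x, bandwidth=0.5):
        x = np.asarray(x, dtype=float)
        # Squared distances between every new row and every training row.
        d2 = ((x[:, None, :] - self.x_train_[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
feats = GaussianBasis().fit(x).transform(x)
# Transforming the training set itself yields an (n_train, n_train)
# feature matrix with ones on the diagonal.
```

The resulting feature matrix can then be fed into any of the regression estimators, which is what makes a kernel basis plus a linear fit a nonlinear regression pipeline.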

8 Optimizer Reference

optimizer_reference_rows = [
    [
        "<code>Optimizers.minimize_lbfgs</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Limited-memory quasi-Newton method for smooth unconstrained objectives.",
    ],
    [
        "<code>Optimizers.minimize_bfgs</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Dense quasi-Newton method with an explicit inverse-Hessian approximation.",
    ],
    [
        "<code>Optimizers.minimize_nonlinear_cg</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Conjugate-gradient optimizer with Armijo backtracking and a gradient-norm success check.",
    ],
    [
        "<code>Optimizers.minimize_gauss_newton_ls</code>",
        "<code>residual(theta) -> (n,)</code>, <code>jacobian(theta) -> (n, p)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Least-squares Gauss-Newton routine for moment or residual systems.",
    ],
    [
        "<code>Optimizers.minimize_simulated_annealing</code>",
        "<code>fun(theta) -> float</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Stochastic global optimizer with optional box bounds, temperature, and RNG seed.",
    ],
]

display(
    HTML(
        html_table(
            ["Function", "Required Callbacks", "Returns", "Notes"],
            optimizer_reference_rows,
        )
    )
)
Function Required Callbacks Returns Notes
Optimizers.minimize_lbfgs fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Limited-memory quasi-Newton method for smooth unconstrained objectives.
Optimizers.minimize_bfgs fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Dense quasi-Newton method with an explicit inverse-Hessian approximation.
Optimizers.minimize_nonlinear_cg fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Conjugate-gradient optimizer with Armijo backtracking and a gradient-norm success check.
Optimizers.minimize_gauss_newton_ls residual(theta) -> (n,), jacobian(theta) -> (n, p) x, fun, nit, success, message, method Least-squares Gauss-Newton routine for moment or residual systems.
Optimizers.minimize_simulated_annealing fun(theta) -> float x, fun, nit, success, message, method Stochastic global optimizer with optional box bounds, temperature, and RNG seed.
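The knobs exposed by minimize_simulated_annealing (temperature, step size, optional box bounds, RNG seed) map onto the textbook Metropolis accept/reject loop. A minimal sketch of that loop with a simple 1/t cooling schedule (an assumption for illustration; the library's schedule may differ):

```python
import numpy as np

def simulated_annealing(fun, x0, lower=None, upper=None,
                        temp=15.0, step_size=0.1,
                        max_iterations=5000, seed=None):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    f = fun(x)
    best_x, best_f = x.copy(), f
    for it in range(1, max_iterations + 1):
        proposal = x + rng.normal(scale=step_size, size=x.shape)
        if lower is not None or upper is not None:
            proposal = np.clip(proposal, lower, upper)  # box bounds
        f_new = fun(proposal)
        t = temp / it  # simple cooling schedule
        # Accept downhill moves always, uphill moves with Metropolis prob.
        if f_new < f or rng.random() < np.exp(-(f_new - f) / t):
            x, f = proposal, f_new
            if f < best_f:
                best_x, best_f = x.copy(), f
    return {"x": best_x, "fun": best_f, "nit": max_iterations}

out = simulated_annealing(lambda v: float((v ** 2).sum()),
                          np.array([4.0, -3.0]), seed=0,
                          max_iterations=2000)
```

Because acceptance of uphill moves decays with temperature, late iterations behave like a greedy local search while early ones can escape poor basins.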

9 Verified Runtime Snapshot

The next chunk fits each estimator on synthetic data and records its summary keys and bootstrap output shape. This acts as a lightweight executable check that the reference above matches the current build.

rng = np.random.default_rng(123)
runtime_rows = []

# OLS
x = rng.normal(size=(40, 3))
y = 0.5 + x @ np.array([1.0, -2.0, 0.3]) + rng.normal(scale=0.2, size=40)
model = cm.OLS()
model.fit(x, y)
runtime_rows.append(
    [
        "<code>OLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3).shape),
    ]
)

# Ridge
model = cm.Ridge(penalty=np.array([0.0, 0.1, 1.0]), cv=3)
model.fit(x, y)
runtime_rows.append(
    [
        "<code>Ridge</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# FixedEffectsOLS
n = 120
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 20, size=n, dtype=np.uint32)
firm = rng.integers(0, 12, size=n, dtype=np.uint32)
fe = np.column_stack([worker, firm]).astype(np.uint32)
y = (
    x @ np.array([1.1, -0.4])
    + rng.normal(size=20)[worker]
    + rng.normal(size=12)[firm]
    + rng.normal(scale=0.2, size=n)
)
model = cm.FixedEffectsOLS()
model.fit(x, fe, y)
runtime_rows.append(
    [
        "<code>FixedEffectsOLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# ElasticNet
model = cm.ElasticNet(penalty=0.1, l1_ratio=0.5)
model.fit(x, y)
runtime_rows.append(
    [
        "<code>ElasticNet</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# SyntheticControl
donors_pre = rng.normal(size=(50, 3))
weights_true = np.array([0.5, 0.35, 0.15])
treated_pre = donors_pre @ weights_true
model = cm.SyntheticControl(max_iterations=300)
model.fit(donors_pre, treated_pre)
runtime_rows.append(
    [
        "<code>SyntheticControl</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# Logit
x = rng.normal(size=(60, 3))
eta = -0.2 + x @ np.array([0.8, -0.5, 1.2])
prob = 1.0 / (1.0 + np.exp(-eta))
y_bin = rng.binomial(1, prob, size=60).astype(np.int32)
model = cm.Logit(max_iterations=200)
model.fit(x, y_bin)
runtime_rows.append(
    [
        "<code>Logit</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# MultinomialLogit
x = rng.normal(size=(80, 2))
coef = np.array([[0.5, -0.7], [-0.3, 0.4], [0.2, 0.6]])
intercept = np.array([0.2, -0.1, 0.0])
logits = x @ coef.T + intercept
weights = np.exp(logits - logits.max(axis=1, keepdims=True))
prob = weights / weights.sum(axis=1, keepdims=True)
y_multi = np.array([rng.choice(3, p=row) for row in prob], dtype=np.int32)
model = cm.MultinomialLogit(max_iterations=200)
model.fit(x, y_multi)
runtime_rows.append(
    [
        "<code>MultinomialLogit</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# Poisson
x = rng.normal(size=(70, 2))
mu = np.exp(0.1 + x @ np.array([0.4, -0.2]))
y_pois = rng.poisson(mu).astype(float)
model = cm.Poisson(max_iterations=200)
model.fit(x, y_pois)
runtime_rows.append(
    [
        "<code>Poisson</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# TwoSLS
n = 80
z = rng.normal(size=(n, 2))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 2))
u = 0.5 * v[:, 0] - 0.35 * v[:, 1] + rng.normal(scale=0.1, size=n)
x_endog = z @ np.array([[0.9, 0.1], [-0.25, 0.8]]) + 0.2 * x_exog + v
y_iv = 0.4 + x_endog @ np.array([1.3, -0.7]) - 0.6 * x_exog[:, 0] + u
model = cm.TwoSLS()
model.fit(x_endog, x_exog, z, y_iv)
runtime_rows.append(
    [
        "<code>TwoSLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# FTRL
x = rng.normal(size=(80, 4))
eta = x @ np.array([0.7, -0.4, 0.2, 0.3])
prob = 1.0 / (1.0 + np.exp(-eta))
y_ftrl = rng.binomial(1, prob, size=80).astype(np.int32)
model = cm.FTRL()
model.fit(x, y_ftrl)
runtime_rows.append(
    [
        "<code>FTRL</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# GMM
def gmm_moments(theta, data):
    X = data["X"]
    y = data["y"]
    centered = X[:, 0] - theta[0]
    resid = y - theta[1] - theta[2] * centered
    return np.column_stack([centered, resid, centered * resid])


def gmm_jacobian(theta, data):
    X = data["X"]
    y = data["y"]
    centered = X[:, 0] - theta[0]
    resid = y - theta[1] - theta[2] * centered
    return np.array(
        [
            [-1.0, 0.0, 0.0],
            [theta[2], -1.0, -centered.mean()],
            [(-resid + theta[2] * centered).mean(), -centered.mean(), -(centered**2).mean()],
        ]
    )


x = rng.normal(size=(60, 1))
y = 1.2 + 0.8 * (x[:, 0] - x[:, 0].mean()) + rng.normal(scale=0.2, size=60)
model = cm.GMM(gmm_moments, jacobian_fn=gmm_jacobian, max_iterations=200)
model.fit({"X": x, "y": y}, np.zeros(3))
runtime_rows.append(
    [
        "<code>GMM</code>",
        ", ".join(model.summary().keys()),
        "n/a",
    ]
)

# MEstimator
def ols_objective(theta, data):
    X = data["X"]
    y = data["y"]
    indices = data.get("indices", np.arange(len(y)))
    Xb = X[indices]
    yb = y[indices]
    residual = yb - Xb @ theta
    return 0.5 * np.sum(residual**2), -(Xb.T @ residual)


def ols_scores(theta, data):
    X = data["X"]
    y = data["y"]
    residual = y - X @ theta
    return -X * residual[:, None]


x = rng.normal(size=(50, 3))
y = x @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.2, size=50)
model = cm.MEstimator(ols_objective, ols_scores, max_iterations=200)
model.fit({"X": x, "y": y, "n": len(y)}, np.zeros(x.shape[1]))
runtime_rows.append(
    [
        "<code>MEstimator</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

display(HTML(html_table(["Estimator", "Summary Keys", "Bootstrap Shape"], runtime_rows)))
Estimator Summary Keys Bootstrap Shape
OLS intercept, coef, intercept_se, coef_se, vcov_type (3, 4)
Ridge intercept, coef, intercept_se, coef_se, penalty, penalties, vcov_type, best_penalty_index, cv_mse, intercept_path, coef_path (3, 4)
FixedEffectsOLS coef, coef_se, vcov_type (3, 2)
ElasticNet intercept, coef, intercept_se, coef_se (3, 3)
SyntheticControl weights, pre_rmse (3, 3)
Logit intercept, coef, intercept_se, coef_se (3, 4)
MultinomialLogit coef, se (3, 9)
Poisson intercept, coef, intercept_se, coef_se, vcov_type (3, 3)
TwoSLS intercept, coef, intercept_se, coef_se, vcov_type (3, 4)
FTRL coef, coef_se (3, 4)
GMM coef, se, vcov, criterion, nit, weighting, vcov_type, omega_type, weight_matrix, nobs, n_moments, j_stat, j_df n/a
MEstimator coef, se (3, 3)

10 Usage Examples

10.1 OLS

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
beta = np.array([1.2, -0.7, 0.5])
y = 0.4 + x @ beta + rng.normal(scale=0.2, size=200)

model = cm.OLS()
model.fit(x, y)

summary = model.summary()
summary_vanilla = model.summary(vcov="vanilla")
summary_hac = model.summary(vcov="newey_west", lags=4)
clusters = np.repeat(np.arange(20, dtype=np.int64), len(y) // 20)
summary_cluster = model.summary(vcov="cluster", clusters=clusters)
bootstrap_draws = model.bootstrap(5, seed=42)

summary["coef"], summary_vanilla["coef_se"], summary_hac["coef_se"], summary_cluster["coef_se"]
(array([ 1.18200348, -0.71923052,  0.51060512]),
 array([0.01415748, 0.01431653, 0.01472891]),
 array([0.01376542, 0.01521404, 0.01344638]),
 array([0.01486123, 0.01548593, 0.01560255]))
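
For reference, the robust covariance behind those numbers can be reproduced by hand. This numpy sketch regenerates the same data and computes both the homoskedastic and HC1 sandwich standard errors from the textbook formulas (that the library's default robust flavor is exactly HC1 is an assumption here):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = 0.4 + x @ np.array([1.2, -0.7, 0.5]) + rng.normal(scale=0.2, size=200)

X = np.column_stack([np.ones(len(y)), x])   # design matrix with intercept
n, k = X.shape
beta = np.linalg.solve(X.T @ X, X.T @ y)    # OLS point estimate
resid = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
# Vanilla (homoskedastic): sigma^2 (X'X)^{-1}
vanilla = resid @ resid / (n - k) * XtX_inv
# HC1 sandwich: n/(n-k) * (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ (X * resid[:, None] ** 2)
hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv

vanilla_se = np.sqrt(np.diag(vanilla))
hc1_se = np.sqrt(np.diag(hc1))
```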

10.2 Ridge

rng = np.random.default_rng(2)
x = rng.normal(size=(250, 4))
beta = np.array([1.0, -0.8, 0.0, 0.3])
y = 0.6 + x @ beta + rng.normal(scale=0.4, size=250)

model = cm.Ridge(penalty=np.array([0.0, 0.05, 0.2, 1.0]), cv=5)
model.fit(x, y)

summary = model.summary()
summary_hac = model.summary(vcov="newey_west", lags=4)

summary["penalty"], summary["best_penalty_index"], summary_hac["coef_se"], summary["coef_path"][:, :2]
(0.0,
 0,
 array([0.01932908, 0.02579194, 0.02352418, 0.02311446]),
 array([[ 0.98320114,  0.9830328 ],
        [-0.81795424, -0.81778592],
        [ 0.04373202,  0.04372377],
        [ 0.2482089 ,  0.24816171]]))

10.3 Poisson

rng = np.random.default_rng(4)
x = rng.normal(size=(300, 2))
mu = np.exp(0.2 + x @ np.array([0.5, -0.25]))
y = rng.poisson(mu).astype(float)

model = cm.Poisson(max_iterations=200, tolerance=1e-8)
model.fit(x, y)

vanilla = model.summary(vcov="vanilla")
sandwich = model.summary(vcov="sandwich")

vanilla["coef"], sandwich["coef_se"]
(array([ 0.51333285, -0.24653569]), array([0.04435521, 0.04865882]))
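
A plain Newton iteration on the Poisson log-likelihood recovers essentially the same coefficients. This sketch regenerates the data and runs the textbook update (not necessarily the library's internal solver):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(300, 2))
y = rng.poisson(np.exp(0.2 + x @ np.array([0.5, -0.25]))).astype(float)

X = np.column_stack([np.ones(len(y)), x])
beta = np.zeros(X.shape[1])
for _ in range(50):  # Newton-Raphson on the Poisson log-likelihood
    mu = np.exp(X @ beta)
    grad = X.T @ (y - mu)            # score
    hess = X.T @ (X * mu[:, None])   # Fisher information (log link)
    step = np.linalg.solve(hess, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
```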

10.4 TwoSLS

rng = np.random.default_rng(6)
n = 500
z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
eps = rng.normal(size=n)
clusters = np.repeat(np.arange(25, dtype=np.int64), n // 25)

x_endog = z @ np.array([[0.8], [0.35], [-0.25]]) + 0.2 * x_exog + v
y = 0.5 + 1.1 * x_endog[:, 0] - 0.6 * x_exog[:, 0] + (0.6 * v[:, 0] + 0.25 * eps)

model = cm.TwoSLS()
model.fit(x_endog, x_exog, z, y)

hc1 = model.summary()
cluster = model.summary(vcov="cluster", clusters=clusters)

hc1["coef"], cluster["coef_se"]
(array([ 1.09071698, -0.56326909]), array([0.03436151, 0.03780577]))
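
The same slopes fall out of the textbook two-stage algebra: project the structural regressors onto the instruments, then run OLS on the fitted values. This numpy sketch regenerates the data above (how the library stacks the intercept and exogenous columns internally is an assumption):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
eps = rng.normal(size=n)
x_endog = z @ np.array([[0.8], [0.35], [-0.25]]) + 0.2 * x_exog + v
y = 0.5 + 1.1 * x_endog[:, 0] - 0.6 * x_exog[:, 0] + (0.6 * v[:, 0] + 0.25 * eps)

X = np.column_stack([np.ones(n), x_endog, x_exog])    # structural regressors
Z = np.column_stack([np.ones(n), z, x_exog])          # instruments incl. exogenous
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)         # first-stage fitted values
beta = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)  # second stage
```

The slope estimates should land near the true values (1.1 and -0.6).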

10.5 EPLM

rng = np.random.default_rng(7)
w = rng.normal(size=(300, 3))
d = 0.4 + w @ np.array([0.7, -0.4, 0.2]) + rng.normal(scale=0.8, size=300)
y = 1.3 * d + 0.8 + w @ np.array([0.3, -0.1, 0.2]) + rng.normal(scale=0.5, size=300)

model = cm.EPLM()
model.fit(y, d, w)

summary = model.summary()
summary["coef"], summary["se"]
(1.2751864601203131, 0.03939548509130182)

10.6 AverageDerivative

rng = np.random.default_rng(8)
w = rng.normal(size=(320, 2))
d = 0.2 + w @ np.array([0.6, -0.3]) + rng.normal(scale=0.6, size=320)
wc = w - w.mean(axis=0)
y = 1.0 + wc @ np.array([0.2, -0.3]) + d * (0.9 + wc @ np.array([0.15, -0.1])) + rng.normal(scale=0.4, size=320)

model = cm.AverageDerivative(method="dr")
model.fit(y, d, w)

summary = model.summary()
summary["method"], summary["coef"], summary["se"]
('dr', 0.9046195248445397, 0.03794115183465244)

10.7 PartiallyLinearDML

rng = np.random.default_rng(10)
x = rng.normal(size=(400, 4))
d = 0.3 + x @ np.array([0.5, -0.4, 0.2, 0.1]) + rng.normal(scale=0.8, size=400)
y = 1.4 * d + 1.0 + x @ np.array([0.4, -0.2, 0.1, 0.3]) + rng.normal(scale=0.6, size=400)

penalty_grid = np.logspace(-4, 2, 25)

model = cm.PartiallyLinearDML(penalty=penalty_grid, cv=5, n_folds=5, seed=1)
model.fit(y, d, x)

summary = model.summary()
summary["coef"], summary["se"], summary["outcome_penalties"][:2]
(1.4404211526609862, 0.042450917748467834, array([3.16227766, 1.77827941]))
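
Under the hood this is Robinson-style partialling out with cross-fitting. The sketch below reproduces the idea with plain OLS nuisance fits and two folds in place of the library's penalized learners and fold scheme (an illustration of the estimator, not its implementation):

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(size=(400, 4))
d = 0.3 + x @ np.array([0.5, -0.4, 0.2, 0.1]) + rng.normal(scale=0.8, size=400)
y = 1.4 * d + 1.0 + x @ np.array([0.4, -0.2, 0.1, 0.3]) + rng.normal(scale=0.6, size=400)

def ols_predict(X_tr, t_tr, X_te):
    """Fit target on covariates (with intercept) and predict on held-out rows."""
    A = np.column_stack([np.ones(len(t_tr)), X_tr])
    coef = np.linalg.lstsq(A, t_tr, rcond=None)[0]
    return np.column_stack([np.ones(len(X_te)), X_te]) @ coef

folds = np.resize(np.arange(2), len(y))   # two folds for brevity
y_res = np.empty_like(y)
d_res = np.empty_like(d)
for k in range(2):
    te = folds == k
    tr = ~te
    y_res[te] = y[te] - ols_predict(x[tr], y[tr], x[te])   # cross-fitted outcome residual
    d_res[te] = d[te] - ols_predict(x[tr], d[tr], x[te])   # cross-fitted treatment residual

theta = (d_res @ y_res) / (d_res @ d_res)  # residual-on-residual slope
```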

10.8 AIPW

rng = np.random.default_rng(11)
x = rng.normal(size=(450, 3))
pi = 1.0 / (1.0 + np.exp(-(0.1 + x @ np.array([0.7, -0.4, 0.2]))))
d = rng.binomial(1, pi, size=450).astype(float)
y = 0.5 + x @ np.array([0.3, -0.2, 0.4]) + 1.1 * d + rng.normal(scale=0.8, size=450)

penalty_grid = np.logspace(-4, 2, 25)

model = cm.AIPW(penalty=penalty_grid, cv=5, n_folds=5, seed=1)
model.fit(y, d, x)

summary = model.summary()
summary["ate"], summary["se"], summary["propensity_penalties"][:2]
(1.1263630572688275, 0.08781046463340195, array([31.6227766, 31.6227766]))
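
The AIPW point estimate combines an outcome model with an inverse-propensity-weighted residual correction. This sketch evaluates the estimator's formula with oracle nuisances taken straight from the simulation (purely to show the algebra — the library estimates both nuisances with cross-fitted penalized models):

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(size=(450, 3))
pi = 1.0 / (1.0 + np.exp(-(0.1 + x @ np.array([0.7, -0.4, 0.2]))))
d = rng.binomial(1, pi, size=450).astype(float)
y = 0.5 + x @ np.array([0.3, -0.2, 0.4]) + 1.1 * d + rng.normal(scale=0.8, size=450)

# Oracle nuisances from the data-generating process
m0 = 0.5 + x @ np.array([0.3, -0.2, 0.4])   # E[y | x, d=0]
m1 = m0 + 1.1                               # E[y | x, d=1]

# AIPW score: outcome contrast plus weighted residual corrections
psi = m1 - m0 + d * (y - m1) / pi - (1 - d) * (y - m0) / (1 - pi)
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(len(psi))
```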

10.9 GMM

def simple_iv_moments(theta, data):
    resid = data["y"] - data["x"] * theta[0]
    return data["z"] * resid[:, None]


def simple_iv_jacobian(theta, data):
    del theta
    return -(data["z"].T @ data["x"][:, None]) / data["x"].shape[0]


rng = np.random.default_rng(9)
n = 500
z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
eps = rng.normal(size=n)
x = z @ np.array([0.9, 0.4, -0.3]) + v
y = 1.0 * x + 0.5 * v + 0.3 * eps

model = cm.GMM(simple_iv_moments, jacobian_fn=simple_iv_jacobian, max_iterations=200)
model.fit({"x": x, "y": y, "z": z}, np.array([0.0]), weighting="identity")

vanilla = model.summary(vcov="vanilla")
sandwich = model.summary(vcov="sandwich", omega="iid")

vanilla["coef"], sandwich["se"]
(array([0.95457291]), array([0.02801009]))
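
For this linear, overidentified system each GMM step has a closed form. The sketch below redoes the identity-weighted step and then reweights by the inverse moment covariance; gmm_solve and g_bar are local helpers for illustration, not library calls:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
eps = rng.normal(size=n)
x = z @ np.array([0.9, 0.4, -0.3]) + v
y = 1.0 * x + 0.5 * v + 0.3 * eps

G = -(z.T @ x[:, None]) / n                       # Jacobian of mean moments, (3, 1)
g_bar = lambda t: (z * (y - x * t)[:, None]).mean(axis=0)

def gmm_solve(W):
    # Moments are linear in theta: gbar(theta) = gbar(0) + G theta,
    # so the minimizer of the GMM criterion is available in closed form.
    return np.linalg.solve(G.T @ W @ G, -G.T @ W @ g_bar(0.0))

theta1 = gmm_solve(np.eye(3))                     # step 1: identity weighting
u = y - x * theta1[0]
omega = (z * u[:, None]).T @ (z * u[:, None]) / n # moment covariance at step 1
theta2 = gmm_solve(np.linalg.inv(omega))          # step 2: efficient weighting
```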

10.10 FixedEffectsOLS

This estimator follows the Frisch-Waugh-Lovell path directly: partial out the fixed effects with the within transformation, then regress the residualized outcome on the residualized regressors with no intercept.

rng = np.random.default_rng(12)
n = 400
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 30, size=n, dtype=np.uint32)
firm = rng.integers(0, 15, size=n, dtype=np.uint32)
fe = np.column_stack([worker, firm]).astype(np.uint32)
beta = np.array([1.0, -0.6])
y = (
    x @ beta
    + rng.normal(scale=0.8, size=30)[worker]
    + rng.normal(scale=0.5, size=15)[firm]
    + rng.normal(scale=0.15, size=n)
)

model = cm.FixedEffectsOLS()
model.fit(x, fe, y)

cluster = model.summary(vcov="cluster", clusters=worker.astype(np.int64))
cluster
{'coef': array([ 1.01369715, -0.60410017]),
 'coef_se': array([0.00786918, 0.00722453]),
 'vcov_type': 'cluster'}
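
The within step described above can be sketched as alternating group demeaning until the data are orthogonal to both sets of dummies (a plain-numpy illustration of FWL, not necessarily the library's exact algorithm):

```python
import numpy as np

def demean_two_way(v, g1, g2, n_iter=100):
    """Alternately subtract group means for two fixed-effect factors."""
    out = np.asarray(v, dtype=float).copy()
    for _ in range(n_iter):
        for g in (g1, g2):
            counts = np.maximum(np.bincount(g), 1)
            out = out - (np.bincount(g, weights=out) / counts)[g]
    return out

rng = np.random.default_rng(12)
n = 400
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 30, size=n)
firm = rng.integers(0, 15, size=n)
y = (
    x @ np.array([1.0, -0.6])
    + rng.normal(scale=0.8, size=30)[worker]
    + rng.normal(scale=0.5, size=15)[firm]
    + rng.normal(scale=0.15, size=n)
)

y_t = demean_two_way(y, worker, firm)
X_t = np.column_stack([demean_two_way(x[:, j], worker, firm) for j in range(2)])
beta = np.linalg.lstsq(X_t, y_t, rcond=None)[0]   # no intercept after demeaning
```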

10.11 MEstimator

This is the lowest-level public interface in the library. The callback contract matters more than the optimizer choice:

  • objective_fn(theta, data) must return (objective_value, gradient_vector).
  • score_fn(theta, data) must return an (n_obs, n_params) matrix of per-observation scores.
  • For bootstrap support, the data dictionary must include n, and user code should respect an optional indices key.
import numpy as np


def ols_objective(theta, data):
    X = data["X"]
    y = data["y"]
    indices = data.get("indices", np.arange(len(y)))
    Xb = X[indices]
    yb = y[indices]
    residual = yb - Xb @ theta
    objective = 0.5 * np.sum(residual**2)
    gradient = -(Xb.T @ residual)
    return objective, gradient


def ols_scores(theta, data):
    X = data["X"]
    y = data["y"]
    residual = y - X @ theta
    return -X * residual[:, None]


rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.5, -0.4]) + rng.normal(scale=0.1, size=300)

model = cm.MEstimator(
    objective_fn=ols_objective,
    score_fn=ols_scores,
    max_iterations=200,
)
model.fit({"X": X, "y": y, "n": len(y)}, np.zeros(X.shape[1]))

model.summary()
{'coef': array([ 1.49784531, -0.39974307]),
 'se': array([0.60847816, 0.63437929])}
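
The reported se comes from an M-estimation sandwich built from the score matrix. The textbook version of that calculation is V = A⁻¹ B A⁻¹ / n, with A the mean score Jacobian and B the mean score outer product; the sketch below applies it to the same OLS problem (scaling conventions inside the library may differ, so treat this as the formula rather than a replication of the numbers above):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.5, -0.4]) + rng.normal(scale=0.1, size=300)

theta = np.linalg.lstsq(X, y, rcond=None)[0]      # same OLS solution
scores = -X * (y - X @ theta)[:, None]            # per-observation scores
n = len(y)

A = X.T @ X / n                                   # bread: mean score Jacobian
B = scores.T @ scores / n                         # meat: mean score outer product
V = np.linalg.inv(A) @ B @ np.linalg.inv(A) / n   # sandwich covariance of theta-hat
se = np.sqrt(np.diag(V))
```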

11 Current Sharp Edges

  • There is still no rich result object layer. Consumers should expect raw dictionaries and NumPy arrays.
  • Most estimators do not expose fit statistics or formula handling.
  • GMM is intentionally narrow: closed-form linear IV still lives in TwoSLS, and the generic class covers only just-identified Gauss-Newton solves and two-step weighting for overidentified systems.
  • MEstimator favors flexibility over guardrails. If your objective has fragile regions, add clipping or other stability controls inside the callback.
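
As one concrete guardrail in that spirit, residual clipping turns the quadratic objective into a Huber-type loss with a bounded gradient, while keeping the (objective, gradient) contract described above (an illustrative pattern, not a library utility):

```python
import numpy as np

def clipped_objective(theta, data, clip=50.0):
    """Least-squares objective with Huber-style residual clipping.

    Residuals beyond +/-clip contribute linearly, so the gradient stays
    bounded even if the optimizer wanders into a fragile region.
    """
    X, y = data["X"], data["y"]
    residual = y - X @ theta
    clipped = np.clip(residual, -clip, clip)
    # 0.5 r^2 inside the clip, clip*|r| - 0.5*clip^2 outside (Huber loss)
    objective = 0.5 * np.sum(clipped**2) + clip * np.sum(np.abs(residual) - np.abs(clipped))
    gradient = -(X.T @ clipped)
    return objective, gradient

# With small residuals this reduces to the plain OLS objective
obj, grad = clipped_objective(np.zeros(2), {"X": np.eye(2), "y": np.array([1.0, 2.0])})
# obj == 2.5, grad == array([-1., -2.])
```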