crabbymetrics API Reference

Library sitrep and verified Python surface

1 Overview

crabbymetrics is a compact Rust-backed econometrics library exposed to Python through pyo3 and packaged with maturin. The public API is intentionally small: estimator classes, transformer classes, an Optimizers helper namespace, NumPy as the only runtime dependency, and a scikit-adjacent fit / predict / summary pattern.

The page below combines two things:

  • A sitrep on the current state of the library.
  • A verified API reference rendered against the live Python module in this repository.
import inspect
from html import escape

import numpy as np
import crabbymetrics as cm
from IPython.display import HTML, display


def html_table(headers, rows, table_attrs=""):
    parts = [
        f"<table {table_attrs}".strip() + ">",
        "<thead>",
        "<tr>",
        *[f"<th>{escape(str(header))}</th>" for header in headers],
        "</tr>",
        "</thead>",
        "<tbody>",
    ]
    for row in rows:
        parts.append("<tr>")
        for cell in row:
            parts.append(f"<td>{cell}</td>")
        parts.append("</tr>")
    parts.extend(["</tbody>", "</table>"])
    return "".join(parts)

2 Sitrep

  • The implementation is concentrated in two Rust files: src/lib.rs registers the Python classes and src/estimators.rs holds nearly all estimator logic.
  • Shared linear algebra, covariance, bootstrap, and NumPy conversion helpers live in src/utils.rs.
  • Packaging is lean. The Python package declares only numpy at runtime, while native builds are handled by maturin.
  • Release automation already exists in .github/workflows/wheels.yml for Linux and macOS wheels across Python 3.10 to 3.14.
  • Coverage is broad for a small codebase: OLS, Ridge, fixed-effects OLS, synthetic control, balancing weights, EPLM, average-derivative estimators, partially linear DML, AIPW, ElasticNet, binary and multinomial logit, Poisson, TwoSLS, FTRL, a first-class callback-driven GMM, a lower-level MEstimator, plus PCA and KernelBasis as design-matrix preprocessors.
  • The module also exposes an Optimizers class with static methods for LBFGS, BFGS, nonlinear conjugate gradient, Gauss-Newton least squares, and simulated annealing.
  • The main maturity gaps are still API breadth and test depth. The public contract is now documented, but the estimator surface is intentionally narrow.
  • The public API is consistent, but not fully uniform. summary() always returns a plain Python dict, yet the exact keys vary by estimator.
  • TwoSLS now uses the closed-form linear IV / 2SLS estimator and accepts matrix-valued endogenous regressors and excluded instruments.
  • GMM now sits between the built-in estimators and MEstimator: users supply per-observation moments, optionally a sample Jacobian, and the class handles just-identified versus two-step fitting plus summary-time covariance choices.
  • MEstimator is still the lowest-level likelihood-style interface. Users are responsible for gradients, per-observation score matrices, and numerical safeguards inside their objective functions.
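The closed-form linear IV / 2SLS estimator mentioned above amounts to projecting the structural design onto the instrument space and running OLS on the projection. A minimal NumPy sketch of that computation (illustrative only, with hypothetical variable names; not the library's Rust internals):

```python
import numpy as np

def two_sls(x_endog, x_exog, z, y):
    """Closed-form 2SLS: regress y on [1, x_endog, x_exog] using
    instruments [1, z, x_exog]."""
    n = y.shape[0]
    ones = np.ones((n, 1))
    X = np.column_stack([ones, x_endog, x_exog])  # structural design
    Z = np.column_stack([ones, z, x_exog])        # instrument matrix
    # First stage: project X onto the column space of Z.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: OLS of y on the projected design.
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=(n, 1))
x_exog = rng.normal(size=(n, 1))
u = rng.normal(size=n)                       # unobserved confounder
x_endog = z[:, 0] + u + rng.normal(size=n)   # endogenous regressor
y = 1.0 + 2.0 * x_endog - 1.0 * x_exog[:, 0] + u + rng.normal(size=n)
beta = two_sls(x_endog.reshape(-1, 1), x_exog, z, y)
# beta stacks [intercept, endogenous coef, exogenous coef], near [1, 2, -1]
```

OLS on this data would be biased through the confounder u; the instrument recovers the structural coefficient.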

3 Public Surface

The table below is generated from the live module so the constructor and method signatures match the current build.

classes = [
    cm.OLS,
    cm.Ridge,
    cm.FixedEffectsOLS,
    cm.SyntheticControl,
    cm.BalancingWeights,
    cm.EPLM,
    cm.AverageDerivative,
    cm.PartiallyLinearDML,
    cm.AIPW,
    cm.ElasticNet,
    cm.Logit,
    cm.MultinomialLogit,
    cm.Poisson,
    cm.TwoSLS,
    cm.GMM,
    cm.FTRL,
    cm.MEstimator,
    cm.PCA,
    cm.KernelBasis,
    cm.Optimizers,
]

rows = []
for cls in classes:
    methods = []
    for method_name in sorted(name for name in dir(cls) if not name.startswith("_")):
        method = getattr(cls, method_name)
        if not callable(method):
            continue
        methods.append(
            f"<code>{escape(method_name)}{escape(str(inspect.signature(method)))}</code>"
        )
    rows.append(
        [
            f"<code>{escape(cls.__name__)}{escape(str(inspect.signature(cls)))}</code>",
            "<br>".join(methods),
        ]
    )

display(HTML(html_table(["Constructor", "Methods"], rows)))
Constructor Methods
OLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
fit_weighted(self, /, x, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
Ridge(penalty=None, cv=5) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
fit_weighted(self, /, x, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
FixedEffectsOLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, fe, y)
fit_weighted(self, /, x, fe, y, sample_weight)
summary(self, /, vcov='hc1', lags=None, clusters=None)
SyntheticControl(max_iterations=500) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, donors, treated)
predict(self, /, donors)
summary(self, /)
BalancingWeights(objective='quadratic', solver='auto', autoscale=False, min_weight=0.0, max_weight=1.0, l2_norm=0.0, max_iterations=200, tolerance=1e-08) fit(self, /, covariates, target_covariates, baseline_weights=None, target_weights=None)
get_weights(self, /)
summary(self, /)
EPLM(fd_eps=1e-06) fit(self, /, y, d, w)
summary(self, /, vcov=None, lags=None, clusters=None)
AverageDerivative(method='dr', fd_eps=1e-06) fit(self, /, y, d, w)
summary(self, /, vcov=None, lags=None, clusters=None)
PartiallyLinearDML(penalty=None, cv=5, n_folds=5, seed=42) fit(self, /, y, d, x)
summary(self, /, vcov=None, lags=None, clusters=None)
AIPW(penalty=None, cv=5, n_folds=5, propensity_clip=0.02, seed=42) fit(self, /, y, d, x)
summary(self, /, vcov=None, lags=None, clusters=None)
ElasticNet(penalty=1.0, l1_ratio=0.5, tolerance=0.0001, max_iterations=1000) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
Logit(alpha=1.0, max_iterations=100, gradient_tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
MultinomialLogit(alpha=1.0, max_iterations=100, gradient_tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
Poisson(alpha=0.0, max_iterations=100, tolerance=0.0001) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /, vcov='vanilla')
TwoSLS() bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x_endog, x_exog, z, y)
fit_weighted(self, /, x_endog, x_exog, z, y, sample_weight)
predict(self, /, x)
summary(self, /, vcov='hc1', lags=None, clusters=None)
GMM(moment_fn, jacobian_fn=None, max_iterations=100, tolerance=1e-06, ridge=1e-08, fd_eps=1e-06) fit(self, /, data, theta0, weighting='auto')
summary(self, /, vcov='sandwich', omega='iid', lags=None, clusters=None)
FTRL(alpha=0.1, beta=1.0, l1_ratio=1.0, l2_ratio=1.0) bootstrap(self, /, n_bootstrap, seed=None)
fit(self, /, x, y)
predict(self, /, x)
summary(self, /)
MEstimator(objective_fn, score_fn, max_iterations=100, tolerance=1e-06) bootstrap(self, /, n_bootstrap, seed=None)
compute_vcov(self, /)
fit(self, /, data, theta0)
summary(self, /)
PCA(n_components, whiten=False) fit(self, /, x)
fit_transform(self, /, x)
inverse_transform(self, /, scores)
summary(self, /)
transform(self, /, x)
KernelBasis(kernel='gaussian', bandwidth=0.5, coef0=1.0, degree=2.0) fit(self, /, x)
fit_transform(self, /, x)
summary(self, /)
transform(self, /, x)
Optimizers() minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)

4 Optimizer Surface

The optimizer helpers live under cm.Optimizers as static methods. They mirror the familiar SciPy pattern: pass a Python callback, a starting value, and any derivative information the method needs, and receive a plain dictionary with x, fun, nit, success, message, and method.

optimizer_rows = []
for name in sorted(n for n in dir(cm.Optimizers) if n.startswith("minimize_")):
    fn = getattr(cm.Optimizers, name)
    optimizer_rows.append(
        [f"<code>Optimizers.{escape(name)}{escape(str(inspect.signature(fn)))}</code>"]
    )

display(HTML(html_table(["Function"], optimizer_rows)))
Function
Optimizers.minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
Optimizers.minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)
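The residual_fn / jacobian_fn contract used by minimize_gauss_newton_ls corresponds to the textbook Gauss-Newton update, solving the linearized least-squares problem at each step. A self-contained sketch of that loop (an illustration of the algorithm, not the Rust implementation):

```python
import numpy as np

def gauss_newton(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-6):
    """Minimize 0.5 * ||r(theta)||^2 by repeated linearization."""
    theta = np.asarray(x0, dtype=float).copy()
    nit = 0
    for nit in range(1, max_iterations + 1):
        r = residual_fn(theta)   # (n,) residual vector
        J = jacobian_fn(theta)   # (n, p) Jacobian of r
        # Solve the linearized problem J @ step = -r by least squares.
        step, *_ = np.linalg.lstsq(J, -r, rcond=None)
        theta = theta + step
        if np.linalg.norm(step) < tolerance:
            break
    r = residual_fn(theta)
    return {"x": theta, "fun": 0.5 * float(r @ r), "nit": nit}

# Exponential-decay curve fit: r_i = y_i - a * exp(-b * t_i).
t = np.linspace(0.0, 2.0, 50)
y = 3.0 * np.exp(-1.5 * t)  # noiseless data with a=3, b=1.5
res = lambda th: y - th[0] * np.exp(-th[1] * t)
jac = lambda th: np.column_stack([
    -np.exp(-th[1] * t),                # dr/da
    th[0] * t * np.exp(-th[1] * t),     # dr/db
])
out = gauss_newton(res, np.array([2.5, 1.2]), jac)
```

On this zero-residual problem Gauss-Newton converges rapidly to (a, b) = (3, 1.5).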

5 Conventions

  • Input features are always 2D NumPy arrays.
  • Regression targets use float64 arrays.
  • PCA and KernelBasis fit on x alone and return transformed feature matrices that can be passed into the regression estimators.
  • The Optimizers methods return plain dictionaries rather than SciPy result objects, but the schema is intentionally similar.
  • FixedEffectsOLS.fit(x, fe, y) expects fe to be a 2D uint32 matrix of zero-based category codes, one column per factor.
  • SyntheticControl.fit(donors, treated) expects donor outcomes in an (n_periods, n_donors) matrix and a treated pre-period outcome vector of length n_periods.
  • BalancingWeights.fit(covariates, target_covariates, baseline_weights=None, target_weights=None) estimates calibration weights on the first matrix so its weighted covariate mean matches the target mean implied by the second matrix.
  • EPLM.fit(y, d, w) and AverageDerivative.fit(y, d, w) currently target a scalar continuous treatment and a 2D control matrix.
  • PartiallyLinearDML.fit(y, d, x) targets a scalar continuous treatment with cross-fit ridge nuisances.
  • AIPW.fit(y, d, x) targets a binary treatment coded as 0/1 and estimates the ATE with cross-fit ridge nuisance models.
  • Logit, MultinomialLogit, and FTRL expect integer class labels at fit time.
  • Most estimators store their training data internally so summary() and bootstrap() can be called after fit().
  • OLS, Ridge, ElasticNet, Logit, MultinomialLogit, Poisson, and TwoSLS now always fit an intercept.
  • FixedEffectsOLS, EPLM, AverageDerivative, PartiallyLinearDML, AIPW, GMM, and MEstimator deliberately have no predict() method. They are estimation-first interfaces.
  • When an estimator has an intercept, bootstrap draws place it in the first column.
  • summary() returns dictionaries, not rich result objects. Downstream code should index by key.
  • bootstrap(n_bootstrap, seed=None) now accepts an omitted seed across the full public surface.
  • OLS, Ridge, FixedEffectsOLS, and TwoSLS now share the same robust-inference surface: summary(vcov="vanilla" | "hc1" | "newey_west" | "cluster", lags=None, clusters=None).
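The shared vcov options follow the standard sandwich construction; HC1, for example, is the Eicker-Huber-White estimator with an n / (n - k) small-sample scaling. A NumPy sketch of vanilla versus HC1 covariance for an OLS fit (illustrative of the formulas, not the library's Rust code):

```python
import numpy as np

def ols_vcov(X, y, kind="hc1"):
    """OLS coefficient covariance: 'vanilla' homoskedastic vs 'hc1' robust."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    if kind == "vanilla":
        sigma2 = resid @ resid / (n - k)
        return sigma2 * bread
    # HC1: meat = sum_i e_i^2 x_i x_i', scaled by n / (n - k).
    meat = (X * resid[:, None] ** 2).T @ X
    return n / (n - k) * bread @ meat @ bread

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(500), rng.normal(size=(500, 2))])
y = X @ np.array([0.5, 1.0, -2.0]) + rng.normal(scale=0.3, size=500)
se_hc1 = np.sqrt(np.diag(ols_vcov(X, y, "hc1")))
se_vanilla = np.sqrt(np.diag(ols_vcov(X, y, "vanilla")))
```

Under homoskedastic errors, as here, the two sets of standard errors agree closely; HC1 diverges from vanilla when error variance depends on the regressors.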

6 Estimator Reference

reference_rows = [
    [
        "<code>OLS</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Uses Linfa linear regression; <code>summary(vcov='hc1')</code> is default, and the linear-summary surface also accepts <code>'vanilla'</code>, <code>'newey_west'</code>, and <code>'cluster'</code>.",
    ],
    [
        "<code>Ridge</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>, <code>penalty</code>, <code>penalties</code>, <code>best_penalty_index</code>, <code>cv_mse</code>, <code>intercept_path</code>, <code>coef_path</code>",
        "<code>(B, p + 1)</code> with intercept",
        "Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS.",
    ],
    [
        "<code>FixedEffectsOLS</code>",
        "<code>fit(x, fe_uint32, y)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>coef_se</code>",
        "<code>(B, p)</code>",
        "Uses <code>within</code> to partial out fixed effects, then Linfa OLS with <code>fit_intercept=False</code> on the residualized design. The summary covariance options match the other linear estimators.",
    ],
    [
        "<code>SyntheticControl</code>",
        "<code>fit(donors, treated)</code>",
        "1D float synthetic path from donor outcomes",
        "<code>weights</code>, <code>pre_rmse</code>",
        "<code>(B, n_donors)</code>",
        "Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint.",
    ],
    [
        "<code>BalancingWeights</code>",
        "<code>fit(covariates, target_covariates, baseline_weights=None, target_weights=None)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>weights</code>, <code>beta</code>, <code>weighted_mean</code>, <code>target_mean</code>, <code>mean_diff</code>, <code>l2_diff</code>, <code>effective_sample_size</code>, <code>success</code>",
        "No bootstrap method",
        "Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean.",
    ],
    [
        "<code>EPLM</code>",
        "<code>fit(y, d, w)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>nuisance_coef</code>",
        "No bootstrap method",
        "Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for <code>E[D|W]</code>.",
    ],
    [
        "<code>AverageDerivative</code>",
        "<code>fit(y, d, w)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>method</code>, <code>coef</code>, <code>se</code>, <code>vcov</code>",
        "No bootstrap method",
        "Graham-Pinto style average-derivative estimator with <code>method='ob' | 'ipw' | 'dr'</code> under a scalar continuous-treatment working model.",
    ],
    [
        "<code>PartiallyLinearDML</code>",
        "<code>fit(y, d, x)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>outcome_penalties</code>, <code>treatment_penalties</code>",
        "No bootstrap method",
        "Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid.",
    ],
    [
        "<code>AIPW</code>",
        "<code>fit(y, d, x)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>ate</code>, <code>se</code>, <code>vcov</code>, <code>outcome0_penalties</code>, <code>outcome1_penalties</code>, <code>propensity_penalties</code>",
        "No bootstrap method",
        "Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1.",
    ],
    [
        "<code>ElasticNet</code>",
        "<code>fit(x, y)</code>",
        "1D float predictions",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Regularized point estimates with HC1-style covariance built from residuals.",
    ],
    [
        "<code>Logit</code>",
        "<code>fit(x, y_int32)</code>",
        "1D <code>int32</code> class labels",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Binary logistic regression. Standard errors come from the Fisher information.",
    ],
    [
        "<code>MultinomialLogit</code>",
        "<code>fit(x, y_int32)</code>",
        "1D <code>int32</code> class labels",
        "<code>coef</code> and <code>se</code> matrices of shape <code>(classes, features_with_intercept)</code>",
        "<code>(B, classes * features_with_intercept)</code>",
        "Intercept and slopes are packed together by class in the summary matrix.",
    ],
    [
        "<code>Poisson</code>",
        "<code>fit(x, y)</code>",
        "1D float mean counts",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>",
        "Custom Newton-CG Poisson MLE with optional L2 penalty via <code>alpha</code>; <code>summary(vcov='vanilla')</code> gives Fisher SEs and <code>summary(vcov='sandwich')</code> gives QMLE sandwich SEs.",
    ],
    [
        "<code>TwoSLS</code>",
        "<code>fit(x_endog, x_exog, z, y)</code>",
        "1D float predictions from the stage-two design matrix",
        "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>",
        "<code>(B, k + 1)</code> with intercept, else <code>(B, k)</code>",
        "Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same <code>vanilla</code>, <code>hc1</code>, <code>newey_west</code>, and <code>cluster</code> covariance interface as the other linear estimators.",
    ],
    [
        "<code>FTRL</code>",
        "<code>fit(x, y_int32)</code>",
        "1D float scores in <code>[0, 1]</code>",
        "<code>coef</code>, <code>coef_se</code>",
        "<code>(B, p)</code>",
        "Online-style classification model with a different summary schema and no explicit intercept output.",
    ],
    [
        "<code>GMM</code>",
        "<code>fit(data_dict, theta0, weighting='auto')</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>criterion</code>, <code>nit</code>, <code>weighting</code>, <code>vcov_type</code>, <code>omega_type</code>, <code>weight_matrix</code>, <code>nobs</code>, <code>n_moments</code>, <code>j_stat</code>, <code>j_df</code>",
        "No bootstrap method",
        "Requires <code>moment_fn(theta, data)</code> returning an <code>(n, m)</code> matrix of per-observation moments and optionally <code>jacobian_fn(theta, data)</code> returning the sample Jacobian of shape <code>(m, p)</code>.",
    ],
    [
        "<code>MEstimator</code>",
        "<code>fit(data_dict, theta0)</code>",
        "No dedicated <code>predict()</code> method",
        "<code>coef</code>, <code>se</code>",
        "<code>(B, p)</code>",
        "Requires <code>objective_fn(theta, data) -&gt; (obj, grad)</code> and <code>score_fn(theta, data)</code> returning an <code>(n, p)</code> matrix.",
    ],
]

display(
    HTML(
        "<div style='width: 100%; overflow-x: auto;'>"
        + html_table(
            ["Estimator", "Fit", "Predict", "Summary Keys", "Bootstrap", "Notes"],
            reference_rows,
            table_attrs="style='width: 100%; min-width: 100%; table-layout: auto;'",
        )
        + "</div>"
    )
)
Estimator Fit Predict Summary Keys Bootstrap Notes
OLS fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Uses Linfa linear regression; summary(vcov='hc1') is default, and the linear-summary surface also accepts 'vanilla', 'newey_west', and 'cluster'.
Ridge fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se, penalty, penalties, best_penalty_index, cv_mse, intercept_path, coef_path (B, p + 1) with intercept Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS.
FixedEffectsOLS fit(x, fe_uint32, y) No dedicated predict() method coef, coef_se (B, p) Uses within to partial out fixed effects, then Linfa OLS with fit_intercept=False on the residualized design. The summary covariance options match the other linear estimators.
SyntheticControl fit(donors, treated) 1D float synthetic path from donor outcomes weights, pre_rmse (B, n_donors) Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint.
BalancingWeights fit(covariates, target_covariates, baseline_weights=None, target_weights=None) No dedicated predict() method weights, beta, weighted_mean, target_mean, mean_diff, l2_diff, effective_sample_size, success No bootstrap method Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean.
EPLM fit(y, d, w) No dedicated predict() method coef, se, vcov, nuisance_coef No bootstrap method Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for E[D|W].
AverageDerivative fit(y, d, w) No dedicated predict() method method, coef, se, vcov No bootstrap method Graham-Pinto style average-derivative estimator with method='ob' | 'ipw' | 'dr' under a scalar continuous-treatment working model.
PartiallyLinearDML fit(y, d, x) No dedicated predict() method coef, se, vcov, outcome_penalties, treatment_penalties No bootstrap method Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid.
AIPW fit(y, d, x) No dedicated predict() method ate, se, vcov, outcome0_penalties, outcome1_penalties, propensity_penalties No bootstrap method Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1.
ElasticNet fit(x, y) 1D float predictions intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Regularized point estimates with HC1-style covariance built from residuals.
Logit fit(x, y_int32) 1D int32 class labels intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Binary logistic regression. Standard errors come from the Fisher information.
MultinomialLogit fit(x, y_int32) 1D int32 class labels coef and se matrices of shape (classes, features_with_intercept) (B, classes * features_with_intercept) Intercept and slopes are packed together by class in the summary matrix.
Poisson fit(x, y) 1D float mean counts intercept, coef, intercept_se, coef_se (B, p + 1) with intercept, else (B, p) Custom Newton-CG Poisson MLE with optional L2 penalty via alpha; summary(vcov='vanilla') gives Fisher SEs and summary(vcov='sandwich') gives QMLE sandwich SEs.
TwoSLS fit(x_endog, x_exog, z, y) 1D float predictions from the stage-two design matrix intercept, coef, intercept_se, coef_se (B, k + 1) with intercept, else (B, k) Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same vanilla, hc1, newey_west, and cluster covariance interface as the other linear estimators.
FTRL fit(x, y_int32) 1D float scores in [0, 1] coef, coef_se (B, p) Online-style classification model with a different summary schema and no explicit intercept output.
GMM fit(data_dict, theta0, weighting='auto') No dedicated predict() method coef, se, vcov, criterion, nit, weighting, vcov_type, omega_type, weight_matrix, nobs, n_moments, j_stat, j_df No bootstrap method Requires moment_fn(theta, data) returning an (n, m) matrix of per-observation moments and optionally jacobian_fn(theta, data) returning the sample Jacobian of shape (m, p).
MEstimator fit(data_dict, theta0) No dedicated predict() method coef, se (B, p) Requires objective_fn(theta, data) -> (obj, grad) and score_fn(theta, data) returning an (n, p) matrix.
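The "augmented QR least-squares system" in the Ridge notes refers to stacking sqrt(lambda) * I beneath the design and solving an ordinary least-squares problem, which is algebraically identical to the closed form (X'X + lambda I)^-1 X'y but avoids forming X'X explicitly. A sketch of the trick, without an intercept for brevity (illustrative only, not the library's solver):

```python
import numpy as np

def ridge_augmented(X, y, lam):
    """Solve min ||y - X b||^2 + lam ||b||^2 via an augmented
    least-squares system, avoiding explicit formation of X'X."""
    n, p = X.shape
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(p)])
    y_aug = np.concatenate([y, np.zeros(p)])
    beta, *_ = np.linalg.lstsq(X_aug, y_aug, rcond=None)
    return beta

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -0.5, 0.0, 2.0]) + rng.normal(size=100)
lam = 3.0
b_aug = ridge_augmented(X, y, lam)
# Closed form for comparison: (X'X + lam I)^{-1} X'y
b_closed = np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
```

The augmented route is numerically preferable because it works with a matrix whose condition number is the square root of that of the normal equations.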

7 Transformer Reference

transformer_rows = [
    [
        "<code>PCA</code>",
        "<code>fit(x)</code>",
        "<code>transform(x)</code>, <code>fit_transform(x)</code>, <code>inverse_transform(scores)</code>",
        "<code>n_components</code>, <code>n_features</code>, <code>whiten</code>, <code>components</code>, <code>mean</code>, <code>explained_variance</code>, <code>explained_variance_ratio</code>, <code>singular_values</code>",
        "Principal-components basis estimated from <code>x</code> alone. Useful for principal-components regression and other low-rank pipelines.",
    ],
    [
        "<code>KernelBasis</code>",
        "<code>fit(x)</code>",
        "<code>transform(x)</code>, <code>fit_transform(x)</code>",
        "<code>kernel</code>, <code>n_train</code>, <code>n_features</code>, <code>bandwidth</code>, <code>coef0</code>, <code>degree</code>, <code>diagonal</code>",
        "Stores the fitted training design and returns kernel features against that basis, so downstream regressions can work on nonlinear right-hand sides.",
    ],
]

display(
    HTML(
        html_table(
            ["Transformer", "Fit", "Transform", "Summary Keys", "Notes"],
            transformer_rows,
        )
    )
)
Transformer Fit Transform Summary Keys Notes
PCA fit(x) transform(x), fit_transform(x), inverse_transform(scores) n_components, n_features, whiten, components, mean, explained_variance, explained_variance_ratio, singular_values Principal-components basis estimated from x alone. Useful for principal-components regression and other low-rank pipelines.
KernelBasis fit(x) transform(x), fit_transform(x) kernel, n_train, n_features, bandwidth, coef0, degree, diagonal Stores the fitted training design and returns kernel features against that basis, so downstream regressions can work on nonlinear right-hand sides.
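The KernelBasis pattern of storing the fitted training design and evaluating kernels against it can be sketched directly. The class below is a toy stand-in restricted to the Gaussian kernel, and its bandwidth parameterization is an assumption for illustration; the library's conventions may differ:

```python
import numpy as np

class GaussianBasis:
    """Toy kernel-feature transformer: stores the training rows and returns
    k(x, X_train) columns for downstream linear regression."""

    def fit(self, x):
        self.x_train_ = np.asarray(x, dtype=float)
        return self

    def transform(self, x, bandwidth=0.5):
        x = np.asarray(x, dtype=float)
        # Squared distances between every new row and every training row.
        d2 = ((x[:, None, :] - self.x_train_[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * bandwidth ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 3))
feats = GaussianBasis().fit(x).transform(x)
# Transforming the training set itself yields an (n_train, n_train)
# feature matrix with ones on the diagonal.
```

The resulting feature matrix can then be fed into any of the regression estimators, which is what makes a kernel basis plus a linear fit a nonlinear regression pipeline.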

8 Optimizer Reference

optimizer_reference_rows = [
    [
        "<code>Optimizers.minimize_lbfgs</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Limited-memory quasi-Newton method for smooth unconstrained objectives.",
    ],
    [
        "<code>Optimizers.minimize_bfgs</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Dense quasi-Newton method with an explicit inverse-Hessian approximation.",
    ],
    [
        "<code>Optimizers.minimize_nonlinear_cg</code>",
        "<code>fun(theta) -> float</code>, <code>grad(theta) -> (p,)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Conjugate-gradient optimizer with Armijo backtracking and a gradient-norm success check.",
    ],
    [
        "<code>Optimizers.minimize_gauss_newton_ls</code>",
        "<code>residual(theta) -> (n,)</code>, <code>jacobian(theta) -> (n, p)</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Least-squares Gauss-Newton routine for moment or residual systems.",
    ],
    [
        "<code>Optimizers.minimize_simulated_annealing</code>",
        "<code>fun(theta) -> float</code>",
        "<code>x</code>, <code>fun</code>, <code>nit</code>, <code>success</code>, <code>message</code>, <code>method</code>",
        "Stochastic global optimizer with optional box bounds, temperature, and RNG seed.",
    ],
]

display(
    HTML(
        html_table(
            ["Function", "Required Callbacks", "Returns", "Notes"],
            optimizer_reference_rows,
        )
    )
)
Function Required Callbacks Returns Notes
Optimizers.minimize_lbfgs fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Limited-memory quasi-Newton method for smooth unconstrained objectives.
Optimizers.minimize_bfgs fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Dense quasi-Newton method with an explicit inverse-Hessian approximation.
Optimizers.minimize_nonlinear_cg fun(theta) -> float, grad(theta) -> (p,) x, fun, nit, success, message, method Conjugate-gradient optimizer with Armijo backtracking and a gradient-norm success check.
Optimizers.minimize_gauss_newton_ls residual(theta) -> (n,), jacobian(theta) -> (n, p) x, fun, nit, success, message, method Least-squares Gauss-Newton routine for moment or residual systems.
Optimizers.minimize_simulated_annealing fun(theta) -> float x, fun, nit, success, message, method Stochastic global optimizer with optional box bounds, temperature, and RNG seed.
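The knobs exposed by minimize_simulated_annealing (temperature, step size, optional box bounds, RNG seed) map onto the textbook Metropolis accept/reject loop. A minimal sketch of that loop with a simple 1/t cooling schedule (an assumption for illustration; the library's schedule may differ):

```python
import numpy as np

def simulated_annealing(fun, x0, lower=None, upper=None,
                        temp=15.0, step_size=0.1,
                        max_iterations=5000, seed=None):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    f = fun(x)
    best_x, best_f = x.copy(), f
    for it in range(1, max_iterations + 1):
        proposal = x + rng.normal(scale=step_size, size=x.shape)
        if lower is not None or upper is not None:
            proposal = np.clip(proposal, lower, upper)  # box bounds
        f_new = fun(proposal)
        t = temp / it  # simple cooling schedule
        # Accept downhill moves always, uphill moves with Metropolis prob.
        if f_new < f or rng.random() < np.exp(-(f_new - f) / t):
            x, f = proposal, f_new
            if f < best_f:
                best_x, best_f = x.copy(), f
    return {"x": best_x, "fun": best_f, "nit": max_iterations}

out = simulated_annealing(lambda v: float((v ** 2).sum()),
                          np.array([4.0, -3.0]), seed=0,
                          max_iterations=2000)
```

Because acceptance of uphill moves decays with temperature, late iterations behave like a greedy local search while early ones can escape poor basins.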

9 Verified Runtime Snapshot

The next chunk fits each estimator on synthetic data and records its summary keys and bootstrap output shape. This acts as a lightweight executable check that the reference above matches the current build.

rng = np.random.default_rng(123)
runtime_rows = []

# OLS
x = rng.normal(size=(40, 3))
y = 0.5 + x @ np.array([1.0, -2.0, 0.3]) + rng.normal(scale=0.2, size=40)
model = cm.OLS()
model.fit(x, y)
runtime_rows.append(
    [
        "<code>OLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3).shape),
    ]
)

# Ridge
model = cm.Ridge(penalty=np.array([0.0, 0.1, 1.0]), cv=3)
model.fit(x, y)
runtime_rows.append(
    [
        "<code>Ridge</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# FixedEffectsOLS
n = 120
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 20, size=n, dtype=np.uint32)
firm = rng.integers(0, 12, size=n, dtype=np.uint32)
fe = np.column_stack([worker, firm]).astype(np.uint32)
y = (
    x @ np.array([1.1, -0.4])
    + rng.normal(size=20)[worker]
    + rng.normal(size=12)[firm]
    + rng.normal(scale=0.2, size=n)
)
model = cm.FixedEffectsOLS()
model.fit(x, fe, y)
runtime_rows.append(
    [
        "<code>FixedEffectsOLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# ElasticNet
model = cm.ElasticNet(penalty=0.1, l1_ratio=0.5)
model.fit(x, y)
runtime_rows.append(
    [
        "<code>ElasticNet</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# SyntheticControl
donors_pre = rng.normal(size=(50, 3))
weights_true = np.array([0.5, 0.35, 0.15])
treated_pre = donors_pre @ weights_true
model = cm.SyntheticControl(max_iterations=300)
model.fit(donors_pre, treated_pre)
runtime_rows.append(
    [
        "<code>SyntheticControl</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# Logit
x = rng.normal(size=(60, 3))
eta = -0.2 + x @ np.array([0.8, -0.5, 1.2])
prob = 1.0 / (1.0 + np.exp(-eta))
y_bin = rng.binomial(1, prob, size=60).astype(np.int32)
model = cm.Logit(max_iterations=200)
model.fit(x, y_bin)
runtime_rows.append(
    [
        "<code>Logit</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# MultinomialLogit
x = rng.normal(size=(80, 2))
coef = np.array([[0.5, -0.7], [-0.3, 0.4], [0.2, 0.6]])
intercept = np.array([0.2, -0.1, 0.0])
logits = x @ coef.T + intercept
weights = np.exp(logits - logits.max(axis=1, keepdims=True))
prob = weights / weights.sum(axis=1, keepdims=True)
y_multi = np.array([rng.choice(3, p=row) for row in prob], dtype=np.int32)
model = cm.MultinomialLogit(max_iterations=200)
model.fit(x, y_multi)
runtime_rows.append(
    [
        "<code>MultinomialLogit</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# Poisson
x = rng.normal(size=(70, 2))
mu = np.exp(0.1 + x @ np.array([0.4, -0.2]))
y_pois = rng.poisson(mu).astype(float)
model = cm.Poisson(max_iterations=200)
model.fit(x, y_pois)
runtime_rows.append(
    [
        "<code>Poisson</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# TwoSLS
n = 80
z = rng.normal(size=(n, 2))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 2))
u = 0.5 * v[:, 0] - 0.35 * v[:, 1] + rng.normal(scale=0.1, size=n)
x_endog = z @ np.array([[0.9, 0.1], [-0.25, 0.8]]) + 0.2 * x_exog + v
y_iv = 0.4 + x_endog @ np.array([1.3, -0.7]) - 0.6 * x_exog[:, 0] + u
model = cm.TwoSLS()
model.fit(x_endog, x_exog, z, y_iv)
runtime_rows.append(
    [
        "<code>TwoSLS</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# FTRL
x = rng.normal(size=(80, 4))
eta = x @ np.array([0.7, -0.4, 0.2, 0.3])
prob = 1.0 / (1.0 + np.exp(-eta))
y_ftrl = rng.binomial(1, prob, size=80).astype(np.int32)
model = cm.FTRL()
model.fit(x, y_ftrl)
runtime_rows.append(
    [
        "<code>FTRL</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

# GMM
def gmm_moments(theta, data):
    X = data["X"]
    y = data["y"]
    centered = X[:, 0] - theta[0]
    resid = y - theta[1] - theta[2] * centered
    return np.column_stack([centered, resid, centered * resid])


def gmm_jacobian(theta, data):
    X = data["X"]
    y = data["y"]
    centered = X[:, 0] - theta[0]
    resid = y - theta[1] - theta[2] * centered
    return np.array(
        [
            [-1.0, 0.0, 0.0],
            [theta[2], -1.0, -centered.mean()],
            [(-resid + theta[2] * centered).mean(), -centered.mean(), -(centered**2).mean()],
        ]
    )


x = rng.normal(size=(60, 1))
y = 1.2 + 0.8 * (x[:, 0] - x[:, 0].mean()) + rng.normal(scale=0.2, size=60)
model = cm.GMM(gmm_moments, jacobian_fn=gmm_jacobian, max_iterations=200)
model.fit({"X": x, "y": y}, np.zeros(3))
runtime_rows.append(
    [
        "<code>GMM</code>",
        ", ".join(model.summary().keys()),
        "n/a",
    ]
)

# MEstimator
def ols_objective(theta, data):
    X = data["X"]
    y = data["y"]
    indices = data.get("indices", np.arange(len(y)))
    Xb = X[indices]
    yb = y[indices]
    residual = yb - Xb @ theta
    return 0.5 * np.sum(residual**2), -(Xb.T @ residual)


def ols_scores(theta, data):
    X = data["X"]
    y = data["y"]
    residual = y - X @ theta
    return -X * residual[:, None]


x = rng.normal(size=(50, 3))
y = x @ np.array([1.0, -0.5, 0.2]) + rng.normal(scale=0.2, size=50)
model = cm.MEstimator(ols_objective, ols_scores, max_iterations=200)
model.fit({"X": x, "y": y, "n": len(y)}, np.zeros(x.shape[1]))
runtime_rows.append(
    [
        "<code>MEstimator</code>",
        ", ".join(model.summary().keys()),
        str(model.bootstrap(3, seed=1).shape),
    ]
)

display(HTML(html_table(["Estimator", "Summary Keys", "Bootstrap Shape"], runtime_rows)))
Estimator Summary Keys Bootstrap Shape
OLS intercept, coef, intercept_se, coef_se, vcov_type (3, 4)
Ridge intercept, coef, intercept_se, coef_se, penalty, penalties, vcov_type, best_penalty_index, cv_mse, intercept_path, coef_path (3, 4)
FixedEffectsOLS coef, coef_se, vcov_type (3, 2)
ElasticNet intercept, coef, intercept_se, coef_se (3, 3)
SyntheticControl weights, pre_rmse (3, 3)
Logit intercept, coef, intercept_se, coef_se (3, 4)
MultinomialLogit coef, se (3, 9)
Poisson intercept, coef, intercept_se, coef_se, vcov_type (3, 3)
TwoSLS intercept, coef, intercept_se, coef_se, vcov_type (3, 4)
FTRL coef, coef_se (3, 4)
GMM coef, se, vcov, criterion, nit, weighting, vcov_type, omega_type, weight_matrix, nobs, n_moments, j_stat, j_df n/a
MEstimator coef, se (3, 3)

10 Usage Examples

10.1 OLS

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
beta = np.array([1.2, -0.7, 0.5])
y = 0.4 + x @ beta + rng.normal(scale=0.2, size=200)

model = cm.OLS()
model.fit(x, y)

summary = model.summary()
summary_vanilla = model.summary(vcov="vanilla")
summary_hac = model.summary(vcov="newey_west", lags=4)
clusters = np.repeat(np.arange(20, dtype=np.int64), len(y) // 20)
summary_cluster = model.summary(vcov="cluster", clusters=clusters)
bootstrap_draws = model.bootstrap(5, seed=42)

summary["coef"], summary_vanilla["coef_se"], summary_hac["coef_se"], summary_cluster["coef_se"]
(array([ 1.18200348, -0.71923052,  0.51060512]),
 array([0.01415748, 0.01431653, 0.01472891]),
 array([0.01376542, 0.01521404, 0.01344638]),
 array([0.01486123, 0.01548593, 0.01560255]))
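
For reference, the robust covariance behind those numbers can be reproduced by hand. This numpy sketch regenerates the same data and computes both the homoskedastic and HC1 sandwich standard errors from the textbook formulas (that the library's default robust flavor is exactly HC1 is an assumption here):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = 0.4 + x @ np.array([1.2, -0.7, 0.5]) + rng.normal(scale=0.2, size=200)

X = np.column_stack([np.ones(len(y)), x])   # design matrix with intercept
n, k = X.shape
beta = np.linalg.solve(X.T @ X, X.T @ y)    # OLS point estimate
resid = y - X @ beta

XtX_inv = np.linalg.inv(X.T @ X)
# Vanilla (homoskedastic): sigma^2 (X'X)^{-1}
vanilla = resid @ resid / (n - k) * XtX_inv
# HC1 sandwich: n/(n-k) * (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}
meat = X.T @ (X * resid[:, None] ** 2)
hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv

vanilla_se = np.sqrt(np.diag(vanilla))
hc1_se = np.sqrt(np.diag(hc1))
```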

10.2 Ridge

rng = np.random.default_rng(2)
x = rng.normal(size=(250, 4))
beta = np.array([1.0, -0.8, 0.0, 0.3])
y = 0.6 + x @ beta + rng.normal(scale=0.4, size=250)

model = cm.Ridge(penalty=np.array([0.0, 0.05, 0.2, 1.0]), cv=5)
model.fit(x, y)

summary = model.summary()
summary_hac = model.summary(vcov="newey_west", lags=4)

summary["penalty"], summary["best_penalty_index"], summary_hac["coef_se"], summary["coef_path"][:, :2]
(0.0,
 0,
 array([0.01932908, 0.02579194, 0.02352418, 0.02311446]),
 array([[ 0.98320114,  0.9830328 ],
        [-0.81795424, -0.81778592],
        [ 0.04373202,  0.04372377],
        [ 0.2482089 ,  0.24816171]]))

10.3 Poisson

rng = np.random.default_rng(4)
x = rng.normal(size=(300, 2))
mu = np.exp(0.2 + x @ np.array([0.5, -0.25]))
y = rng.poisson(mu).astype(float)

model = cm.Poisson(max_iterations=200, tolerance=1e-8)
model.fit(x, y)

vanilla = model.summary(vcov="vanilla")
sandwich = model.summary(vcov="sandwich")

vanilla["coef"], sandwich["coef_se"]
(array([ 0.51333285, -0.24653569]), array([0.04435521, 0.04865882]))
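
A plain Newton iteration on the Poisson log-likelihood recovers essentially the same coefficients. This sketch regenerates the data and runs the textbook update (not necessarily the library's internal solver):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=(300, 2))
y = rng.poisson(np.exp(0.2 + x @ np.array([0.5, -0.25]))).astype(float)

X = np.column_stack([np.ones(len(y)), x])
beta = np.zeros(X.shape[1])
for _ in range(50):  # Newton-Raphson on the Poisson log-likelihood
    mu = np.exp(X @ beta)
    grad = X.T @ (y - mu)            # score
    hess = X.T @ (X * mu[:, None])   # Fisher information (log link)
    step = np.linalg.solve(hess, grad)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break
```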

10.4 TwoSLS

rng = np.random.default_rng(6)
n = 500
z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
eps = rng.normal(size=n)
clusters = np.repeat(np.arange(25, dtype=np.int64), n // 25)

x_endog = z @ np.array([[0.8], [0.35], [-0.25]]) + 0.2 * x_exog + v
y = 0.5 + 1.1 * x_endog[:, 0] - 0.6 * x_exog[:, 0] + (0.6 * v[:, 0] + 0.25 * eps)

model = cm.TwoSLS()
model.fit(x_endog, x_exog, z, y)

hc1 = model.summary()
cluster = model.summary(vcov="cluster", clusters=clusters)

hc1["coef"], cluster["coef_se"]
(array([ 1.09071698, -0.56326909]), array([0.03436151, 0.03780577]))
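
The same slopes fall out of the textbook two-stage algebra: project the structural regressors onto the instruments, then run OLS on the fitted values. This numpy sketch regenerates the data above (how the library stacks the intercept and exogenous columns internally is an assumption):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 500
z = rng.normal(size=(n, 3))
x_exog = rng.normal(size=(n, 1))
v = rng.normal(size=(n, 1))
eps = rng.normal(size=n)
x_endog = z @ np.array([[0.8], [0.35], [-0.25]]) + 0.2 * x_exog + v
y = 0.5 + 1.1 * x_endog[:, 0] - 0.6 * x_exog[:, 0] + (0.6 * v[:, 0] + 0.25 * eps)

X = np.column_stack([np.ones(n), x_endog, x_exog])    # structural regressors
Z = np.column_stack([np.ones(n), z, x_exog])          # instruments incl. exogenous
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)         # first-stage fitted values
beta = np.linalg.solve(X_hat.T @ X_hat, X_hat.T @ y)  # second stage
```

The slope estimates should land near the true values (1.1 and -0.6).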

10.5 EPLM

rng = np.random.default_rng(7)
w = rng.normal(size=(300, 3))
d = 0.4 + w @ np.array([0.7, -0.4, 0.2]) + rng.normal(scale=0.8, size=300)
y = 1.3 * d + 0.8 + w @ np.array([0.3, -0.1, 0.2]) + rng.normal(scale=0.5, size=300)

model = cm.EPLM()
model.fit(y, d, w)

summary = model.summary()
summary["coef"], summary["se"]
(1.2751864601203131, 0.03939548509130182)

10.6 AverageDerivative

rng = np.random.default_rng(8)
w = rng.normal(size=(320, 2))
d = 0.2 + w @ np.array([0.6, -0.3]) + rng.normal(scale=0.6, size=320)
wc = w - w.mean(axis=0)
y = 1.0 + wc @ np.array([0.2, -0.3]) + d * (0.9 + wc @ np.array([0.15, -0.1])) + rng.normal(scale=0.4, size=320)

model = cm.AverageDerivative(method="dr")
model.fit(y, d, w)

summary = model.summary()
summary["method"], summary["coef"], summary["se"]
('dr', 0.9046195248445397, 0.03794115183465244)

10.7 PartiallyLinearDML

rng = np.random.default_rng(10)
x = rng.normal(size=(400, 4))
d = 0.3 + x @ np.array([0.5, -0.4, 0.2, 0.1]) + rng.normal(scale=0.8, size=400)
y = 1.4 * d + 1.0 + x @ np.array([0.4, -0.2, 0.1, 0.3]) + rng.normal(scale=0.6, size=400)

penalty_grid = np.logspace(-4, 2, 25)

model = cm.PartiallyLinearDML(penalty=penalty_grid, cv=5, n_folds=5, seed=1)
model.fit(y, d, x)

summary = model.summary()
summary["coef"], summary["se"], summary["outcome_penalties"][:2]
(1.4404211526609862, 0.042450917748467834, array([3.16227766, 1.77827941]))
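
Under the hood this is Robinson-style partialling out with cross-fitting. The sketch below reproduces the idea with plain OLS nuisance fits and two folds in place of the library's penalized learners and fold scheme (an illustration of the estimator, not its implementation):

```python
import numpy as np

rng = np.random.default_rng(10)
x = rng.normal(size=(400, 4))
d = 0.3 + x @ np.array([0.5, -0.4, 0.2, 0.1]) + rng.normal(scale=0.8, size=400)
y = 1.4 * d + 1.0 + x @ np.array([0.4, -0.2, 0.1, 0.3]) + rng.normal(scale=0.6, size=400)

def ols_predict(X_tr, t_tr, X_te):
    """Fit target on covariates (with intercept) and predict on held-out rows."""
    A = np.column_stack([np.ones(len(t_tr)), X_tr])
    coef = np.linalg.lstsq(A, t_tr, rcond=None)[0]
    return np.column_stack([np.ones(len(X_te)), X_te]) @ coef

folds = np.resize(np.arange(2), len(y))   # two folds for brevity
y_res = np.empty_like(y)
d_res = np.empty_like(d)
for k in range(2):
    te = folds == k
    tr = ~te
    y_res[te] = y[te] - ols_predict(x[tr], y[tr], x[te])   # cross-fitted outcome residual
    d_res[te] = d[te] - ols_predict(x[tr], d[tr], x[te])   # cross-fitted treatment residual

theta = (d_res @ y_res) / (d_res @ d_res)  # residual-on-residual slope
```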

10.8 AIPW

rng = np.random.default_rng(11)
x = rng.normal(size=(450, 3))
pi = 1.0 / (1.0 + np.exp(-(0.1 + x @ np.array([0.7, -0.4, 0.2]))))
d = rng.binomial(1, pi, size=450).astype(float)
y = 0.5 + x @ np.array([0.3, -0.2, 0.4]) + 1.1 * d + rng.normal(scale=0.8, size=450)

penalty_grid = np.logspace(-4, 2, 25)

model = cm.AIPW(penalty=penalty_grid, cv=5, n_folds=5, seed=1)
model.fit(y, d, x)

summary = model.summary()
summary["ate"], summary["se"], summary["propensity_penalties"][:2]
(1.1263630572688275, 0.08781046463340195, array([31.6227766, 31.6227766]))
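
The AIPW point estimate combines an outcome model with an inverse-propensity-weighted residual correction. This sketch evaluates the estimator's formula with oracle nuisances taken straight from the simulation (purely to show the algebra — the library estimates both nuisances with cross-fitted penalized models):

```python
import numpy as np

rng = np.random.default_rng(11)
x = rng.normal(size=(450, 3))
pi = 1.0 / (1.0 + np.exp(-(0.1 + x @ np.array([0.7, -0.4, 0.2]))))
d = rng.binomial(1, pi, size=450).astype(float)
y = 0.5 + x @ np.array([0.3, -0.2, 0.4]) + 1.1 * d + rng.normal(scale=0.8, size=450)

# Oracle nuisances from the data-generating process
m0 = 0.5 + x @ np.array([0.3, -0.2, 0.4])   # E[y | x, d=0]
m1 = m0 + 1.1                               # E[y | x, d=1]

# AIPW score: outcome contrast plus weighted residual corrections
psi = m1 - m0 + d * (y - m1) / pi - (1 - d) * (y - m0) / (1 - pi)
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(len(psi))
```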

10.9 GMM

def simple_iv_moments(theta, data):
    resid = data["y"] - data["x"] * theta[0]
    return data["z"] * resid[:, None]


def simple_iv_jacobian(theta, data):
    del theta
    return -(data["z"].T @ data["x"][:, None]) / data["x"].shape[0]


rng = np.random.default_rng(9)
n = 500
z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
eps = rng.normal(size=n)
x = z @ np.array([0.9, 0.4, -0.3]) + v
y = 1.0 * x + 0.5 * v + 0.3 * eps

model = cm.GMM(simple_iv_moments, jacobian_fn=simple_iv_jacobian, max_iterations=200)
model.fit({"x": x, "y": y, "z": z}, np.array([0.0]), weighting="identity")

vanilla = model.summary(vcov="vanilla")
sandwich = model.summary(vcov="sandwich", omega="iid")

vanilla["coef"], sandwich["se"]
(array([0.95457291]), array([0.02801009]))
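
For this linear, overidentified system each GMM step has a closed form. The sketch below redoes the identity-weighted step and then reweights by the inverse moment covariance; gmm_solve and g_bar are local helpers for illustration, not library calls:

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500
z = rng.normal(size=(n, 3))
v = rng.normal(size=n)
eps = rng.normal(size=n)
x = z @ np.array([0.9, 0.4, -0.3]) + v
y = 1.0 * x + 0.5 * v + 0.3 * eps

G = -(z.T @ x[:, None]) / n                       # Jacobian of mean moments, (3, 1)
g_bar = lambda t: (z * (y - x * t)[:, None]).mean(axis=0)

def gmm_solve(W):
    # Moments are linear in theta: gbar(theta) = gbar(0) + G theta,
    # so the minimizer of the GMM criterion is available in closed form.
    return np.linalg.solve(G.T @ W @ G, -G.T @ W @ g_bar(0.0))

theta1 = gmm_solve(np.eye(3))                     # step 1: identity weighting
u = y - x * theta1[0]
omega = (z * u[:, None]).T @ (z * u[:, None]) / n # moment covariance at step 1
theta2 = gmm_solve(np.linalg.inv(omega))          # step 2: efficient weighting
```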

10.10 FixedEffectsOLS

This estimator follows the Frisch-Waugh-Lovell path directly: partial out the fixed effects with the within transformation, then regress the residualized outcome on the residualized regressors with no intercept.

rng = np.random.default_rng(12)
n = 400
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 30, size=n, dtype=np.uint32)
firm = rng.integers(0, 15, size=n, dtype=np.uint32)
fe = np.column_stack([worker, firm]).astype(np.uint32)
beta = np.array([1.0, -0.6])
y = (
    x @ beta
    + rng.normal(scale=0.8, size=30)[worker]
    + rng.normal(scale=0.5, size=15)[firm]
    + rng.normal(scale=0.15, size=n)
)

model = cm.FixedEffectsOLS()
model.fit(x, fe, y)

cluster = model.summary(vcov="cluster", clusters=worker.astype(np.int64))
cluster
{'coef': array([ 1.01369715, -0.60410017]),
 'coef_se': array([0.00786918, 0.00722453]),
 'vcov_type': 'cluster'}
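
The within step described above can be sketched as alternating group demeaning until the data are orthogonal to both sets of dummies (a plain-numpy illustration of FWL, not necessarily the library's exact algorithm):

```python
import numpy as np

def demean_two_way(v, g1, g2, n_iter=100):
    """Alternately subtract group means for two fixed-effect factors."""
    out = np.asarray(v, dtype=float).copy()
    for _ in range(n_iter):
        for g in (g1, g2):
            counts = np.maximum(np.bincount(g), 1)
            out = out - (np.bincount(g, weights=out) / counts)[g]
    return out

rng = np.random.default_rng(12)
n = 400
x = rng.normal(size=(n, 2))
worker = rng.integers(0, 30, size=n)
firm = rng.integers(0, 15, size=n)
y = (
    x @ np.array([1.0, -0.6])
    + rng.normal(scale=0.8, size=30)[worker]
    + rng.normal(scale=0.5, size=15)[firm]
    + rng.normal(scale=0.15, size=n)
)

y_t = demean_two_way(y, worker, firm)
X_t = np.column_stack([demean_two_way(x[:, j], worker, firm) for j in range(2)])
beta = np.linalg.lstsq(X_t, y_t, rcond=None)[0]   # no intercept after demeaning
```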

10.11 MEstimator

This is the lowest-level public interface in the library. The callback contract matters more than the optimizer choice:

  • objective_fn(theta, data) must return (objective_value, gradient_vector).
  • score_fn(theta, data) must return an (n_obs, n_params) matrix of per-observation scores.
  • For bootstrap support, the data dictionary must include n, and user code should respect an optional indices key.
import numpy as np


def ols_objective(theta, data):
    X = data["X"]
    y = data["y"]
    indices = data.get("indices", np.arange(len(y)))
    Xb = X[indices]
    yb = y[indices]
    residual = yb - Xb @ theta
    objective = 0.5 * np.sum(residual**2)
    gradient = -(Xb.T @ residual)
    return objective, gradient


def ols_scores(theta, data):
    X = data["X"]
    y = data["y"]
    residual = y - X @ theta
    return -X * residual[:, None]


rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.5, -0.4]) + rng.normal(scale=0.1, size=300)

model = cm.MEstimator(
    objective_fn=ols_objective,
    score_fn=ols_scores,
    max_iterations=200,
)
model.fit({"X": X, "y": y, "n": len(y)}, np.zeros(X.shape[1]))

model.summary()
{'coef': array([ 1.49784531, -0.39974307]),
 'se': array([0.60847816, 0.63437929])}
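
The reported se comes from an M-estimation sandwich built from the score matrix. The textbook version of that calculation is V = A⁻¹ B A⁻¹ / n, with A the mean score Jacobian and B the mean score outer product; the sketch below applies it to the same OLS problem (scaling conventions inside the library may differ, so treat this as the formula rather than a replication of the numbers above):

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
y = X @ np.array([1.5, -0.4]) + rng.normal(scale=0.1, size=300)

theta = np.linalg.lstsq(X, y, rcond=None)[0]      # same OLS solution
scores = -X * (y - X @ theta)[:, None]            # per-observation scores
n = len(y)

A = X.T @ X / n                                   # bread: mean score Jacobian
B = scores.T @ scores / n                         # meat: mean score outer product
V = np.linalg.inv(A) @ B @ np.linalg.inv(A) / n   # sandwich covariance of theta-hat
se = np.sqrt(np.diag(V))
```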

11 Current Sharp Edges

  • There is still no rich result object layer. Consumers should expect raw dictionaries and NumPy arrays.
  • Most estimators do not expose fit statistics or formula handling.
  • GMM is intentionally narrow: closed-form linear IV still lives in TwoSLS, and the generic class covers only just-identified Gauss-Newton solves and two-step weighting for overidentified systems.
  • MEstimator favors flexibility over guardrails. If your objective has fragile regions, add clipping or other stability controls inside the callback.
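
As one concrete guardrail in that spirit, residual clipping turns the quadratic objective into a Huber-type loss with a bounded gradient, while keeping the (objective, gradient) contract described above (an illustrative pattern, not a library utility):

```python
import numpy as np

def clipped_objective(theta, data, clip=50.0):
    """Least-squares objective with Huber-style residual clipping.

    Residuals beyond +/-clip contribute linearly, so the gradient stays
    bounded even if the optimizer wanders into a fragile region.
    """
    X, y = data["X"], data["y"]
    residual = y - X @ theta
    clipped = np.clip(residual, -clip, clip)
    # 0.5 r^2 inside the clip, clip*|r| - 0.5*clip^2 outside (Huber loss)
    objective = 0.5 * np.sum(clipped**2) + clip * np.sum(np.abs(residual) - np.abs(clipped))
    gradient = -(X.T @ clipped)
    return objective, gradient

# With small residuals this reduces to the plain OLS objective
obj, grad = clipped_objective(np.zeros(2), {"X": np.eye(2), "y": np.array([1.0, 2.0])})
# obj == 2.5, grad == array([-1., -2.])
```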