crabbymetrics is a compact Rust-backed econometrics library exposed to Python through pyo3 and packaged with maturin. The public API is intentionally small: estimator classes, transformer classes, an Optimizers helper namespace, numpy as the only runtime dependency (arrays in, arrays out), and a scikit-adjacent fit / predict / summary pattern.
The page below combines two things:
A sitrep on the current state of the library.
A verified API reference rendered against the live Python module in this repository.
import inspect
from html import escape

import numpy as np
from IPython.display import HTML, display

import crabbymetrics as cm


def html_table(headers, rows, table_attrs=""):
    parts = [
        f"<table {table_attrs}".strip() + ">",  # avoid a stray space when table_attrs is empty
        "<thead>",
        "<tr>",
        *[f"<th>{escape(str(header))}</th>" for header in headers],
        "</tr>",
        "</thead>",
        "<tbody>",
    ]
    for row in rows:
        parts.append("<tr>")
        for cell in row:
            parts.append(f"<td>{cell}</td>")
        parts.append("</tr>")
    parts.extend(["</tbody>", "</table>"])
    return "".join(parts)
2 Sitrep
The implementation is concentrated in two Rust files: src/lib.rs registers the Python classes and src/estimators.rs holds nearly all estimator logic.
Shared linear algebra, covariance, bootstrap, and NumPy conversion helpers live in src/utils.rs.
Packaging is lean. The Python package declares only numpy at runtime, while native builds are handled by maturin.
Release automation already exists in .github/workflows/wheels.yml for Linux and macOS wheels across Python 3.10 to 3.14.
Coverage is broad for a small codebase: OLS, Ridge, fixed-effects OLS, synthetic control, balancing weights, EPLM, average-derivative estimators, partially linear DML, AIPW, ElasticNet, binary and multinomial logit, Poisson, TwoSLS, FTRL, a first-class callback-driven GMM, a lower-level MEstimator, plus PCA and KernelBasis as design-matrix preprocessors.
The module also exposes an Optimizers class with static methods for LBFGS, BFGS, nonlinear conjugate gradient, Gauss-Newton least squares, and simulated annealing.
The main maturity gaps are still API breadth and test depth. The public contract is now documented, but the estimator surface is intentionally narrow.
The public API is consistent, but not fully uniform. summary() always returns a plain Python dict, yet the exact keys vary by estimator.
TwoSLS now uses the closed-form linear IV / 2SLS estimator and accepts matrix-valued endogenous regressors and excluded instruments.
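That closed form is the textbook 2SLS estimator; the following numpy sketch illustrates it on simulated data (illustrative only, not the library's internal Rust code):

```python
import numpy as np

def two_sls(x_endog, x_exog, z, y):
    # Textbook 2SLS: regress y on the first-stage fitted values of the
    # structural design X = [1, x_endog, x_exog], using instruments
    # Z = [1, x_exog, z] (exogenous regressors instrument themselves).
    n = y.shape[0]
    ones = np.ones((n, 1))
    X = np.hstack([ones, x_endog, x_exog])
    Z = np.hstack([ones, x_exog, z])
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]  # first stage
    beta, *_ = np.linalg.lstsq(X_hat, y, rcond=None)  # second stage
    return beta  # [intercept, endogenous coefs, exogenous coefs]

# Simulated just-identified example: one endogenous regressor, one instrument.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 1))                    # excluded instrument
u = rng.normal(size=n)                         # unobserved confounder
d = (0.8 * z[:, 0] + u + rng.normal(size=n)).reshape(-1, 1)  # endogenous treatment
w = rng.normal(size=(n, 1))                    # exogenous control
y = 1.0 + 2.0 * d[:, 0] + 0.5 * w[:, 0] + u
beta = two_sls(d, w, z, y)
```

Because d is correlated with u, plain OLS would be biased upward here; the instrumented fit recovers the structural coefficients.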
GMM now sits between the built-in estimators and MEstimator: users supply per-observation moments, optionally a sample Jacobian, and the class handles just-identified versus two-step fitting plus summary-time covariance choices.
MEstimator is still the lowest-level likelihood-style interface. Users are responsible for gradients, per-observation score matrices, and numerical safeguards inside their objective functions.
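To make the callback shapes concrete, here is a hypothetical moment_fn / jacobian_fn pair for the linear moment condition E[x(y − x'θ)] = 0, checked with plain numpy rather than through the GMM class itself:

```python
import numpy as np

# Hypothetical callbacks matching the documented shapes: moment_fn returns an
# (n, m) matrix of per-observation moments, jacobian_fn the (m, p) sample
# Jacobian of the averaged moments. The moments here are the OLS conditions
# E[x * (y - x'theta)] = 0, so m == p (just identified).
def moment_fn(theta, data):
    resid = data["y"] - data["x"] @ theta
    return data["x"] * resid[:, None]          # (n, m)

def jacobian_fn(theta, data):
    x = data["x"]
    return -(x.T @ x) / x.shape[0]             # (m, p)

rng = np.random.default_rng(1)
n, p = 200, 2
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 1))])
y = X @ np.array([1.0, 3.0]) + rng.normal(size=n)
data = {"x": X, "y": y}

theta0 = np.zeros(p)
g_bar = moment_fn(theta0, data).mean(axis=0)   # averaged moments, (m,)
G = jacobian_fn(theta0, data)
# One Gauss-Newton step solves the just-identified linear case exactly:
theta = theta0 - np.linalg.solve(G, g_bar)
```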
3 Public Surface
The table below is generated from the live module so the constructor and method signatures match the current build.
minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)
4 Optimizer Surface
The optimizer helpers live under cm.Optimizers as static methods. They mirror the usual scipy pattern: pass a Python callback, a starting value, and any derivative information the method needs, then receive a plain dictionary with x, fun, nit, success, message, and method.
optimizer_rows = []
for name in sorted(n for n in dir(cm.Optimizers) if n.startswith("minimize_")):
    fn = getattr(cm.Optimizers, name)
    optimizer_rows.append(
        [f"<code>Optimizers.{escape(name)}{escape(str(inspect.signature(fn)))}</code>"]
    )
display(HTML(html_table(["Function"], optimizer_rows)))
Function
Optimizers.minimize_bfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_gauss_newton_ls(residual_fn, x0, jacobian_fn, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_lbfgs(fun, x0, grad, max_iterations=100, tolerance=1e-06)
Optimizers.minimize_nonlinear_cg(fun, x0, grad, max_iterations=100, restart_iters=10, restart_orthogonality=0.1, tolerance=1e-06)
Optimizers.minimize_simulated_annealing(fun, x0, lower=None, upper=None, temp=15.0, step_size=0.1, max_iterations=5000, seed=None)
5 Usage Notes
PCA and KernelBasis fit on x alone and return transformed feature matrices that can be passed into the regression estimators.
The Optimizers methods return plain dictionaries rather than scipy result objects, but the schema is intentionally similar.
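To illustrate that schema, here is a toy gradient-descent stand-in that returns the same result keys (the names below are the documented keys; the descent loop itself is not how the Rust optimizers work):

```python
import numpy as np

def toy_minimize(fun, x0, grad, max_iterations=100, tolerance=1e-06, lr=0.1):
    # Plain gradient descent that mirrors the result-dict schema the
    # cm.Optimizers helpers are documented to return: x, fun, nit,
    # success, message, and method.
    x = np.asarray(x0, dtype=float)
    for nit in range(1, max_iterations + 1):
        g = np.asarray(grad(x))
        if np.linalg.norm(g) < tolerance:
            return {"x": x, "fun": fun(x), "nit": nit, "success": True,
                    "message": "converged", "method": "toy_gradient_descent"}
        x = x - lr * g
    return {"x": x, "fun": fun(x), "nit": max_iterations, "success": False,
            "message": "max iterations reached", "method": "toy_gradient_descent"}

# Minimize a simple quadratic with its analytic gradient.
res = toy_minimize(lambda x: float(x @ x), np.array([1.0, -2.0]), lambda x: 2.0 * x)
```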
FixedEffectsOLS.fit(x, fe, y) expects fe to be a 2D uint32 matrix of zero-based category codes, one column per factor.
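Zero-based codes of that shape can be built from raw labels with np.unique (a usage sketch; the firm/year labels are made up):

```python
import numpy as np

# Hypothetical raw labels for two categorical factors (firm and year).
firm = np.array(["a", "b", "a", "c"])
year = np.array([2020, 2021, 2020, 2021])

# One column of zero-based category codes per factor, as the fe argument expects.
fe = np.column_stack([
    np.unique(firm, return_inverse=True)[1],
    np.unique(year, return_inverse=True)[1],
]).astype(np.uint32)
```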
SyntheticControl.fit(donors, treated) expects donor outcomes in an (n_periods, n_donors) matrix and a treated pre-period outcome vector of length n_periods.
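The simplex-constrained fit that this implies can be sketched with projected gradient descent (an illustrative solver on simulated data, not the library's implementation):

```python
import numpy as np

def project_simplex(v):
    # Euclidean projection onto {w : w >= 0, sum(w) = 1} (sort-based algorithm).
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - 1.0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def synth_weights(donors, treated, steps=4000, lr=0.005):
    # Projected gradient descent on ||donors @ w - treated||^2 over the simplex.
    w = np.full(donors.shape[1], 1.0 / donors.shape[1])
    for _ in range(steps):
        grad = 2.0 * donors.T @ (donors @ w - treated)
        w = project_simplex(w - lr * grad)
    return w

rng = np.random.default_rng(2)
donors = rng.normal(size=(30, 4))        # (n_periods, n_donors)
true_w = np.array([0.5, 0.3, 0.2, 0.0])
treated = donors @ true_w                # treated path lies in the donor hull
w = synth_weights(donors, treated)
```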
BalancingWeights.fit(covariates, target_covariates, baseline_weights=None, target_weights=None) estimates calibration weights on the first matrix so its weighted covariate mean matches the target mean implied by the second matrix.
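A minimal sketch of the entropy-weights case, solved by Newton's method on the dual (illustrative; the class's actual solver, options, and defaults may differ):

```python
import numpy as np

def entropy_balance(X, target_mean, iters=50):
    # Newton's method on the dual of the entropy-balancing problem:
    # weights w_i proportional to exp(x_i' beta), chosen so the weighted
    # covariate mean equals target_mean and the weights sum to one.
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        eta = X @ beta
        w = np.exp(eta - eta.max())     # stabilized softmax weights
        w /= w.sum()
        m = X.T @ w                     # current weighted mean
        grad = m - target_mean
        H = (X * w[:, None]).T @ X - np.outer(m, m)  # weighted covariance
        beta = beta - np.linalg.solve(H + 1e-12 * np.eye(beta.size), grad)
    eta = X @ beta
    w = np.exp(eta - eta.max())
    return w / w.sum(), beta

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 2))
target = np.array([0.3, -0.2])          # target covariate mean to match
w, beta = entropy_balance(X, target)
```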
EPLM.fit(y, d, w) and AverageDerivative.fit(y, d, w) currently target a scalar continuous treatment and a 2D control matrix.
PartiallyLinearDML.fit(y, d, x) targets a scalar continuous treatment with cross-fit ridge nuisances.
AIPW.fit(y, d, x) targets a binary treatment coded as 0/1 and estimates the ATE with cross-fit ridge nuisance models.
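The AIPW score itself is standard; the sketch below uses known (oracle) nuisances on simulated data in place of the class's cross-fit ridge fits:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-x))            # true propensity score
d = (rng.uniform(size=n) < p_true).astype(float)
mu0, mu1 = x, x + 2.0                        # true arm-specific regressions; ATE = 2
y = np.where(d == 1.0, mu1, mu0) + rng.normal(size=n)

p_hat = np.clip(p_true, 0.01, 0.99)          # clip propensities away from 0 and 1
score = (mu1 - mu0
         + d * (y - mu1) / p_hat
         - (1.0 - d) * (y - mu0) / (1.0 - p_hat))
ate = score.mean()                           # doubly robust ATE estimate
se = score.std(ddof=1) / np.sqrt(n)          # plug-in standard error
```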
Logit, MultinomialLogit, and FTRL expect integer class labels at fit time.
Most estimators store their training data internally so summary() and bootstrap() can be called after fit().
OLS, Ridge, ElasticNet, Logit, MultinomialLogit, Poisson, and TwoSLS now always fit an intercept.
FixedEffectsOLS, EPLM, AverageDerivative, PartiallyLinearDML, AIPW, GMM, and MEstimator deliberately have no predict() method. They are estimation-first interfaces.
When an estimator has an intercept, bootstrap draws place it in the first column.
summary() returns dictionaries, not rich result objects. Downstream code should index by key.
bootstrap(n_bootstrap, seed=None) now accepts an omitted seed across the full public surface.
OLS, Ridge, FixedEffectsOLS, and TwoSLS now share the same robust-inference surface: summary(vcov="vanilla" | "hc1" | "newey_west" | "cluster", lags=None, clusters=None).
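Those covariance options follow standard sandwich formulas; a numpy sketch of the hc1 and cluster cases (textbook formulas, not the library's code):

```python
import numpy as np

def linear_vcov(X, resid, vcov="hc1", clusters=None):
    # Sandwich covariance for OLS coefficients; X includes the intercept column.
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    if vcov == "hc1":
        meat = (X * resid[:, None] ** 2).T @ X
        return n / (n - k) * bread @ meat @ bread
    if vcov == "cluster":
        # Sum scores within each cluster before taking outer products.
        groups = np.unique(clusters)
        meat = np.zeros((k, k))
        for g in groups:
            s = X[clusters == g].T @ resid[clusters == g]
            meat += np.outer(s, s)
        G = groups.size
        adj = G / (G - 1) * (n - 1) / (n - k)
        return adj * bread @ meat @ bread
    raise ValueError(f"unknown vcov: {vcov}")

rng = np.random.default_rng(5)
n = 1000
X = np.hstack([np.ones((n, 1)), rng.normal(size=(n, 1))])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n) * (1.0 + np.abs(X[:, 1]))  # heteroskedastic
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
se_hc1 = np.sqrt(np.diag(linear_vcov(X, resid, "hc1")))
clusters = rng.integers(0, 50, n)
se_cl = np.sqrt(np.diag(linear_vcov(X, resid, "cluster", clusters)))
```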
6 Estimator Reference
reference_rows = [
    ["<code>OLS</code>", "<code>fit(x, y)</code>", "1D float predictions", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>", "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>", "Uses Linfa linear regression; <code>summary(vcov='hc1')</code> is default, and the linear-summary surface also accepts <code>'vanilla'</code>, <code>'newey_west'</code>, and <code>'cluster'</code>."],
    ["<code>Ridge</code>", "<code>fit(x, y)</code>", "1D float predictions", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>, <code>penalty</code>, <code>penalties</code>, <code>best_penalty_index</code>, <code>cv_mse</code>, <code>intercept_path</code>, <code>coef_path</code>", "<code>(B, p + 1)</code> with intercept", "Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS."],
    ["<code>FixedEffectsOLS</code>", "<code>fit(x, fe_uint32, y)</code>", "No dedicated <code>predict()</code> method", "<code>coef</code>, <code>coef_se</code>", "<code>(B, p)</code>", "Uses <code>within</code> to partial out fixed effects, then Linfa OLS with <code>fit_intercept=False</code> on the residualized design. The summary covariance options match the other linear estimators."],
    ["<code>SyntheticControl</code>", "<code>fit(donors, treated)</code>", "1D float synthetic path from donor outcomes", "<code>weights</code>, <code>pre_rmse</code>", "<code>(B, n_donors)</code>", "Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint."],
    ["<code>BalancingWeights</code>", "<code>fit(covariates, target_covariates, baseline_weights=None, target_weights=None)</code>", "No dedicated <code>predict()</code> method", "<code>weights</code>, <code>beta</code>, <code>weighted_mean</code>, <code>target_mean</code>, <code>mean_diff</code>, <code>l2_diff</code>, <code>effective_sample_size</code>, <code>success</code>", "No bootstrap method", "Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean."],
    ["<code>EPLM</code>", "<code>fit(y, d, w)</code>", "No dedicated <code>predict()</code> method", "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>nuisance_coef</code>", "No bootstrap method", "Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for <code>E[D|W]</code>."],
    ["<code>AverageDerivative</code>", "<code>fit(y, d, w)</code>", "No dedicated <code>predict()</code> method", "<code>method</code>, <code>coef</code>, <code>se</code>, <code>vcov</code>", "No bootstrap method", "Graham-Pinto style average-derivative estimator with <code>method='ob' | 'ipw' | 'dr'</code> under a scalar continuous-treatment working model."],
    ["<code>PartiallyLinearDML</code>", "<code>fit(y, d, x)</code>", "No dedicated <code>predict()</code> method", "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>outcome_penalties</code>, <code>treatment_penalties</code>", "No bootstrap method", "Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid."],
    ["<code>AIPW</code>", "<code>fit(y, d, x)</code>", "No dedicated <code>predict()</code> method", "<code>ate</code>, <code>se</code>, <code>vcov</code>, <code>outcome0_penalties</code>, <code>outcome1_penalties</code>, <code>propensity_penalties</code>", "No bootstrap method", "Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1."],
    ["<code>ElasticNet</code>", "<code>fit(x, y)</code>", "1D float predictions", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>", "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>", "Regularized point estimates with HC1-style covariance built from residuals."],
    ["<code>Logit</code>", "<code>fit(x, y_int32)</code>", "1D <code>int32</code> class labels", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>", "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>", "Binary logistic regression. Standard errors come from the Fisher information."],
    ["<code>MultinomialLogit</code>", "<code>fit(x, y_int32)</code>", "1D <code>int32</code> class labels", "<code>coef</code> and <code>se</code> matrices of shape <code>(classes, features_with_intercept)</code>", "<code>(B, classes * features_with_intercept)</code>", "Intercept and slopes are packed together by class in the summary matrix."],
    ["<code>Poisson</code>", "<code>fit(x, y)</code>", "1D float mean counts", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>", "<code>(B, p + 1)</code> with intercept, else <code>(B, p)</code>", "Custom Newton-CG Poisson MLE with optional L2 penalty via <code>alpha</code>; <code>summary(vcov='vanilla')</code> gives Fisher SEs and <code>summary(vcov='sandwich')</code> gives QMLE sandwich SEs."],
    ["<code>TwoSLS</code>", "<code>fit(x_endog, x_exog, z, y)</code>", "1D float predictions from the stage-two design matrix", "<code>intercept</code>, <code>coef</code>, <code>intercept_se</code>, <code>coef_se</code>", "<code>(B, k + 1)</code> with intercept, else <code>(B, k)</code>", "Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same <code>vanilla</code>, <code>hc1</code>, <code>newey_west</code>, and <code>cluster</code> covariance interface as the other linear estimators."],
    ["<code>FTRL</code>", "<code>fit(x, y_int32)</code>", "1D float scores in <code>[0, 1]</code>", "<code>coef</code>, <code>coef_se</code>", "<code>(B, p)</code>", "Online-style classification model with a different summary schema and no explicit intercept output."],
    ["<code>GMM</code>", "<code>fit(data_dict, theta0, weighting='auto')</code>", "No dedicated <code>predict()</code> method", "<code>coef</code>, <code>se</code>, <code>vcov</code>, <code>criterion</code>, <code>nit</code>, <code>weighting</code>, <code>vcov_type</code>, <code>omega_type</code>, <code>weight_matrix</code>, <code>nobs</code>, <code>n_moments</code>, <code>j_stat</code>, <code>j_df</code>", "No bootstrap method", "Requires <code>moment_fn(theta, data)</code> returning an <code>(n, m)</code> matrix of per-observation moments and optionally <code>jacobian_fn(theta, data)</code> returning the sample Jacobian of shape <code>(m, p)</code>."],
    ["<code>MEstimator</code>", "<code>fit(data_dict, theta0)</code>", "No dedicated <code>predict()</code> method", "<code>coef</code>, <code>se</code>", "<code>(B, p)</code>", "Requires <code>objective_fn(theta, data) -> (obj, grad)</code> and <code>score_fn(theta, data)</code> returning an <code>(n, p)</code> matrix."],
]
display(
    HTML(
        "<div style='width: 100%; overflow-x: auto;'>"
        + html_table(
            ["Estimator", "Fit", "Predict", "Summary Keys", "Bootstrap", "Notes"],
            reference_rows,
            table_attrs="style='width: 100%; min-width: 100%; table-layout: auto;'",
        )
        + "</div>"
    )
)
Estimator
Fit
Predict
Summary Keys
Bootstrap
Notes
OLS
fit(x, y)
1D float predictions
intercept, coef, intercept_se, coef_se
(B, p + 1) with intercept, else (B, p)
Uses Linfa linear regression; summary(vcov='hc1') is default, and the linear-summary surface also accepts 'vanilla', 'newey_west', and 'cluster'.
Ridge
fit(x, y)
1D float predictions
intercept, coef, intercept_se, coef_se, penalty, penalties, best_penalty_index, cv_mse, intercept_path, coef_path
(B, p + 1) with intercept
Closed-form L2-regularized least squares solved through an augmented QR least-squares system. Passing a penalty grid runs cross-validation, stores the full coefficient path, refits the selected penalty on the full sample, and exposes the same linear-summary covariance options as OLS.
FixedEffectsOLS
fit(x, fe_uint32, y)
No dedicated predict() method
coef, coef_se
(B, p)
Uses within to partial out fixed effects, then Linfa OLS with fit_intercept=False on the residualized design. The summary covariance options match the other linear estimators.
SyntheticControl
fit(donors, treated)
1D float synthetic path from donor outcomes
weights, pre_rmse
(B, n_donors)
Fits nonnegative donor weights that sum to one by minimizing pre-treatment squared error under a simplex constraint.
BalancingWeights
fit(covariates, target_covariates, baseline_weights=None, target_weights=None)
No dedicated predict() method
weights, beta, weighted_mean, target_mean, mean_diff, l2_diff, effective_sample_size, success
No bootstrap method
Dual calibration solver for entropy and quadratic balancing weights. Intended for ATT-style reweighting, covariate shift, and domain adaptation tasks where one sample is reweighted to match another sample's covariate mean.
EPLM
fit(y, d, w)
No dedicated predict() method
coef, se, vcov, nuisance_coef
No bootstrap method
Robins-Newey style partially linear E-estimator implemented as an exactly identified stacked-moment system with a linear working model for E[D|W].
AverageDerivative
fit(y, d, w)
No dedicated predict() method
method, coef, se, vcov
No bootstrap method
Graham-Pinto style average-derivative estimator with method='ob' | 'ipw' | 'dr' under a scalar continuous-treatment working model.
PartiallyLinearDML
fit(y, d, x)
No dedicated predict() method
coef, se, vcov, outcome_penalties, treatment_penalties
No bootstrap method
Cross-fit ridge implementation of the partially linear Double-ML score. The nuisance fits are selected fold by fold from a common ridge penalty grid.
AIPW
fit(y, d, x)
No dedicated predict() method
ate, se, vcov, outcome0_penalties, outcome1_penalties, propensity_penalties
No bootstrap method
Cross-fit ridge AIPW estimator for a binary-treatment ATE. Outcome regressions are fit separately by treatment arm and the ridge propensity fit is clipped away from 0 and 1.
ElasticNet
fit(x, y)
1D float predictions
intercept, coef, intercept_se, coef_se
(B, p + 1) with intercept, else (B, p)
Regularized point estimates with HC1-style covariance built from residuals.
Logit
fit(x, y_int32)
1D int32 class labels
intercept, coef, intercept_se, coef_se
(B, p + 1) with intercept, else (B, p)
Binary logistic regression. Standard errors come from the Fisher information.
MultinomialLogit
fit(x, y_int32)
1D int32 class labels
coef and se matrices of shape (classes, features_with_intercept)
(B, classes * features_with_intercept)
Intercept and slopes are packed together by class in the summary matrix.
Poisson
fit(x, y)
1D float mean counts
intercept, coef, intercept_se, coef_se
(B, p + 1) with intercept, else (B, p)
Custom Newton-CG Poisson MLE with optional L2 penalty via alpha; summary(vcov='vanilla') gives Fisher SEs and summary(vcov='sandwich') gives QMLE sandwich SEs.
TwoSLS
fit(x_endog, x_exog, z, y)
1D float predictions from the stage-two design matrix
intercept, coef, intercept_se, coef_se
(B, k + 1) with intercept, else (B, k)
Closed-form linear IV / 2SLS estimator. Supports multiple endogenous regressors, and its summary method now shares the same vanilla, hc1, newey_west, and cluster covariance interface as the other linear estimators.
FTRL
fit(x, y_int32)
1D float scores in [0, 1]
coef, coef_se
(B, p)
Online-style classification model with a different summary schema and no explicit intercept output.
GMM
fit(data_dict, theta0, weighting='auto')
No dedicated predict() method
coef, se, vcov, criterion, nit, weighting, vcov_type, omega_type, weight_matrix, nobs, n_moments, j_stat, j_df
No bootstrap method
Requires moment_fn(theta, data) returning an (n, m) matrix of per-observation moments and optionally jacobian_fn(theta, data) returning the sample Jacobian of shape (m, p).
MEstimator
fit(data_dict, theta0)
No dedicated predict() method
coef, se
(B, p)
Requires objective_fn(theta, data) -> (obj, grad) and score_fn(theta, data) returning an (n, p) matrix.
7 Transformer Reference
transformer_rows = [
    ["<code>PCA</code>", "<code>fit(x)</code>", "<code>transform(x)</code>, <code>fit_transform(x)</code>, <code>inverse_transform(scores)</code>", "<code>n_components</code>, <code>n_features</code>, <code>whiten</code>, <code>components</code>, <code>mean</code>, <code>explained_variance</code>, <code>explained_variance_ratio</code>, <code>singular_values</code>", "Principal-components basis estimated from <code>x</code> alone. Useful for principal-components regression and other low-rank pipelines."],
    ["<code>KernelBasis</code>", "<code>fit(x)</code>", "<code>transform(x)</code>, <code>fit_transform(x)</code>", "<code>kernel</code>, <code>n_train</code>, <code>n_features</code>, <code>bandwidth</code>, <code>coef0</code>, <code>degree</code>, <code>diagonal</code>", "Stores the fitted training design and returns kernel features against that basis, so downstream regressions can work on nonlinear right-hand sides."],
]
display(
    HTML(
        html_table(
            ["Transformer", "Fit", "Transform", "Summary Keys", "Notes"],
            transformer_rows,
        )
    )
)
8 Optimizer Reference
Optimizers.minimize_gauss_newton_ls
residual_fn(theta) -> residual vector
x, fun, nit, success, message, method
Least-squares Gauss-Newton routine for moment or residual systems.
Optimizers.minimize_simulated_annealing
fun(theta) -> float
x, fun, nit, success, message, method
Stochastic global optimizer with optional box bounds, temperature, and RNG seed.
9 Verified Runtime Snapshot
The next chunk fits each estimator on synthetic data and records the key summary schema and bootstrap output shape. This acts as a lightweight executable check that the reference above matches the current build.
This estimator follows the Frisch-Waugh-Lovell path directly: partial out the fixed effects with within, then regress the residualized outcome on the residualized regressors with no intercept.
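That equivalence can be checked directly in numpy; the sketch below uses a hypothetical single-factor within helper and compares the demeaned regression against the dummy-variable benchmark:

```python
import numpy as np

def within(v, codes):
    # Demean a vector or matrix by group for a single categorical factor.
    v = np.asarray(v, dtype=float)
    out = v.copy()
    for g in np.unique(codes):
        mask = codes == g
        out[mask] -= v[mask].mean(axis=0)
    return out

rng = np.random.default_rng(6)
n = 300
codes = rng.integers(0, 10, n)
x = rng.normal(size=(n, 1)) + 0.3 * codes[:, None]   # regressor correlated with the factor
y = 2.0 * x[:, 0] + 1.5 * codes + rng.normal(size=n)

# FWL path: regress demeaned y on demeaned x with no intercept.
beta_fwl, *_ = np.linalg.lstsq(within(x, codes), within(y, codes), rcond=None)

# Dummy-variable benchmark: one indicator column per group, no separate intercept.
D = (codes[:, None] == np.unique(codes)).astype(float)
beta_full, *_ = np.linalg.lstsq(np.hstack([x, D]), y, rcond=None)
```

By Frisch-Waugh-Lovell the slope on x agrees between the two fits up to floating-point noise.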
There is still no rich result object layer. Consumers should expect raw dictionaries and NumPy arrays.
Most estimators do not expose fit statistics or formula handling.
GMM is intentionally narrow: closed-form linear IV still lives in TwoSLS, and the generic class only does just-identified Gauss-Newton or two-step overidentified weighting.
MEstimator favors flexibility over guardrails. If your objective has fragile regions, add clipping or other stability controls inside the callback.
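For example, a logit-style objective_fn might clip the linear index before exponentiating (a hypothetical callback; the safeguard pattern is the point, not the exact names):

```python
import numpy as np

def objective_fn(theta, data):
    # Negative log-likelihood and gradient for a binary logit, with clipping
    # so extreme linear indices cannot overflow exp() or feed log(0).
    X, y = data["x"], data["y"]
    eta = np.clip(X @ theta, -30.0, 30.0)       # safeguard the linear index
    p = 1.0 / (1.0 + np.exp(-eta))
    p = np.clip(p, 1e-12, 1.0 - 1e-12)          # safeguard the probabilities
    obj = -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    grad = X.T @ (p - y)
    return obj, grad

rng = np.random.default_rng(7)
X = np.hstack([np.ones((200, 1)), 10.0 * rng.normal(size=(200, 1))])  # badly scaled regressor
y = (X[:, 1] > 0).astype(float)
obj, grad = objective_fn(np.array([0.0, 50.0]), {"x": X, "y": y})  # wild theta stays finite
```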