crabbymetrics
crabbymetrics is a Rust-backed econometrics library with a compact Python API. The docs are organized around the public API, a single binding crash course, an active First Course Ding translation track, supervised and semiparametric example pages, unsupervised transform notes, focused numerical ablations, and a small set of supporting internals pages.
Start Here
- API reference: verified public surface, summary schemas, optimizer catalog, and runtime smoke checks.
- First Course Ding: chapter-by-chapter translation plan and current implementation status for the Peng Ding notebooks.
- Binding Crash Course: OLS: the shortest end-to-end walkthrough of the Rust-to-Python wrapper pattern in this codebase.
First Course Ding
- Overview And TOC: planning page and chapter map for the full Ding translation pass.
- Translated so far: Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 9, Chapter 11, Chapter 12, Chapter 13, Chapter 21, and Chapter 23.
- The next obvious tranche is Chapters 15 through 20, where matching, sensitivity analysis, and RD helpers start to matter.
Supervised Learning Examples
- OLS: baseline linear regression with switchable vanilla and HC1 covariance estimators.
- Ridge: closed-form L2-regularized least squares with a scalar penalty or a penalty grid plus cross-validation.
- Fixed Effects OLS: partial out one-way or multi-way categorical fixed effects with within, then estimate slopes without an intercept.
- ElasticNet: regularized linear regression.
- Synthetic Control: simplex-constrained donor weighting for treated-versus-donor panel matching under latent-factor drift.
- Logit: binary logistic regression.
- Multinomial Logit: multiclass classification.
- Poisson: count regression, with model-based or QMLE sandwich inference available from summary(vcov=...).
- TwoSLS: instrumental variables regression, including multiple endogenous regressors and multiple excluded instruments.
- GMM: fit just-identified score equations, overidentified two-step IV moments, and stacked nuisance-parameter moments with the first-class GMM estimator.
- FTRL: online-style binary classification.
- MEstimator Poisson: callback-driven estimation matched against the built-in Poisson estimator.
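As a rough illustration of what switching between vanilla and HC1 covariance means for the OLS entry above, here is a minimal pure-Python sketch for a single-regressor model with an intercept. It does not use the crabbymetrics API; the function name simple_ols and its vcov argument are hypothetical and chosen only for this example.

```python
def simple_ols(x, y, vcov="vanilla"):
    """Fit y = a + b*x by least squares and return (a, b, se_b).

    Hypothetical helper for illustration only; not part of crabbymetrics.
    vcov="vanilla" uses the homoskedastic formula, vcov="HC1" uses the
    heteroskedasticity-robust sandwich with the n / (n - k) correction.
    """
    n = len(x)
    k = 2  # intercept and slope
    if n <= k:
        raise ValueError("need more observations than parameters")

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)

    # Point estimates are the same under either covariance choice.
    b = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]

    if vcov == "vanilla":
        sigma2 = sum(e * e for e in resid) / (n - k)
        var_b = sigma2 / sxx
    elif vcov == "HC1":
        # Slope entry of (X'X)^-1 X' diag(e^2) X (X'X)^-1, scaled by n / (n - k).
        meat = sum(d * d * e * e for d, e in zip(dx, resid))
        var_b = (n / (n - k)) * meat / (sxx * sxx)
    else:
        raise ValueError("vcov must be 'vanilla' or 'HC1'")

    return a, b, var_b ** 0.5


x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 4.0]
a_v, b_v, se_vanilla = simple_ols(x, y, vcov="vanilla")
a_r, b_r, se_hc1 = simple_ols(x, y, vcov="HC1")
```

Only the standard error changes between the two calls; the coefficients are identical because the covariance choice affects inference, not the fit.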
Semiparametric Examples
- Balancing Weights: entropy and quadratic calibration weights for ATT-style reweighting, covariate shift, and domain adaptation.
- EPLM: the Robins-Newey partially linear E-estimator, implemented as a stacked-moment estimator for a scalar continuous treatment.
- Average Derivative: Oaxaca-Blinder, generalized IPW, and doubly robust average-derivative estimators in the Graham-Pinto style under a continuous-treatment working model.
- Double ML And AIPW: cross-fit ridge nuisance estimation for partially linear DML and binary-treatment AIPW.
- Richer Regression With Transformer Pipelines: a longer semiparametric-style example built on KernelBasis and PCA.
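For the binary-treatment AIPW entry above, the final averaging step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the nuisance predictions (outcome regressions mu1, mu0 and propensity scores e) are taken as given, whereas the example page obtains them from cross-fit ridge fits. The function name aipw_ate is hypothetical and is not the crabbymetrics API.

```python
def aipw_ate(y, d, mu1, mu0, e):
    """Average the AIPW scores and return (ate, se).

    Hypothetical helper for illustration only.
    y   : observed outcomes
    d   : binary treatment indicators (0 or 1)
    mu1 : predicted outcomes under treatment
    mu0 : predicted outcomes under control
    e   : predicted propensity scores, strictly between 0 and 1
    """
    n = len(y)
    scores = []
    for yi, di, m1, m0, ei in zip(y, d, mu1, mu0, e):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Regression contrast plus inverse-propensity-weighted residual corrections.
        psi = m1 - m0 + di * (yi - m1) / ei - (1 - di) * (yi - m0) / (1.0 - ei)
        scores.append(psi)

    ate = sum(scores) / n
    # Standard error from the sample variance of the scores.
    var = sum((s - ate) ** 2 for s in scores) / (n - 1)
    return ate, (var / n) ** 0.5


y = [3.0, 1.0, 4.0, 2.0]
d = [1, 0, 1, 0]
mu1 = [3.0, 3.0, 3.0, 3.0]
mu0 = [1.0, 1.0, 1.0, 1.0]
e = [0.5, 0.5, 0.5, 0.5]
ate, se = aipw_ate(y, d, mu1, mu0, e)
```

The estimate stays consistent if either the outcome models or the propensity model is correctly specified, which is the doubly robust property these pages rely on.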
Unsupervised Learning Examples
- PCA And Kernel Basis: transformer examples for richer design matrices and nonlinear geometry.
Ablations
- Variance Estimators: cached Monte Carlo coverage experiments for OLS, Poisson, and GMM variance estimators under heteroskedasticity or overdispersion.
- Semiparametric Estimator Comparisons: cached Monte Carlo comparisons for EPLM versus partially linear DML under continuous-treatment misspecification, and for vanilla regression versus balancing weights versus AIPW under binary treatment.
- Bridging Finite And Superpopulation: cached Monte Carlo coverage comparisons for HC2, Ding’s closed-form correction, a raw-row bootstrap, and stacked GMM when the same adjusted estimator is evaluated against SATE and PATE.
Optimization
- Optimizers: direct optimizer usage for smooth likelihoods, rougher objective surfaces, and solver behavior comparisons.
- GMM With Optimizers: the lower-level notebook that motivated the first-class GMM estimator and still shows the residual-collection view directly.
Supporting Pages
- Binding Internals: Poisson: a deeper built-in-estimator walkthrough after the OLS crash course.
- Binding Internals: MEstimator: the callback-heavy bridge where Rust owns optimization and Python supplies the objective and scores.
Runtime Snapshot
Figure: runtimes for crabbymetrics, scikit-learn, and statsmodels. Lower is faster.
The benchmark figure is generated from the repo-level benchmark assets in benchmarks/ and gives a quick scale check for the estimators that overlap cleanly with mainstream Python baselines.
Notes
- api.qmd remains the main documentation page and renders to api.html.
- The site is a Quarto website, so shared navigation and search are generated under docs/.
- All pages are rendered with embedded resources so the checked-in HTML files remain self-contained.