crabbymetrics

crabbymetrics logo

crabbymetrics is a Rust-backed econometrics library with a compact Python API. The docs are organized around the public API, a single binding crash course, an active First Course Ding translation track, supervised and semiparametric example pages, unsupervised transform notes, focused numerical ablations, and a small set of supporting internals pages.

Start Here

  • API reference: verified public surface, summary schemas, optimizer catalog, and runtime smoke checks.
  • First Course Ding: chapter-by-chapter translation plan and current implementation status for the Peng Ding notebooks.
  • Binding Crash Course: OLS: the shortest end-to-end walkthrough of the Rust-to-Python wrapper pattern in this codebase.

First Course Ding

  • Overview And TOC: planning page and chapter map for the full Ding translation pass.
  • Translated so far: Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 9, Chapter 11, Chapter 12, Chapter 13, Chapter 21, and Chapter 23.
  • The next obvious tranche is Chapters 15 through 20, where matching, sensitivity analysis, and RD helpers start to matter.
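To give a flavor of what the translated chapters cover, here is a minimal numpy sketch of the Fisher randomization test from Chapter 3 on simulated completely randomized data. This illustrates the method itself, not the crabbymetrics API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy completely randomized experiment: 50 treated, 50 control.
n, n_treat = 100, 50
z = np.zeros(n, dtype=int)
z[:n_treat] = 1
rng.shuffle(z)
y = 1.0 * z + rng.normal(size=n)  # true treatment effect of 1

def diff_in_means(y, z):
    return y[z == 1].mean() - y[z == 0].mean()

# Fisher randomization test of the sharp null Y(1) = Y(0):
# re-randomize the treatment labels and recompute the statistic.
obs = diff_in_means(y, z)
draws = np.array([diff_in_means(y, rng.permutation(z)) for _ in range(2000)])
p_value = (np.abs(draws) >= np.abs(obs)).mean()
```

With a true effect of one standard deviation, the observed difference sits far out in the randomization distribution and the p-value is essentially zero.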

Supervised Learning Examples

  • OLS: baseline linear regression with switchable vanilla and HC1 covariance estimators.
  • Ridge: closed-form L2-regularized least squares with a scalar penalty or a penalty grid plus cross-validation.
  • Fixed Effects OLS: partial out one-way or multi-way categorical fixed effects via the within transformation, then estimate slopes without an intercept.
  • ElasticNet: linear regression with a combined L1/L2 penalty.
  • Synthetic Control: simplex-constrained donor weighting for treated-versus-donor panel matching under latent-factor drift.
  • Logit: binary logistic regression.
  • Multinomial Logit: multiclass classification with a softmax link.
  • Poisson: count regression, with model-based or QMLE sandwich inference available from summary(vcov=...).
  • TwoSLS: instrumental variables regression, including multiple endogenous regressors and multiple excluded instruments.
  • GMM: fit just-identified score equations, overidentified two-step IV moments, and stacked nuisance-parameter moments with the first-class GMM estimator.
  • FTRL: follow-the-regularized-leader online binary classification.
  • MEstimator Poisson: callback-driven estimation matched against the built-in Poisson estimator.
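The vanilla-versus-HC1 distinction on the OLS page comes down to the sandwich formula. A numpy sketch of both covariance estimators under heteroskedasticity (the underlying math only, not crabbymetrics calls):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated heteroskedastic data: the error scale grows with |x|.
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))
X = np.column_stack([np.ones(n), x])

# OLS point estimates and residuals.
beta = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)

# Vanilla (homoskedastic) covariance: sigma^2 (X'X)^{-1}.
vcov_vanilla = resid @ resid / (n - k) * XtX_inv

# HC1 sandwich: (n/(n-k)) (X'X)^{-1} X' diag(e^2) X (X'X)^{-1}.
meat = X.T @ (X * resid[:, None] ** 2)
vcov_hc1 = n / (n - k) * XtX_inv @ meat @ XtX_inv

se_vanilla = np.sqrt(np.diag(vcov_vanilla))
se_hc1 = np.sqrt(np.diag(vcov_hc1))
```

Because the error variance rises with |x|, the HC1 standard error for the slope exceeds the vanilla one, which is exactly the miscalibration the Variance Estimators ablation measures.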

Semiparametric Examples

  • Balancing Weights: entropy and quadratic calibration weights for ATT-style reweighting, covariate shift, and domain adaptation.
  • EPLM: the Robins-Newey partially linear E-estimator, implemented as a stacked-moment estimator for a scalar continuous treatment.
  • Average Derivative: Oaxaca-Blinder, generalized IPW, and doubly robust average-derivative estimators in the Graham-Pinto style under a continuous-treatment working model.
  • Double ML And AIPW: cross-fit ridge nuisance estimation for partially linear DML and binary-treatment AIPW.
  • Richer Regression With Transformer Pipelines: a longer semiparametric-style example built on KernelBasis and PCA transformers.
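The AIPW estimator on the Double ML And AIPW page is built from the doubly robust score. A numpy sketch with oracle nuisances standing in for the cross-fit ridge fits used on that page (illustrative only, not the library API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated observational data with confounding through x.
n = 2000
x = rng.normal(size=n)
propensity = 1 / (1 + np.exp(-x))    # true e(x)
t = rng.binomial(1, propensity)
y = 1.5 * t + x + rng.normal(size=n)  # true ATE = 1.5

# Oracle nuisances; in practice these come from cross-fit models.
m1 = 1.5 + x   # E[Y | T=1, x]
m0 = x         # E[Y | T=0, x]
e = propensity

# AIPW / doubly robust score for the ATE, then its mean and plug-in SE.
psi = m1 - m0 + t * (y - m1) / e - (1 - t) * (y - m0) / (1 - e)
ate_hat = psi.mean()
se_hat = psi.std(ddof=1) / np.sqrt(n)
```

The estimator stays consistent if either the outcome models or the propensity model is correct, which is the property the semiparametric ablations stress-test.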

Unsupervised Learning Examples

  • PCA And Kernel Basis: transformer examples for richer design matrices and nonlinear geometry.
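Underneath the PCA transformer is an SVD of the centered design matrix. A self-contained numpy sketch of that decomposition (the math, not crabbymetrics's transformer interface):

```python
import numpy as np

rng = np.random.default_rng(0)

# Low-rank data: two dominant latent directions plus small noise.
n, p = 300, 5
latent = rng.normal(size=(n, 2))
loadings = rng.normal(size=(2, p))
X = latent @ loadings + 0.1 * rng.normal(size=(n, p))

# PCA via SVD of the centered design matrix.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = s**2 / (s**2).sum()

# Scores on the first two components give a reduced design matrix.
scores = Xc @ Vt[:2].T
```

With rank-2 structure and light noise, the first two components capture nearly all of the variance, and the component rows of `Vt` are orthonormal.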

Ablations

  • Variance Estimators: cached Monte Carlo coverage experiments for OLS, Poisson, and GMM variance estimators under heteroskedasticity or overdispersion.
  • Semiparametric Estimator Comparisons: cached Monte Carlo comparisons for EPLM versus partially linear DML under continuous-treatment misspecification, and for vanilla regression versus balancing weights versus AIPW under binary treatment.
  • Bridging Finite And Superpopulation: cached Monte Carlo coverage comparisons for HC2, Ding’s closed-form correction, a raw-row bootstrap, and stacked GMM when the same adjusted estimator is evaluated against SATE and PATE.
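The coverage ablations boil down to Monte Carlo loops like the following numpy sketch, which compares vanilla and HC1 confidence-interval coverage under heteroskedasticity (a toy version, not the cached experiments themselves):

```python
import numpy as np

rng = np.random.default_rng(0)

def one_rep(n=200):
    # Heteroskedastic DGP: vanilla OLS standard errors are miscalibrated.
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(size=n) * np.abs(x)
    X = np.column_stack([np.ones(n), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    se_vanilla = np.sqrt(e @ e / (n - 2) * XtX_inv[1, 1])
    meat = X.T @ (X * e[:, None] ** 2)
    se_hc1 = np.sqrt((n / (n - 2) * XtX_inv @ meat @ XtX_inv)[1, 1])
    covers = lambda se: abs(beta[1] - 2.0) <= 1.96 * se
    return covers(se_vanilla), covers(se_hc1)

# Empirical coverage of the nominal 95% interval for the slope.
reps = np.array([one_rep() for _ in range(500)])
coverage_vanilla, coverage_hc1 = reps.mean(axis=0)
```

Vanilla intervals undercover badly here while HC1 stays near the nominal 95%, which is the pattern the cached experiments quantify across estimators.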

Optimization

  • Optimizers: direct optimizer usage for smooth likelihoods, rougher objective surfaces, and solver behavior comparisons.
  • GMM With Optimizers: the lower-level notebook that motivated the first-class GMM estimator and still shows the residual-collection view directly.
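For linear moments, the two-step GMM logic those pages build on reduces to closed-form weighted least squares. A numpy sketch for an overidentified IV problem (one endogenous regressor, two instruments); this shows the estimator's math, not the first-class GMM API:

```python
import numpy as np

rng = np.random.default_rng(0)

# Overidentified linear IV: x is endogenous through the shared shock u.
n = 2000
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5]) + 0.8 * u + rng.normal(size=n)
y = 2.0 * x + u
X = x[:, None]

def gmm_beta(W):
    # Linear GMM: beta = (X'Z W Z'X)^{-1} X'Z W Z'y.
    A = X.T @ z @ W @ z.T @ X
    b = X.T @ z @ W @ z.T @ y
    return np.linalg.solve(A, b)

# Step 1: identity weights (using (Z'Z)^{-1} instead recovers 2SLS).
beta1 = gmm_beta(np.eye(2))
resid = y - X @ beta1

# Step 2: efficient weights from the first-step moment covariance.
S = (z * resid[:, None]).T @ (z * resid[:, None]) / n
beta2 = gmm_beta(np.linalg.inv(S))
```

Both steps recover the true slope of 2; the second step reweights the two instruments' moments by the inverse of their estimated covariance.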

Supporting Pages

  • Binding Internals: Poisson: a deeper built-in-estimator walkthrough after the OLS crash course.
  • Binding Internals: MEstimator: the callback-heavy bridge where Rust owns optimization and Python supplies the objective and scores.

Runtime Snapshot

Repo-level runtime comparison for OLS, Logit, Poisson, and MultinomialLogit across crabbymetrics, scikit-learn, and statsmodels. Lower is faster.

The benchmark figure is generated from the repo-level benchmark assets in benchmarks/ and gives a quick scale check for the estimators that overlap cleanly with mainstream Python baselines.

Notes

  • api.qmd remains the main documentation page and renders to api.html.
  • The site is a Quarto website, so shared navigation and search are generated under docs/.
  • All pages are rendered with embedded resources so the checked-in HTML files remain self-contained.