Holmstrom and Milgrom 1991 is essential reading to explain the patterns observed in the workplace. A principal (e.g., a firm) hires an agent (e.g., an employee) to perform multiple tasks. Some tasks are easier to measure or verify (like quantity of sales, number of docs created, number of meetings organised), while others are harder to measure (like quality of code, number of bad ideas killed). Linking compensation or career progression strictly to the measured task—because it’s quantifiable—risks neglect of unmeasured tasks. Agents rationally focus on the task that drives their pay, even if overall performance declines. I’m thinking more about it than usual when failing to maintain a poker-face through Yet Another Meeting that Could Have Been an Email (YAMCHBE).
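A stripped-down version of the model's logic (my notation, ignoring risk aversion and the full linear-contract machinery): the agent exerts effort on two tasks, but only the first produces a measurable signal that pay can be conditioned on,

$$
x_1 = e_1 + \varepsilon, \qquad w = \alpha + \beta x_1, \qquad \max_{e_1, e_2}\; \beta e_1 - C(e_1, e_2),
$$

so the agent's first-order conditions are $\partial C / \partial e_1 = \beta$ and $\partial C / \partial e_2 = 0$. Effort on the unmeasured task is supplied only up to the point where its marginal cost hits zero, and if the two efforts are substitutes in $C$, raising the incentive $\beta$ actively crowds $e_2$ out; this is why the paper argues for muted incentives or job redesign when the tasks that matter can't be measured.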
Wei et al on methods for designing adaptive experiments aimed at learning effect heterogeneity. This is a promising direction, since policy learning on standard A/B test data runs into trouble because of tiny effect sizes.
3rd ed of Jurafsky and Martin. classic tome that I learned NLP from. Updated for the LLM age.
Hell’s Bells (1929) is now in the public domain. Absolutely insane that this was achieved a century ago. Play Cuphead if you’re into this style.
smallpond is a data-processing framework from deepseek built on top of duckdb. Looks like the easiest flow is to run jobs against parquet files. Benchmarks look good.
data-operations benchmarks - initially proposed by the folks behind data.table (still my favourite dataframe library) and updated by every new kid on the block. duckdb now rules the roost.
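Since duckdb does the heavy lifting under smallpond, the run-jobs-against-parquet flow looks roughly like the sketch below. This uses plain duckdb directly rather than smallpond's own API, and the file and column names are invented for illustration.

```python
# minimal sketch: aggregate a directory of parquet files with duckdb's SQL interface
# (smallpond layers distributed partitioning on top of this kind of query)
import duckdb

con = duckdb.connect()  # in-memory database is fine for a single-node job
summary = con.sql(
    """
    SELECT user_id,
           count(*)     AS n_events,
           avg(revenue) AS avg_revenue
    FROM read_parquet('events/*.parquet')   -- hypothetical input files
    GROUP BY user_id
    """
)
summary.write_parquet("user_summary.parquet")  # write the result back out as parquet
print(summary.df().head())                     # or materialise into pandas to inspect
```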
Bergemann, Bonatti, and Smolin on the economics of LLMs. Two-part tariffs, market separation, limited connection to reality.
Barigozzi et al on tensor factor models.
Rauh et al's user-friendly (JSS?) paper on matching methods for panel data. This is arguably a more transparent way to do causal inference than the current trend of throwing everything into a fixed effects model and hoping that your FEs do all the work.
fastmatch received a fresh coat of paint, gpu support, type-hints and testing, and an API documentation website. It is a fast implementation of vanilla and bias-corrected matching estimators that runs the knn-matching step through the very fast faiss library (which also works extremely well on the gpu). Bias-corrected matching is a nice, interpretable doubly-robust estimator since the matching step approximates a density ratio; see Lin et al for details [and Abadie and Imbens (2006, 2011) for background].
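For reference, the bias-corrected matching estimator for the ATT takes the form (notation mine, following Abadie and Imbens):

$$
\hat\tau_{\mathrm{ATT}} \;=\; \frac{1}{N_1} \sum_{i\,:\,D_i = 1} \Bigg[ Y_i \;-\; \frac{1}{M} \sum_{j \in \mathcal{J}_M(i)} \Big( Y_j + \hat\mu_0(X_i) - \hat\mu_0(X_j) \Big) \Bigg],
$$

where $\mathcal{J}_M(i)$ is the set of the $M$ nearest control neighbours of treated unit $i$ (the knn step that faiss accelerates) and $\hat\mu_0$ is an outcome regression fit on the controls; the $\hat\mu_0(X_i) - \hat\mu_0(X_j)$ term corrects the bias from inexact matches, which is where the doubly-robust flavour comes from.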
duckreg now has a pdoc-based API documentation website. One of my bugbears with new python libraries is that one often has to read the source code to figure out the API (on that note, the state of API documentation for R libraries is even worse), so a doc framework that painlessly writes HTML docs based on docstrings is great.
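The whole workflow is just: write docstrings, point pdoc at the package. A toy sketch (module and function names are hypothetical):

```python
# toymod.py: pdoc only needs ordinary docstrings and type hints
def att(y_treated: list[float], y_control: list[float]) -> float:
    """Naive ATT estimate: difference in means between treated and control outcomes."""
    return sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)
```

Running `pdoc toymod -o docs/` then renders those docstrings into static HTML pages.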
aipyw, my experimental double-ML/autoDML library, also now has a documentation website using the same pdoc framework.