About
Notes on LLM measurement infrastructure and experimental design.
I'm a PhD economist at Claremont Graduate University working on the empirical measurement of language-model behavior. This site collects writeups of the eval pipelines, methodology decisions, and failure modes from that work.
Companion sites: gradstudent.me (CV, papers), games.applesauce.chat (the live behavioral-game lab).
If you're a researcher and want to collaborate, replicate, or argue with anything here, get in touch via the email on gradstudent.me.
Colophon
Set in Spectral (Production Type) for body and headings, and IBM Plex Mono (IBM) for chrome and data. The backend is FastAPI with python-markdown. Posts are Markdown files with YAML frontmatter, deployed via git. Source: github.com/Tsangares/research-gradstudent.
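For readers unfamiliar with the post format, here is a minimal sketch of how a file with frontmatter can be split into metadata and body. It is not the site's actual code: the function name `split_frontmatter` and the sample post are hypothetical, and the parser only handles flat `key: value` pairs using the standard library. A real pipeline would presumably hand the frontmatter block to a YAML parser and the body to python-markdown.

```python
from __future__ import annotations


def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split a post into (metadata, markdown_body).

    Hypothetical helper: handles only flat `key: value` frontmatter
    delimited by `---` lines, not full YAML.
    """
    lines = text.splitlines()
    # No opening delimiter: treat the whole file as body.
    if not lines or lines[0].strip() != "---":
        return {}, text
    # Find the closing delimiter after the first line.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}, text
    meta: dict[str, str] = {}
    for line in lines[1:end]:
        if ":" in line and not line.lstrip().startswith("#"):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip().strip("\"'")
    body = "\n".join(lines[end + 1:]).lstrip("\n")
    return meta, body


# Hypothetical sample post, for illustration only.
post = """---
title: Example post
date: 2024-01-01
---
# Heading

Body text.
"""
meta, body = split_frontmatter(post)
```

Keeping the metadata in the same file as the prose means a post is a single artifact under version control, which fits a deploy-via-git workflow.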