ML infrastructure engineer building evaluation, inference, and observability systems for AI. Creator of the MIST stack (MatchSpec, InferMux, SchemaFlux, TokenTrace), an eval and inference platform written in Go with zero external dependencies. Founding Engineer at Supermodel, building code analysis tooling for AI agents. Previously built distributed systems at Amazon and AWS. Published research on benchmarking Model Context Protocol servers.
Route inference across LLM providers. Track cost per request.
Eval framework. Define what correct means, test against it, get results.
Ship evals before you ship features.
Structured data compiler. Pass pipeline, pluggable backends.
Serverless event-driven architecture for processing millions of daily events with near real-time visibility and strong resilience.
TypeScript SDK for Supermodel. Generate useful graphs of your codebase.
Benchmark runner for Model Context Protocol servers. Paired comparison experiments on SWE-bench.
OpenAPI spec for the Supermodel public API. Use as reference or generate your own clients.
Supermodel MCP server. Generate code graphs in Cursor, Codex, or Claude Code.
GitHub Action to generate architecture documentation for any repository using Supermodel.
Where did your tokens go? Spans, latency percentiles, alerts.
GitHub Action to find unreachable functions using Supermodel call graphs.
Shared core for the MIST stack. Zero external deps.
GitHub Pages site for Supermodel Tools.
Hit me up. I'm always down to collab.
me at greynewell dot com