Grey Newell - ML Infrastructure Engineer

Building evaluation, inference, and observability systems for AI. Creator of the MIST stack. Founding Engineer at Supermodel. MS CS (ML) at Georgia Tech. Ex-AWS.

Latest from the blog

SWE-bench Tests Run 6x Faster on ARM64 with Native Containers

SWE-bench's pre-built x86 containers run through QEMU emulation on ARM64 hosts like Apple Silicon and AWS Graviton. I built native ARM64 images and measured a 6.3x speedup on the test runner.

Projects

Eval framework. Define correct, test against it, get results.

21 Go Website

Route inference across LLM providers. Track cost per request.

89 Go Website

Structured data compiler. Pass pipeline, pluggable backends.

11 Go Website

Where did your tokens go? Spans, latency percentiles, alerts.

5 Go Website

Shared core for the MIST stack. Zero external deps.

1 Go

Ship evals before you ship features.

7 Markdown Website

Frequently asked questions

MIST Stack

What is the MIST stack?
What is eval-driven development?
What is MatchSpec and how does it work?
What is InferMux and how does it route inference?
What is SchemaFlux?
What is TokenTrace?
Why does the MIST stack have zero external dependencies?
How do MIST stack tools communicate?

Technical Publications & Projects

What technical articles has Grey Newell published on the AWS blog?
How do I run SWE-bench on Apple Silicon or AWS Graviton without x86 emulation?
How do I speed up SWE-bench evaluations on ARM64 infrastructure?