Grey Newell - ML Infrastructure Engineer

Building evaluation, inference, and observability systems for AI. Creator of the MIST stack. Founding Engineer at Supermodel. MS CS (ML) at Georgia Tech. Ex-AWS.

Latest from the blog

SWE-bench Tests Run 6x Faster on ARM64 with Native Containers

SWE-bench's pre-built x86 containers run through QEMU emulation on ARM64 hosts like Apple Silicon and AWS Graviton. I built native ARM64 images and measured a 6.3x speedup on the test runner.
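The emulation penalty described above applies whenever an image's CPU architecture differs from the host's. As a minimal sketch (not taken from the post itself), the Go program below checks whether a given image platform string would need emulation on the current host. The helper names `needsEmulation`, `imageArch`, and `normalizeArch` are hypothetical, and the platform strings follow the common `os/arch[/variant]` convention such as `linux/amd64`.

```go
package main

import (
	"fmt"
	"runtime"
	"strings"
)

// normalizeArch maps common architecture aliases onto Go's GOARCH names.
// Unknown values are passed through lowercased.
func normalizeArch(arch string) string {
	switch strings.ToLower(arch) {
	case "x86_64", "x86-64", "amd64":
		return "amd64"
	case "aarch64", "arm64":
		return "arm64"
	}
	return strings.ToLower(arch)
}

// imageArch extracts the architecture from a platform string of the
// form "os/arch" or "os/arch/variant". A string without a slash is
// treated as a bare architecture name.
func imageArch(platform string) string {
	parts := strings.Split(platform, "/")
	if len(parts) >= 2 {
		return parts[1]
	}
	return parts[0]
}

// needsEmulation reports whether an image built for imagePlatform would
// have to run under CPU emulation (for example QEMU) on a host whose
// architecture is hostArch. Hypothetical helper for illustration only.
func needsEmulation(hostArch, imagePlatform string) bool {
	return normalizeArch(imageArch(imagePlatform)) != normalizeArch(hostArch)
}

func main() {
	// runtime.GOARCH is the architecture this binary was compiled for,
	// which matches the host when the binary is built and run natively.
	for _, platform := range []string{"linux/amd64", "linux/arm64"} {
		fmt.Printf("host=%s image=%s emulated=%v\n",
			runtime.GOARCH, platform, needsEmulation(runtime.GOARCH, platform))
	}
}
```

On an Apple Silicon or Graviton host, the `linux/amd64` entry is the one that would report emulation, which is the case the native ARM64 images are meant to avoid.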

Implement Event-Driven Invoice Processing for Resilient Financial Monitoring at Scale (AWS Architecture Blog)

How to build a Business Event Monitoring System (BEMS) on AWS that handles over 86 million daily events with near real-time visibility, cross-Region controls, and automated alerts for stuck events.

Zero to Hero: Your Guide to Career Growth Through AWS Certifications (AWS Training and Certification Blog)

Learn practical strategies that helped me transform from a struggling new graduate to an AWS Solutions Architect, eventually earning the coveted golden jacket awarded to those who achieve all twelve AWS Certifications.


Projects

Eval framework. Define correct, test against it, get results.

21 Go Website

Route inference across LLM providers. Track cost per request.

89 Go Website

Structured data compiler. Pass pipeline, pluggable backends.

11 Go Website

Where did your tokens go? Spans, latency percentiles, alerts.

5 Go Website

Shared core for the MIST stack. Zero external deps.

1 Go

Ship evals before you ship features.

7 Markdown Website

Frequently asked questions

MIST Stack

What is the MIST stack?
What is eval-driven development?
What is MatchSpec and how does it work?
What is InferMux and how does it route inference?
What is SchemaFlux?
What is TokenTrace?
Why does the MIST stack have zero external dependencies?
How do MIST stack tools communicate?

Technical Publications & Projects

What technical articles has Grey Newell published on the AWS blog?
How do I run SWE-bench on Apple Silicon or AWS Graviton without x86 emulation?
How do I speed up SWE-bench evaluations on ARM64 infrastructure?