MCP

5 items

Blog posts

Everyone Is Benchmarking MCP Servers Wrong

Existing MCP benchmarks rank models, not servers. Here's how to A/B test whether your MCP server actually improves agent performance.

AI MCP Research Evaluation

Why I Built mcpbr

MCP developers are shipping tools without evidence that they work. I built mcpbr to find out. Here are results from a 500-task controlled SWE-bench experiment that surprised me.

AI MCP Open Source Developer Tools Research Evaluation

Projects

mcpbr

Benchmark runner for Model Context Protocol servers.

20 Python
Python AI MCP Evaluation Developer Tools

claude-chef

Claude Code plugin designed to make Claude a culinary expert.

5 TypeScript
AI MCP TypeScript

mcp-serialization-repro

Do MCP tools serialize in Claude Code? An empirical study: readOnlyHint controls parallelism, and IPC overhead is ~5 ms/call.

3 Python
MCP Python Research
