Help

Getting Started

FAQ

What is pick-rate?

The percentage of tasks where an AI agent chooses your MCP server’s tool over a named rival’s equivalent tool.

How are scans run?

We run 25 realistic tasks per category through Claude, presenting your tool descriptions alongside your rival’s. For each task the model picks one tool, and we count how often it picks yours.
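As a rough illustration of the counting step, the sketch below turns a list of per-task picks into a pick-rate. The function name pick_rate, the labels "yours" and "rival", and the sample results are hypothetical and are not taken from the product.

```python
def pick_rate(picks, your_tool):
    """Return the percentage of tasks in which `your_tool` was picked."""
    if not picks:
        raise ValueError("picks must contain at least one task result")
    wins = sum(1 for choice in picks if choice == your_tool)
    return 100.0 * wins / len(picks)


# Hypothetical outcome of one 25-task category: 15 wins, 10 losses.
picks = ["yours"] * 15 + ["rival"] * 10
rate = pick_rate(picks, "yours")
print(rate)  # 60.0
```

With 25 tasks in a category, each individual win or loss moves the rate by 4 points.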

What are smells?

Description defects (based on arXiv 2602.14878) that cause models to skip your tool. Common examples: no stated purpose, ambiguous scope, missing parameters.

How do verified rewrites work?

We propose a description fix, re-run the 25-task simulation, and accept the rewrite only if it measurably lifts your pick-rate. Regressions are rejected.
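The accept-or-reject rule can be sketched as a simple comparison of two pick-rates. This is a minimal sketch, not the actual implementation: the function name accept_rewrite and the min_lift parameter are hypothetical, and the real threshold for a "measurable" lift is not stated here.

```python
def accept_rewrite(baseline_rate, rewrite_rate, min_lift=0.0):
    """Accept a rewrite only if it beats the baseline by more than min_lift points.

    min_lift is an assumed parameter; the default of 0.0 accepts any strict
    improvement and rejects ties and regressions.
    """
    return (rewrite_rate - baseline_rate) > min_lift


# Hypothetical pick-rates (in percent) before and after a rewrite.
print(accept_rewrite(52.0, 60.0))  # True: the rewrite lifted the rate
print(accept_rewrite(52.0, 48.0))  # False: regression, rejected
print(accept_rewrite(52.0, 52.0))  # False: no measurable change
```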

What triggers a monitoring alert?

A new model release or a rival schema change that causes a drop of 5 percentage points or more in your pick-rate.
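The alert condition amounts to a threshold check on the change in pick-rate. The sketch below assumes the drop is measured in percentage points between two consecutive scans; the names should_alert and ALERT_DROP_POINTS are hypothetical.

```python
ALERT_DROP_POINTS = 5.0  # threshold from the FAQ; the constant name is illustrative


def should_alert(previous_rate, current_rate, threshold=ALERT_DROP_POINTS):
    """Return True when pick-rate fell by at least `threshold` percentage points."""
    return (previous_rate - current_rate) >= threshold


# Hypothetical pick-rates (in percent) from two consecutive scans.
print(should_alert(60.0, 55.0))  # True: a 5-point drop meets the threshold
print(should_alert(60.0, 56.0))  # False: a 4-point drop does not
print(should_alert(60.0, 64.0))  # False: the rate improved
```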

Contact