MCP tool selection · measured against named rivals

Does Claude pick your tool, or a rival’s?

ToolRank measures your MCP server’s pick-rate against rivals on Claude, finds the description smells that are costing you calls, and proposes rewrites verified to raise that pick-rate.

Free loss card first — no signup to look.

Live leaderboard · Claude Opus 4.x · refreshed daily

Payments · Claude Opus 4.x · live

#   server                              pick-rate   Δ     status
1   Stripe    stripe/agent-toolkit      72%         +4    GOOD
2   lemonpay  indie-dev/lemonpay-mcp    58%         +9    GOOD
3   Square    square/square-mcp         49%         3     GOOD
4   PayPal    paypal/agent-toolkit      41%         31    LOSES
5   Adyen     adyen/adyen-mcp           38%         6     LOSES

10 servers ranked · 2 categories


Your dashboard shows the calls you won. Never the ones you lost.

You ship the MCP server, watch usage climb, and assume the description is fine. But 97.1% of real tool descriptions carry at least one defect that misroutes selection, and 56% never state what the tool is for. The agent picks the rival. You see a flat usage chart and never learn why.

97.1% of 856 real tools carry a description smell (arXiv 2602.14878)
56% never state their purpose
1 in 6 DIY rewrites make pick-rate worse, with no signal that they did

What $49 actually runs.

01 Real tasks. We generate 20–25 realistic jobs in your category and present your tool beside its named rivals.
02 Every pick recorded. Claude chooses. We log selections/tasks as your pick-rate, plus exactly which tasks you lost.
03 Verified rewrites. We propose fixes, re-simulate each one, and report the measured before/after lift. Regressions get rejected, not shipped.

You get a shareable report: task-by-task breakdown, the full smell audit, and rewrites with numbers you can trust.
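
The pick-rate in step 02 is plain arithmetic: selections divided by tasks. Here is a minimal sketch of that calculation, not ToolRank's actual code. The `TaskResult` record, the server names, and the 25-task log are hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    task: str    # the job shown to the model (hypothetical label)
    picked: str  # the server the model selected for that job


def pick_rate(results, server):
    """Share of tasks routed to `server`: selections / tasks."""
    if not results:
        raise ValueError("no tasks recorded")
    wins = sum(1 for r in results if r.picked == server)
    return wins / len(results)


def lost_tasks(results, server):
    """Tasks where the model chose some other server."""
    return [r.task for r in results if r.picked != server]


# Hypothetical log: 25 tasks, the first 10 routed to the server under test.
log = [
    TaskResult(f"task-{i}", "your/server" if i < 10 else "rival/server")
    for i in range(25)
]

rate = pick_rate(log, "your/server")
print(f"pick-rate: {rate:.0%}")                         # pick-rate: 40%
print(f"lost: {len(lost_tasks(log, 'your/server'))}")   # lost: 15
```

Keeping the per-task record, rather than only the total, is what makes the "which tasks you lost" breakdown possible.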

Know your pick-rate

One number tells you what share of category tasks route to you on Claude.

pick-rate
41% vs rivals avg 72%
41% of category tasks → you

Beat your rival

See the head-to-head: tasks you won, tasks a named rival took.

head-to-head
vs stripe/agent-toolkit
7/25 tasks won · 18 lost

Verified rewrites

Before → after on the real description, with the measured lift attached.

create_payment · No stated purpose
- Creates a payment.
+ Initiates a payment intent for one-time or recurring charges…
41% → 55% · +14 lift · verified
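
The "verified" label comes down to comparing the pick-rate before and after a rewrite and keeping the rewrite only if it improved. The sketch below shows that gate using the 41% and 55% figures from the card above. The function names are hypothetical, and treating a zero lift as a rejection is an assumption, since the page only says regressions are rejected.

```python
def lift_points(before, after):
    """Lift in percentage points between two pick-rates given as fractions."""
    return round((after - before) * 100)


def verify_rewrite(before, after):
    """Accept a rewrite only if the re-simulated pick-rate went up.

    Assumption: a lift of exactly zero is also rejected.
    """
    lift = lift_points(before, after)
    return {"lift": lift, "accepted": lift > 0}


print(verify_rewrite(0.41, 0.55))  # {'lift': 14, 'accepted': True}
print(verify_rewrite(0.41, 0.38))  # {'lift': -3, 'accepted': False}
```

The second call is a made-up regression, included to show a rewrite that would be dropped rather than shipped.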

The evidence

“Unmeasured description rewrites regress performance 1 in 6.”
arXiv 2602.14878, 856 tools across 103 real MCP servers

97.1% of tool descriptions carry at least one quality smell; 56% never state the tool’s purpose — the field agents weight most.

paypal/agent-toolkit · 41% · 31
loses to stripe/agent-toolkit · 72%
top smell: No stated purpose

Simple, honest pricing

Run a scan →

Free

FREE
$0

The public leaderboard and your loss card.

What you measure
rank
top finding
pick-rate
findings depth
rewrite lift
Browse leaderboard

Scan

SCAN
$49 one-time

One competitive scan, full report.

What you measure
pick-rate
all findings
rewrite lift
rival gap
tool coverage
Executes shell commands on the host
Runs sandboxed shell commands and returns stdout
Scan my server

Monitor

MONITOR
$39/mo

Every model release reshuffles the board. Stay on it.

What you measure
pick-rate
drift vs prev
release-by-release
alert threshold
history depth
Start monitoring

white-hat rewrites only · public correction policy · cancel anytime

Pay once to fix it. Subscribe to keep it fixed as models change.


Find your row. See what you’re losing.

The leaderboard is public and free. Your pick-rate and two findings cost nothing. The full scan costs $49.