Open source. Built by Netlify, for the agent web.

Stop guessing if agents
can use your service.

AXIS is an open scoring framework and CLI that measures how well your project, APIs, and tooling actually work when an AI agent tries to use them. Run real agents against real scenarios, capture every tool call, and get a comparable 0–100 score back.

1. Install: $ npm i -g @netlify/axis
2. Set up: $ axis init
3. Run: $ axis run
The framework

Four dimensions, one score.

A single pass/fail tells you nothing about why an agent struggled. AXIS scores four independent dimensions so you can focus on what matters: a slow API, a confusing layout, a noisy tool, or the agent's own decisions.
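The four dimensions roll up into the single comparable 0–100 score. The exact dimension names and weighting are AXIS's own and aren't documented here; as a hedged sketch of the idea, a simple weighted average over hypothetical dimension names might look like this:

```typescript
// Hypothetical roll-up of four dimension scores (0-100 each) into one
// overall score. Dimension names and equal weights are illustrative only,
// not AXIS's actual scoring scheme.
type DimensionScores = {
  apiPerformance: number;   // e.g. latency, error rate
  discoverability: number;  // how easily the agent finds what it needs
  toolSignal: number;       // how clean or noisy tool outputs are
  agentDecisions: number;   // quality of the agent's own choices
};

function overallScore(
  scores: DimensionScores,
  weights: DimensionScores = {
    apiPerformance: 0.25,
    discoverability: 0.25,
    toolSignal: 0.25,
    agentDecisions: 0.25,
  },
): number {
  const keys = Object.keys(scores) as (keyof DimensionScores)[];
  const total = keys.reduce((sum, k) => sum + scores[k] * weights[k], 0);
  return Math.round(total); // one 0-100 number, comparable across runs
}

console.log(overallScore({
  apiPerformance: 90,
  discoverability: 60,
  toolSignal: 80,
  agentDecisions: 70,
})); // 75
```

Keeping the dimensions independent is what lets a low overall score stay diagnosable: you can see which input dragged it down.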

The report

Every run, fully inspectable.

AXIS doesn't just hand you a number. Each run produces a self-contained HTML report with the full transcript, every tool call, the LLM judge's per-criterion grading, and a sparse-index view that's optimized for skimming. Try the live sample below. Click around, expand interactions, read the rubric grades.

.axis/reports/sample/report.html
Sample data: 6 scenarios, 3 agents, generated at docs build time.
The workflow

Score it. Baseline it. Gate CI on it.

AXIS is built to slot into the same place as your unit tests, just for agent experience. Run it locally to iterate, then turn on baselines and let your CI catch regressions automatically.

1
Define a scenario

A small file with a prompt, a rubric, and optional setup steps. A few lines are enough to start.
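As a sketch of what such a file might contain — the field names below are illustrative, not AXIS's actual scenario schema:

```json
{
  "prompt": "Create a new site and deploy the ./dist folder.",
  "rubric": [
    "Agent discovers the deploy command without trial and error",
    "Deployment succeeds on the first attempt"
  ],
  "setup": ["npm run build"]
}
```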

2
Run any agent

40+ built-in agents, including Claude Code, Codex, and Gemini, plus any ACP-compliant agent. Bring your own with a small adapter.
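The adapter surface isn't specified here, so the interface below is an assumption, not AXIS's real API. The core idea of a "small adapter" is just mapping a scenario prompt to a transcript and a list of tool calls that can be scored:

```typescript
// Hypothetical adapter shape -- illustrative only, not AXIS's real API.
// An adapter turns a scenario prompt into a scoreable transcript.
interface ToolCall {
  name: string;   // tool the agent invoked
  args: unknown;  // arguments it passed
  output: string; // what the tool returned
}

interface AgentAdapter {
  name: string;
  run(prompt: string): Promise<{ transcript: string; toolCalls: ToolCall[] }>;
}

// A trivial stub agent, useful for checking the wiring end to end.
const echoAgent: AgentAdapter = {
  name: "echo-agent",
  async run(prompt) {
    return {
      transcript: `I received: ${prompt}`,
      toolCalls: [{ name: "echo", args: { prompt }, output: prompt }],
    };
  },
};

echoAgent.run("deploy my site").then((r) => console.log(r.toolCalls.length)); // 1
```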

3
Set a baseline

Snapshot a passing run, commit the baseline, and any future regression beyond noise tolerance fails the build.
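In CI, the gate can be as plain as running AXIS as a build step and letting a failing score fail the job. A minimal GitHub Actions sketch — only `npm i -g @netlify/axis` and `axis run` come from the quick start above; how baselines are configured and how a regression surfaces as a failing exit code is assumed, not documented here:

```yaml
# Hypothetical CI step. Assumes a committed baseline in the repo and that
# `axis run` exits nonzero when a score regresses past the noise tolerance.
- name: Agent experience gate
  run: |
    npm i -g @netlify/axis
    axis run
```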

Ship for humans and the agents they use.

AXIS is open source and free.
You can wire it into your project in a couple of minutes.