Blog · 10 min read · May 16, 2026

Measuring AI agent ROI.

A four-quadrant framework for valuing an agent, with worked examples and the parts that are hardest to attribute.

"What's the ROI?" is the single most common question we get on agent engagements, and the question most under-served by existing AI marketing. Vendor decks throw around "100× productivity!" numbers that fall apart on contact with finance teams. The honest framework is more boring and more useful.

We use a four-quadrant model with our clients. It's simple enough to fit on a slide and rigorous enough to defend in a procurement review.

The return is real when the work is scaled rather than piloted. MIT Sloan Management Review reports that organizations with strong financial performance are 4.5 times more likely to have invested in agentic AI architectures, and that the gains come from scaling agents into production, not from experiments that stall after a demo. That matches what we see: the value compounds in the Operations phase, once an agent is live and improving.

The four sources of agent ROI

Quadrant 1

Deflected work

Work that would have required staff time but doesn't anymore because the agent did it. The most measurable quadrant. If 5,000 tier-1 inquiries/month used to take 15 min each and now 60% are handled by the agent, that's 3,000 inquiries × 15 minutes, or 750 hours/month of recovered staff time. Multiply by burdened hourly cost.
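The deflection arithmetic is simple enough to sketch in a few lines of Python. The function names are illustrative, and the $70 burdened hourly cost is an assumed figure used only to show the final multiplication; it is not part of the scenario above.

```python
def deflected_hours(monthly_volume: int, handle_minutes: float, deflection_rate: float) -> float:
    """Staff hours per month no longer spent on inquiries the agent handled."""
    return monthly_volume * deflection_rate * handle_minutes / 60


def deflected_value(hours: float, burdened_hourly_cost: float) -> float:
    """Dollar value of the recovered hours at the fully-loaded hourly rate."""
    return hours * burdened_hourly_cost


# Figures from the paragraph above: 5,000 inquiries/month, 15 min each, 60% deflected.
hours = deflected_hours(5_000, 15, 0.60)

# ASSUMPTION: $70/hour burdened cost, chosen only to illustrate the last step.
value = deflected_value(hours, 70)

print(f"{hours:,.0f} hours/month, ${value:,.0f}/month")  # 750 hours/month, $52,500/month
```

Keeping hours and dollars as separate steps makes it easy to report both, since finance teams often want to check the hours figure before accepting the rate.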

Quadrant 2

Accelerated work

Work that still requires staff, but takes much less time because the agent did the prep. A draft that took 2 hours now takes 20 minutes because the agent provided the structured starting point. Harder to attribute cleanly than deflection, but real.

Quadrant 3

Recovered revenue

Revenue that would have leaked without the agent. Leads qualified in real time that would otherwise have gone cold. Churn signals flagged early enough to act on. Upsell opportunities surfaced inside support conversations. Hardest to attribute but often the biggest quadrant.

Quadrant 4

Risk reduction

Costs avoided. Compliance violations that didn't happen because the agent enforced policy consistently. Audit packages assembled in hours instead of weeks. Documentation produced contemporaneously instead of reconstructed under deadline. Hard to assign a dollar figure; easy to defend in a board conversation.

Setting the baseline (the part most people skip)

ROI requires a baseline. Without one you're measuring change relative to a vague memory. The Strategy sprint we run on every engagement captures the baseline explicitly, before any agent ships:

For deflection workflows: volume of inquiries/month, median handle time, current staff fully-loaded cost, current after-hours coverage gaps
For acceleration workflows: task type, current median time-to-completion, current quality variance (consistency between staff), throughput limits
For revenue workflows: current conversion rates at each funnel step, current response time on inbound leads, current churn signals not acted on
For risk workflows: current compliance-incident rate, current audit-prep cycle time, current documentation-completeness measures

The biggest reason "AI ROI" arguments fail isn't bad measurement; it's the absence of a baseline. If you didn't measure the pre-state, you'll end up in a slow argument about whether things actually improved.

A worked example: customer-service tier-1 deflection

Hypothetical mid-market SaaS company. Customer support team of 12. Volume: 8,000 inbound tickets/month, ~3,500 tier-1 (FAQ-style), ~2,500 tier-2 (account-specific but routine), ~2,000 tier-3 (escalations, edge cases).

Baseline:

Burdened cost per CSR-hour: $70
Median handle time tier-1: 8 minutes
Tier-1 monthly cost: 3,500 tickets × (8/60) hr × $70/hr ≈ $32,700
Tier-1 backlog frequently 24–48 hours during volume spikes

Post-agent (Q4 after rollout):

~62% of tier-1 handled by agent (citation-grounded, with reroute on low confidence)
Tier-1 deflection: ~2,170/month, staff cost reduced by ~$20,000/month
Backlog under 4 hours even on peak days; CSAT up 0.3 points on tier-1 resolved cases
~10 hours/week of senior CSR time freed for tier-3 mentoring and process improvement

Annualized: ~$240K of deflected work. Add the harder-to-attribute Quadrant 3 win (churn avoided through faster response times), which the customer success team estimated at another ~$150K/year, for a total benefit of ~$390K. Against an annual cost of roughly $230K for the agent engagement, inference, and infrastructure, year-one ROI on this single workflow comes in at roughly 70% (($390K − $230K) / $230K), with breakeven before month 8. Year-two ROI is materially better because the engagement fees are front-loaded.
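For readers who want to check the numbers, here is a minimal sketch of the worked example in Python. The variable names are ours, and the breakeven line rests on a simplifying assumption that is not stated in the example: benefits accrue evenly across the year and the full annual cost has to be recovered.

```python
# Baseline figures from the worked example.
TIER1_TICKETS = 3_500        # tier-1 tickets per month
HANDLE_MINUTES = 8           # median handle time
COST_PER_HOUR = 70           # burdened cost per CSR-hour, USD
AGENT_SHARE = 0.62           # share of tier-1 handled by the agent

tier1_monthly_cost = TIER1_TICKETS * HANDLE_MINUTES / 60 * COST_PER_HOUR
deflected_tickets = TIER1_TICKETS * AGENT_SHARE
monthly_savings = deflected_tickets * HANDLE_MINUTES / 60 * COST_PER_HOUR

# Annual figures, rounded the same way the example rounds them.
annual_deflection = 240_000  # ~$20K/month x 12
recovered_churn = 150_000    # customer success team's estimate
annual_cost = 230_000        # engagement + inference + infrastructure

annual_benefit = annual_deflection + recovered_churn
year_one_roi = (annual_benefit - annual_cost) / annual_cost

# ASSUMPTION: benefits accrue evenly, so breakeven is cost / monthly benefit.
breakeven_months = annual_cost / (annual_benefit / 12)

print(f"Tier-1 baseline cost: ${tier1_monthly_cost:,.0f}/month")  # about $32,667
print(f"Monthly savings:      ${monthly_savings:,.0f}/month")     # about $20,253
print(f"Year-one ROI:         {year_one_roi:.0%}")                # about 70%
print(f"Breakeven:            month {breakeven_months:.1f}")      # about 7.1
```

A real rollout ramps up rather than delivering the full benefit from month one, so an actual breakeven date would likely land somewhat later than this even-accrual estimate.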

A worked example: regulated-workflow risk reduction

A specialty healthcare provider running prior-authorization at scale. The dollar figures are harder to compute, but the risk-quadrant calculation matters.

Baseline:

Median PA turnaround: 4 business days
First-pass approval rate: 68%
Appeals required on ~32% of submissions; appeal cycle 7–14 days
Documentation-completeness audit findings: 3–5 per quarter

Post-agent (see case study):

Median PA turnaround: under 2 days
First-pass approval rate up to 81% (more complete packets, all clinical assertions cited)
Audit findings near zero; every citation is linked to its source artifact in the chart

The deflection quadrant in this case is real (PA staff time reduced), but the Quadrant 4 risk reduction is what made the engagement fundable. Lower denial rates, a smaller appeals workload, and a better audit posture: those are the numbers that defend the engagement at the next compliance committee meeting.

What's hardest to attribute

Two things consistently. First, Quadrant 3 revenue. When the agent helps a CSR upsell a customer, did the CSR drive the sale or did the agent? The honest answer is "both." Don't claim 100% attribution; claim the marginal lift compared to the pre-agent baseline and move on.
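One way to make "claim the marginal lift" concrete is sketched below. The function name and every figure in the example are hypothetical, chosen only to show the difference between claiming all agent-touched revenue and claiming the lift over baseline.

```python
def marginal_lift_revenue(
    baseline_rate: float,
    post_agent_rate: float,
    opportunities: int,
    avg_deal_value: float,
) -> float:
    """Revenue attributable to the agent: only the conversion lift over baseline.

    A negative result is returned as-is, so a regression is not hidden.
    """
    return (post_agent_rate - baseline_rate) * opportunities * avg_deal_value


# HYPOTHETICAL inputs: 2,000 agent-assisted conversations per quarter,
# upsell conversion rising from 4.0% to 5.5%, $1,200 average upsell.
claimed = marginal_lift_revenue(0.040, 0.055, 2_000, 1_200)
total_touched = 0.055 * 2_000 * 1_200

print(f"Claimed (marginal lift):   ${claimed:,.0f}")        # about $36,000
print(f"Total agent-touched sales: ${total_touched:,.0f}")  # about $132,000
```

Reporting both numbers side by side, with only the first one counted toward ROI, tends to hold up better in a finance review than a single large figure.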

Second, Quadrant 2 acceleration. When a draft used to take a senior associate 2 hours and now takes 30 minutes, the time savings are real, but did the team work less or take on more work? Usually the latter, which is great but harder to dollar-quantify. We typically report acceleration in throughput terms (e.g., "team now produces 2.4× the brief drafts per week"), not pure cost savings.

Time horizon

Most production agents we've shipped break even somewhere between month 6 and month 12 on the deflection quadrant alone. The compounding value lives in the later quadrants and shows up in Year 2 and beyond. ROI in Year 1 is often 50–100%; ROI by Year 3 is typically 200–400% as the agent climbs the capability ladder.

The engagements that didn't hit those numbers usually had one of three problems: no baseline (so improvements were unprovable), no Quadrant 4 measurement (so the agent's risk-reduction value was invisible), or an operations engagement that stopped after launch (so the agent never improved past its starting capability).

How we report ROI to our clients

Every monthly Operations report in our engagements includes:

Deflection metrics: volume handled by agent, equivalent staff hours, fully-loaded cost equivalent
Acceleration metrics: task throughput vs. baseline, time-per-task delta
Revenue-side metrics: conversion lift on agent-touched leads, churn-signal action rate, recovered revenue (with attribution methodology stated)
Risk metrics: compliance-incident rate, audit-prep cycle time, documentation-completeness, citation-accuracy on every output
Cost line: engagement + inference + infrastructure
Net: the ROI number, with the methodology footnoted
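As a rough illustration of how these report lines could roll up into the net figure, here is a sketch of a monthly report record in Python. The class, its field names, and the sample values are all hypothetical; acceleration and risk metrics are left out of the dollar total here because, as discussed above, they are usually reported in throughput or incident terms.

```python
from dataclasses import dataclass


@dataclass
class MonthlyRoiReport:
    """Hypothetical shape for one month's ROI summary (all amounts in USD)."""
    deflection_value: float      # fully-loaded cost equivalent of deflected work
    recovered_revenue: float     # marginal lift only, per the stated methodology
    engagement_cost: float
    inference_cost: float
    infrastructure_cost: float
    methodology_note: str        # footnote explaining attribution choices

    @property
    def total_cost(self) -> float:
        return self.engagement_cost + self.inference_cost + self.infrastructure_cost

    @property
    def total_benefit(self) -> float:
        return self.deflection_value + self.recovered_revenue

    @property
    def roi(self) -> float:
        """Net benefit divided by cost; raises if no cost was recorded."""
        if self.total_cost <= 0:
            raise ValueError("total cost must be positive to compute ROI")
        return (self.total_benefit - self.total_cost) / self.total_cost


# HYPOTHETICAL month, for illustration only.
report = MonthlyRoiReport(
    deflection_value=20_000,
    recovered_revenue=12_500,
    engagement_cost=12_000,
    inference_cost=4_000,
    infrastructure_cost=3_000,
    methodology_note="Revenue counted as conversion lift over pre-agent baseline.",
)
print(f"Net: ${report.total_benefit - report.total_cost:,.0f}, ROI {report.roi:.0%}")
```

Carrying the methodology note in the same record as the numbers keeps the footnote from getting separated from the figure it qualifies.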

Partners don't have to ask whether the agent is paying for itself. The number is in the report every month.

Worth taking

If you're scoping an AI agent engagement and the vendor can't tell you which of the four quadrants the agent will move and how they'll measure it, ask. The answer should be specific, baseline-anchored, and reportable monthly. If it isn't, the ROI conversation a year from now will be unwinnable.
