Measuring AI impact (beyond usage dashboards)
Metrics that show whether AI is actually moving the business.
"We rolled out AI to 500 people and saved $3M" is a claim that almost always evaporates under examination. Measuring AI impact honestly takes a little more work and produces much more defensible numbers.
The hierarchy of signals
From weakest to strongest:
- Usage metrics. Logins, queries, seats activated. Confirms people touch the thing.
- Self-reported time savings. Surveys. Useful but optimistic — people over-report gains.
- Before/after task metrics. Same task measured before and after. Cleaner but hard to control for confounds.
- Controlled experiments. Pilot group vs. control group. The gold standard, rarely possible in corporate settings.
Use as much of the ladder as you can. Stop when you have a defensible answer.
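One way to combine the top two rungs is a difference-in-differences comparison: measure both groups before and after, then subtract the control group's change from the pilot group's change. The sketch below assumes you have per-task timings in minutes for each group. The function name and all sample numbers are hypothetical, and this is an illustration rather than a full experimental design.

```python
from statistics import mean


def diff_in_diff(pilot_before, pilot_after, control_before, control_after):
    """Change in the pilot group minus change in the control group.

    A negative result means the pilot group got faster by more than
    the control group did over the same period.
    """
    pilot_change = mean(pilot_after) - mean(pilot_before)
    control_change = mean(control_after) - mean(control_before)
    return pilot_change - control_change


# Hypothetical minutes per task, for illustration only.
pilot_before = [12.0, 11.0, 13.0]
pilot_after = [7.0, 8.0, 6.0]
control_before = [12.0, 12.0, 12.0]
control_after = [11.0, 10.0, 12.0]

effect = diff_in_diff(pilot_before, pilot_after, control_before, control_after)
print(f"Estimated effect of the tool: {effect:+.1f} minutes per task")
```

Here the pilot group improved by 5 minutes, but the control group also improved by 1 minute without the tool, so only 4 minutes are attributed to it. With small groups, treat the result as directional, not precise.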
The obvious metrics that mislead
- Seats deployed. Correlates only weakly with value. Plenty of seats go cold.
- Queries per user. High query counts include messing around, tutorial runs, and people using the tool to write personal emails on company time.
- Time saved, self-reported. People routinely over-report. Typical bias is 2-3× on the high side.
The metrics that actually matter
- Task completion time, observed or measured. For example: 12 minutes to draft a status update before AI, 4 minutes after. That is a number you can defend.
- Defect rate downstream. If AI-assisted work produces more errors that show up later, the "time saved" was borrowed.
- Quality score on sampled output. A reviewer rates AI-assisted vs. baseline work blind. Catches quality regressions.
- Employee retention / satisfaction. Long-horizon signal that AI tools are genuinely improving work rather than degrading it.
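The first two metrics can be combined into a single net figure: gross minutes saved per task, minus the extra rework caused by any rise in downstream defects. The sketch below uses the 12-minute and 4-minute figures from the example above. The defect rates and rework time are hypothetical placeholders, and the function name is made up for illustration.

```python
def net_minutes_saved_per_task(before_min, after_min,
                               defect_rate_before, defect_rate_after,
                               rework_min):
    """Gross time saved per task, less expected extra rework per task."""
    gross_saved = before_min - after_min
    extra_rework = (defect_rate_after - defect_rate_before) * rework_min
    return gross_saved - extra_rework


# 12 -> 4 minutes comes from the example above; the rest is assumed.
net = net_minutes_saved_per_task(
    before_min=12.0,
    after_min=4.0,
    defect_rate_before=0.05,   # hypothetical: 5% of tasks needed rework
    defect_rate_after=0.15,    # hypothetical: 15% need rework with AI
    rework_min=30.0,           # hypothetical: each rework costs 30 minutes
)
print(f"Net minutes saved per task: {net:.1f}")
```

Under these assumed inputs, the 8 gross minutes saved shrink to about 5 once the extra rework is counted. If the defect rate rises far enough, the net figure goes negative.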
Defining "savings" honestly
When you claim "AI saved us 20% of ticket handle time," interrogate it:
- Is the 20% a reduction in time per ticket, or an increase in tickets resolved per hour? The first becomes the second only if the work queue is constantly full. When it is, 20% less time per ticket means up to 25% more tickets per hour.
- Does "saved time" translate to higher throughput (more tickets resolved) or to less work (people finish earlier)?
- What's the net quality on handled tickets? Faster with more errors is worse.
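The gap between time saved and throughput can be sketched with a simple capacity model: tickets resolved is the smaller of demand and what the team has time to handle. This is a deliberately simplified model, and every number below is hypothetical.

```python
def tickets_resolved(demand, agent_minutes, handle_min):
    """Tickets resolved in a period: limited by demand or by capacity."""
    capacity = agent_minutes / handle_min
    return min(demand, capacity)


AGENT_MINUTES = 480    # hypothetical: one agent, one 8-hour shift
BEFORE, AFTER = 10, 8  # hypothetical handle times: a 20% reduction

# Queue always full: faster handling turns into more tickets resolved.
full_before = tickets_resolved(100, AGENT_MINUTES, BEFORE)
full_after = tickets_resolved(100, AGENT_MINUTES, AFTER)

# Queue not full: the same speedup resolves no extra tickets.
slack_before = tickets_resolved(40, AGENT_MINUTES, BEFORE)
slack_after = tickets_resolved(40, AGENT_MINUTES, AFTER)

print(full_before, full_after)    # 48.0 60.0
print(slack_before, slack_after)  # 40 40
```

In the second case the saved time is real, but it shows up as slack, not as output. Whether that slack has value depends on what people do with it.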
The ROI formula that stands up
Net ROI = (Benefit per user × users with access × adoption rate × durability) − (license cost + integration + training + ongoing governance)
Details:
- Benefit: realistic per-user value (not the vendor's promise).
- Adoption rate: of people with access, how many actually use it productively. Often 30-70%, rarely 90%+.
- Durability: does the benefit hold after the novelty wears off? Check at 6 and 12 months.
- Costs: include hidden ones — governance, security review, retraining on new models.
Teams that skip any of these terms produce ROI numbers they can't defend in a follow-up review.
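The formula can be sketched in a few lines. This version assumes Benefit is an annual per-user figure that is scaled by the number of users with access, and that adoption rate and durability are fractions between 0 and 1. The class name, field names, and all dollar amounts are hypothetical, chosen so that the undiscounted benefit matches the "$3M for 500 people" headline from the introduction.

```python
from dataclasses import dataclass


@dataclass
class RoiInputs:
    users_with_access: int
    benefit_per_user: float   # realistic annual value, not the vendor's promise
    adoption_rate: float      # share of users who use the tool productively
    durability: float         # share of the benefit that persists at 6-12 months
    license_cost: float
    integration_cost: float
    training_cost: float
    governance_cost: float    # governance, security review, retraining


def net_roi(x: RoiInputs) -> float:
    """Discounted benefit minus total cost, following the formula above."""
    benefit = (x.users_with_access * x.benefit_per_user
               * x.adoption_rate * x.durability)
    costs = (x.license_cost + x.integration_cost
             + x.training_cost + x.governance_cost)
    return benefit - costs


# Hypothetical inputs: 500 users x $6,000 gives the $3M headline figure.
inputs = RoiInputs(
    users_with_access=500, benefit_per_user=6000.0,
    adoption_rate=0.5, durability=0.7,
    license_cost=300_000.0, integration_cost=150_000.0,
    training_cost=50_000.0, governance_cost=100_000.0,
)
result = net_roi(inputs)
print(f"Net ROI: ${result:,.0f}")
```

With these made-up inputs, the $3M headline shrinks to roughly $1.05M after adoption and durability are applied, and to roughly $450,000 after costs. The exact numbers matter less than the habit of filling in every term.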
The dashboard that works
- Usage over time — trend matters more than snapshot.
- Task-level metrics from the team using it — observed, not self-reported.
- Net Promoter or CSAT — do users say they'd be upset if this were taken away?
- Incident count — has this tool caused any privacy, security, or quality incidents?
- Cost trend — absolute and per-active-user.
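The cost-trend line is worth computing both ways, because a flat absolute bill can hide a rising cost per active user when seats go cold. A minimal sketch, with a hypothetical helper and made-up monthly figures:

```python
def cost_per_active_user(monthly_cost, active_users):
    """Monthly cost divided by active users; None if nobody is active."""
    if active_users == 0:
        return None
    return monthly_cost / active_users


# Hypothetical data: (month, total monthly cost, active users).
months = [
    ("Jan", 10_000, 400),
    ("Feb", 10_000, 250),
    ("Mar", 10_000, 200),
]

trend = [(m, cost_per_active_user(cost, users)) for m, cost, users in months]
for month, per_user in trend:
    print(f"{month}: ${per_user:.2f} per active user")
```

In this example the absolute cost never moves, yet the per-active-user cost doubles from $25 to $50 in two months as usage falls.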
The question to ask every quarter
"If we turned this off next Monday, what would break?"
If the answer is "nothing noticeable," it was never a win. If the answer is "we'd need to staff up by X people" or "Y workflow would slow down dramatically," you have measurable value.
Check your understanding
Q1. Which metric is WEAKEST as a signal of real AI value?
Q2. The single question that best predicts real value is…