# Measuring Impact
Everyone wants to measure agent impact. Most measurements are wrong. And sometimes agents slow you down.
## The Measurement Trap
When you measure the wrong things, people optimize for the metrics, not the outcomes.
| Bad metric | Gaming behavior |
|---|---|
| Lines of code generated | Verbose, bloated code |
| Tasks completed per sprint | Task inflation, tiny pieces |
| Time using AI tools | Running agents on things faster done manually |
## What to Actually Measure
### Leading Indicators (early signals)
- Acceptance rate: What % of suggestions are accepted vs. rejected? Low rates suggest poor fit or skill gaps.
- Iteration count: How many prompt cycles before useful output? Decreasing = improving skills.
- Task scope: Are engineers tackling larger tasks with agent help? Growing confidence.
- Review feedback: Are reviewers catching fewer issues in agent-assisted PRs over time?
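These leading indicators are easy to compute once you log agent interactions. A minimal sketch, assuming a hypothetical per-task log format (the field names are illustrative, not any real tool's export schema):

```python
from statistics import mean

# Hypothetical log: one record per agent-assisted task.
# Field names are assumptions for illustration, not a real tool's schema.
interactions = [
    {"suggestions": 10, "accepted": 7, "prompt_cycles": 3},
    {"suggestions": 4,  "accepted": 1, "prompt_cycles": 6},
    {"suggestions": 8,  "accepted": 6, "prompt_cycles": 2},
]

def acceptance_rate(records):
    """Share of agent suggestions that engineers actually kept."""
    total = sum(r["suggestions"] for r in records)
    kept = sum(r["accepted"] for r in records)
    return kept / total if total else 0.0

def avg_iterations(records):
    """Average prompt cycles before useful output."""
    return mean(r["prompt_cycles"] for r in records)

print(f"acceptance rate:   {acceptance_rate(interactions):.0%}")
print(f"avg prompt cycles: {avg_iterations(interactions):.1f}")
```

Tracked week over week, a falling acceptance rate or a rising cycle count flags the fit and skill-gap problems these indicators exist to catch.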
### Lagging Indicators (outcomes)
- Velocity: Look at trends, not absolutes. Compare to teams not using agents. (Careful: this one is gameable.)
- Bug rates: Is bugs-per-feature changing? Account for where the code came from.
- Time to production: Feature start to deploy. Harder to game.
- Developer satisfaction: Survey your team. Happy devs are productive devs.
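Reading velocity as a trend rather than an absolute can be as simple as comparing sprint-over-sprint change across cohorts. A sketch with invented numbers (both teams and their figures are assumptions):

```python
# Features shipped per sprint, oldest to newest (illustrative numbers).
agent_team   = [5, 6, 6, 8, 9, 9]
control_team = [5, 5, 6, 6, 6, 7]

def trend(series):
    """Average sprint-over-sprint change: crude, but direction-preserving."""
    deltas = [b - a for a, b in zip(series, series[1:])]
    return sum(deltas) / len(deltas)

# Compare trends, not totals: both teams improve, at different rates.
print(f"agent team trend:   +{trend(agent_team):.1f} features/sprint")
print(f"control team trend: +{trend(control_team):.1f} features/sprint")
```

The point of the control series is exactly the comparison the bullet above asks for: a rising trend on its own proves nothing if the non-agent team is rising just as fast.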
## What Not to Measure
- Lines of code: irrelevant and gameable
- Tool usage time: usage ≠ value
- Cost of AI tools: matters for ROI, not effectiveness
- Prompt count: more prompts might mean learning
## The Attribution Problem
Who gets credit for AI-generated code? Who takes the blame?
Don't solve this. Treat agent-assisted code like any other: the human who committed it owns it.
This simplifies everything: no separate metrics, normal accountability, no tracking of AI-generated percentages.
## Qualitative Signals
Numbers don't tell the whole story. Watch for:
- Team sentiment: Excitement or frustration? Do people talk positively about agents?
- Adoption patterns: Senior engineers using agents is a quality signal
- Knowledge sharing: Organic prompt sharing indicates value
- Problem selection: Engineers tackling harder problems is often the real win
## Running an Experiment
If you need rigorous measurement:
- Control group: Some work happens without agents
- Clear metrics: Define them before you start
- Time bound: Run 4-6 weeks to account for learning curves
- Survey participants: Qualitative data matters
But most teams don't need academic proof; they need signals that adoption is working.
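If you do run the experiment, the analysis can stay simple: compare the distribution of your chosen metric across the two groups and check whether the gap clears the noise. A sketch with invented cycle-time data (all numbers are illustrative):

```python
from statistics import mean, stdev

# Days from feature start to deploy, per group (invented data).
with_agents    = [3.0, 4.5, 2.5, 5.0, 3.5, 4.0]
without_agents = [5.0, 6.5, 4.5, 7.0, 5.5, 6.0]

def summarize(label, days):
    """Print mean, spread, and sample size for one group."""
    print(f"{label}: mean {mean(days):.1f}d, stdev {stdev(days):.1f}d, n={len(days)}")

summarize("with agents   ", with_agents)
summarize("without agents", without_agents)

# A gap much larger than the spread is a real signal; a gap within
# the noise means you need more data or a longer window.
gap = mean(without_agents) - mean(with_agents)
print(f"gap: {gap:.1f} days per feature")
```

With samples this small, resist the urge to over-interpret: the 4-6 week window exists precisely so the learning curve and normal sprint variance wash out.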
## The Real Question
Don't ask "Are agents making us more productive?"
Ask "Are we building what we need, at the quality we need, without burning out?"
If yes, your approach is working.
## When Agents Help
### High-volume repetitive tasks
Tests for multiple functions, docs across files, API boilerplate, migration scripts. Same thing, many times: agents thrive.
### New territory exploration
Unfamiliar framework? Let the agent scaffold while you learn. New language? Get working examples. Unknown API? Generate integration code to understand its patterns.
### Clear spec, straightforward implementation
CRUD with defined schemas, form validation with known rules, utilities with well-defined inputs and outputs. Low ambiguity, well-understood problem space.
### Tedious but necessary
Mocks and fixtures, logging and error handling, consistent formatting, config updates across many places. Work that takes time but not thought.
## When Agents Slow You Down
### High-context tasks
If understanding the task requires reading complex business logic, historical decisions, or unwritten conventions, you'd have to explain it all to the agent anyway. Often it's faster to just do it.
### Tasks faster done manually
If prompting + waiting + reviewing takes longer than manual coding, just code it. Especially true for single-line changes, familiar patterns, and quick fixes.
Build intuition for your personal break-even point.
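That break-even point is just a time comparison, sketched here with assumed per-task components (the minute values are hypothetical):

```python
def agent_worthwhile(prompt_min, wait_min, review_min, manual_min):
    """True if delegating beats doing it yourself, on time alone."""
    return prompt_min + wait_min + review_min < manual_min

# A single-line fix: delegating loses (11 min of overhead vs. 3 min by hand).
print(agent_worthwhile(prompt_min=5, wait_min=2, review_min=4, manual_min=3))   # False
# A 40-minute boilerplate task: delegating wins (18 min vs. 40 min).
print(agent_worthwhile(prompt_min=5, wait_min=3, review_min=10, manual_min=40)) # True
```

The intuition to build is a gut feel for those three overhead terms on your own tasks, not a spreadsheet.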
### Novel algorithms
Agents pattern-match against their training data. For new algorithmic approaches, domain-specific optimization, or unusual data structures, solve the core problem yourself and let agents handle the boring parts around it.
### Highly coupled changes
Changes touching many tightly interdependent parts are hard for agents: they may not understand the connections, errors compound, and validation requires whole-system understanding. Break these apart or do them manually.
### Ambiguous requirements
"Make it better" or "improve performance" without specifics wastes cycles. Agents need clear success criteria, defined constraints, and specific scope. If you can't articulate these, you're not ready to delegate.
## Team-Level Patterns
- Task assignment: Don't assign agent-hostile tasks expecting agents to help.
- Sprint planning: Don't assume agent help for all tasks. Call out which are agent-friendly. Account for validation overhead.
- Retrospectives: Review where agents helped and where they hindered. What task types worked? Where did you waste time prompting?
## Building Team Judgment
- Share examples: "This task would have been faster manually; here's why."
- Celebrate good choices: Acknowledge when someone correctly decides not to use an agent.
- Create a reference: Maintain a guide of task types and recommended approaches.
- Review periodically: As tools improve, the patterns change.
## Resources
### Essential
- Does AI Actually Boost Developer Productivity? (Yegor Denisov-Blanch, Stanford): a 100k-developer study finding a ~20% average boost with significant variance
- Stop Looking for AI Coding Spending Caps: why caps cost more than they save
- ML-Enhanced Code Completion (Google Research): Google's productivity impact research
### Deep dives
- The reality of AI-Assisted software engineering productivity: a balanced take on productivity claims
- Vibe coding is already dead: a critical perspective on when AI tools backfire