Both — and which one happens on your team depends on a single variable: whether the work has a known right answer. On bounded tasks with a clear standard of “good,” AI compresses the gap, pulling weaker performers up toward a ceiling the strong had already reached. On open-ended tasks that turn on judgment, AI widens the gap, because strong performers use it as leverage while weaker ones cannot tell good output from bad. The same tool equalizes and stratifies at once. The kind of work decides which.
When does AI close the gap between strong and weak performers?
When the task has a transferable performance frontier: a known ceiling that defines what “excellent” looks like and that anyone can reach by following the established standard. Drafting a standard contract clause, writing clean boilerplate code, summarizing a document, producing a competent first-pass financial model: each has a recognizable right answer.
Here AI is an equalizer. The weaker performer, who used to fall well short of the standard, now arrives near it with the model’s help. The stronger performer was already at or near the ceiling, so the assist adds little. The distance between them shrinks. This is the result most early studies captured, and it is the source of the optimistic “AI democratizes expertise” narrative.
That narrative is true — but only inside this regime.
When does AI widen the gap?
When the task has no fixed ceiling and quality is a matter of judgment, framing, taste, or strategic choice — deciding which problem to solve, what a client actually needs, which of three defensible strategies fits the situation, or what the model got subtly wrong.
Here AI is an amplifier. The stronger performer uses it as leverage: they direct it well, catch its errors, and push past what they could do alone. The weaker performer cannot evaluate the output — they lack the judgment that would tell them whether the confident, fluent answer is actually right — so they accept it, plateau, or are actively misled. The gap does not merely persist; it grows.
This is the regime executives systematically underestimate, because on the surface it looks identical to the first one. The work still gets “done.” The output still looks polished. What changed is invisible: the weaker performer’s ceiling rose far less than the stronger performer’s did.
Why does the same tool do both?
Because AI raises the floor of production but not the floor of evaluation. It can generate a competent draft for almost anyone. It cannot give anyone the judgment to know whether that draft is right for a situation with no template.
On bounded tasks, evaluation barely matters — the standard is external and known, so production is the whole game, and AI compresses. On open tasks, evaluation is the game — and evaluation is exactly the capability AI does not transfer. So the people who already had it pull away.
The sign of the effect flips with the task. That is the entire boundary condition, and missing it is how organizations draw the wrong lesson from the right data.
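One way to make the sign flip concrete is a toy model. The sketch below is illustrative only: the scales, the numbers, and the two quality functions are assumptions invented for this example, not measurements or a model taken from any study. It assumes AI adds a flat production lift that is capped by a known ceiling on bounded work, and that on open work the upside from AI is realized only in proportion to the performer's own judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Performer:
    production: float  # unaided output quality (hypothetical 0..1 scale)
    judgment: float    # ability to evaluate output (hypothetical 0..1 scale)


# All three constants are assumptions chosen for illustration.
CEILING = 1.0      # the known standard of "excellent" on bounded work
AI_BOOST = 0.5     # production lift AI gives to anyone
AI_LEVERAGE = 1.0  # upside on open work, realized in proportion to judgment


def bounded_quality(p: Performer, with_ai: bool) -> float:
    """Bounded task: production is the whole game, capped by the ceiling."""
    boost = AI_BOOST if with_ai else 0.0
    return min(CEILING, p.production + boost)


def open_quality(p: Performer, with_ai: bool) -> float:
    """Open task: no ceiling; the gain from AI scales with judgment."""
    leverage = AI_LEVERAGE if with_ai else 0.0
    return p.production + leverage * p.judgment


def gap(quality_fn, weak: Performer, strong: Performer, with_ai: bool) -> float:
    """Distance between the strong and the weak performer."""
    return quality_fn(strong, with_ai) - quality_fn(weak, with_ai)


weak = Performer(production=0.4, judgment=0.4)
strong = Performer(production=0.9, judgment=0.9)

for name, fn in (("bounded", bounded_quality), ("open", open_quality)):
    before = gap(fn, weak, strong, with_ai=False)
    after = gap(fn, weak, strong, with_ai=True)
    print(f"{name}: gap {before:.2f} -> {after:.2f}")
```

With these assumed numbers the bounded gap shrinks from 0.50 to 0.10, because the strong performer hits the ceiling, while the open gap grows from 0.50 to 1.00, because the weak performer has less judgment to convert the tool into results. Different assumptions would change the magnitudes; the point is only that the same tool can move the gap in opposite directions.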
What should this change about how you deploy AI?
Stop asking “will AI make my people more productive?” Start asking “what kind of work am I measuring?”
- On bounded, standardized work: treat AI as a leveler. Expect the gap to narrow, raise the baseline expectation for everyone, and redeploy your strongest people off work the tool now commoditizes.
- On judgment-heavy, open-ended work: treat AI as an advantage multiplier for your best people and a risk in the hands of your weakest. The danger is not that weaker performers produce less; it is that they produce confident, fluent, wrong work that nobody catches. Pair AI here with stronger evaluation, not just higher output.
- In promotion and hiring: the premium shifts toward people who can evaluate, not merely produce. In an amplifying regime, judgment is the scarce asset, and it is the one capability AI does not distribute.
The blanket question — does AI help or hurt? — has no answer. The useful question is always: in this regime, for this task, which way does it cut?
Frequently asked questions
Does AI help weak performers more than strong ones?
On tasks with a known right answer, yes — AI pulls weaker performers toward a standard the strong had already reached, narrowing the gap. On open-ended, judgment-based tasks, no — strong performers gain more, because they can direct and correct the tool while weaker performers cannot reliably evaluate its output.
Will AI eliminate the need for expertise?
No. It lowers the value of routine production and raises the value of judgment. On any task without a fixed right answer, the ability to evaluate AI’s output becomes the scarce, decisive skill.
Does AI increase or decrease inequality at work?
Both, depending on the mix of work. A role dominated by standardized tasks tends to equalize; a role dominated by open-ended judgment tends to stratify. Most real jobs are a blend, so the net effect depends on which kind of work carries the weight.
How should managers respond?
Classify work by whether it has a known standard. Deploy AI as a leveler on standardized work and as a supervised multiplier on judgment work — pairing it with stronger evaluation wherever weak output is hardest to detect.