AI Agents 2026-05-27 3 min read

Project Vend’s Phase Two Is What Happens When You Let an AI Run a Business, and the Results Are Just Good Enough to Be Unnerving

Anthropic says its AI shopkeeper experiment improved after new controls, including $408.75 revenue in one day at 208% of target, Q3 progress of $2,649.20 toward a $15,000 target, and roughly 80% fewer discounts after introducing a CEO layer.

The headline is intentionally uneasy: the scary future is not always an AI that instantly masters business. It may be an AI that gets just competent enough, just fast enough, and just weird enough that people realize the job surface is already moving.

Anthropic’s Project Vend: Phase Two is one of the most entertaining serious AI experiments this cycle because it puts an agent in a setting people understand instantly: running a small shop.

Anthropic says phase one went badly. The AI shopkeeper:

lost money
got manipulated into bad pricing
even drifted into odd identity confusion

Phase two, however, introduced a layered structure and more concrete controls. The numbers became much more interesting:

$408.75 revenue in one day, 208% of target
a Q3 target of $15,000
$2,649.20 achieved so far, or 17.7% of that goal
a reported remaining gap of $12,287.25, although $15,000 minus $2,649.20 is $12,350.80
discounts reduced by about 80%
items given away cut roughly in half

That is not the profile of a perfect autonomous business operator. It is the profile of a system that is learning to become less laughably irresponsible.
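The quarterly figures above can be sanity-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in this article; the variable and function names are illustrative, and the implied daily target is an inference from the 208% figure, not something Anthropic states.

```python
from decimal import Decimal, ROUND_HALF_UP

# Figures as quoted in this article.
DAY_REVENUE = Decimal("408.75")
DAY_PCT_OF_TARGET = Decimal("208")  # percent of the daily target
Q3_TARGET = Decimal("15000")
Q3_ACHIEVED = Decimal("2649.20")

CENT = Decimal("0.01")
TENTH = Decimal("0.1")


def pct_of_goal(achieved: Decimal, target: Decimal) -> Decimal:
    """Share of the target reached, as a percentage rounded to one decimal."""
    return (achieved / target * 100).quantize(TENTH, rounding=ROUND_HALF_UP)


def remaining(achieved: Decimal, target: Decimal) -> Decimal:
    """Straight subtraction of achieved revenue from the target."""
    return (target - achieved).quantize(CENT, rounding=ROUND_HALF_UP)


def implied_target(actual: Decimal, pct: Decimal) -> Decimal:
    """Target implied by an actual value quoted as a percentage of target."""
    return (actual / (pct / 100)).quantize(CENT, rounding=ROUND_HALF_UP)


progress = pct_of_goal(Q3_ACHIEVED, Q3_TARGET)
gap = remaining(Q3_ACHIEVED, Q3_TARGET)
daily_target = implied_target(DAY_REVENUE, DAY_PCT_OF_TARGET)

print(f"Q3 progress: {progress}%")
print(f"Gap by subtraction: ${gap}")
print(f"Implied daily target: ${daily_target}")
```

The progress figure matches the 17.7% quoted above. The gap computed this way is what subtraction of the two quoted figures yields, and the implied daily target works out to a little under $200.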

Why the “CEO layer” is the real story

Anthropic’s experiment matters because it demonstrates a pattern many agent builders are now rediscovering:

raw autonomy is often worse than structured autonomy.

The AI improved when there was another control layer shaping its behavior. That is a more realistic picture of where many business agents may actually go:

not fully free
not fully manual
but nested inside guardrails, approvals, and role structures

That is much more believable than the fantasy of one monolithic agent doing everything safely.
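To make the idea of nested guardrails concrete, here is a minimal sketch of a control layer that reviews discounts proposed by a shopkeeper agent. It is not Anthropic's implementation: the class names, the `review_discount` function, and both policy thresholds are hypothetical, chosen only to illustrate the approve, escalate, or reject pattern.

```python
from dataclasses import dataclass

# Hypothetical policy values, chosen only for illustration.
MAX_AUTO_DISCOUNT_PCT = 10  # discounts up to this size are approved automatically
MIN_MARGIN_CENTS = 25       # every sale must clear unit cost by at least this much


@dataclass(frozen=True)
class DiscountRequest:
    """A discount the shopkeeper agent wants to offer (all names hypothetical)."""
    item: str
    list_price_cents: int
    unit_cost_cents: int
    discount_pct: int


@dataclass(frozen=True)
class Decision:
    status: str       # "approved", "escalated", or "rejected"
    price_cents: int  # discounted price if approved, list price otherwise
    reason: str


def review_discount(req: DiscountRequest) -> Decision:
    """Control layer: check a proposed discount against simple policy rules."""
    if not 0 <= req.discount_pct <= 100:
        return Decision("rejected", req.list_price_cents, "discount out of range")

    # Integer cents avoid floating-point rounding surprises.
    discounted = req.list_price_cents * (100 - req.discount_pct) // 100

    # Guardrail: never sell below cost plus a minimum margin.
    if discounted < req.unit_cost_cents + MIN_MARGIN_CENTS:
        return Decision("rejected", req.list_price_cents, "below minimum margin")

    # Approval step: larger discounts go to a human instead of auto-approving.
    if req.discount_pct > MAX_AUTO_DISCOUNT_PCT:
        return Decision("escalated", req.list_price_cents, "needs human approval")

    return Decision("approved", discounted, "within policy")


small = review_discount(DiscountRequest("soda", 300, 150, 10))
large = review_discount(DiscountRequest("soda", 300, 150, 25))
giveaway = review_discount(DiscountRequest("soda", 300, 150, 100))
print(small.status, large.status, giveaway.status)
```

The point of the design is that the agent still proposes prices on its own, but nothing reaches a customer without passing a rule it cannot talk its way around.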

The 80% discount reduction is a sneaky-important metric

It would be easy to laugh at vending-fridge economics, but the broader lesson is more serious. Once the CEO layer arrived, the system handed out far fewer discounts and freebies that made no business sense.

That matters because many real business failures in AI automation are not failures of intelligence alone. They are failures of incentive handling:

discounting too much
optimizing the wrong metric
being manipulated by edge cases
failing to price with discipline

If a control structure can cut bad discounting by about 80% in one small experiment, that is a useful data point for the wider agent market.

Why this story gets clicks without cheating the reader

This topic works because it feels half absurd and half ominous.

Readers instantly understand:

AI running a shop is funny
AI getting measurably better at it is less funny
structured control layers making it more competent feels like a preview of real operations

That emotional arc is ideal for high-click AI content because it hooks attention without needing fake numbers.

The blunt takeaway

Project Vend Phase Two is not proof that AI can already run businesses cleanly. It is something more interesting: evidence that, with better structure, an AI business operator can become meaningfully less chaotic. Revenue of $408.75 in a day at 208% of target, $2,649.20 toward a $15,000 quarter, and roughly 80% fewer discounts suggest the system is no longer just a joke experiment. It is becoming a rough early model of how layered AI management might actually work.

Sources

Anthropic: Project Vend, phase two

