AI Coding 2026-05-26 4 min read

Jules Getting 140,000 Code Improvements Is the Kind of Number That Makes Backlog Work Look Fragile

Google’s Jules is no longer just a charming coding-agent experiment. With 140,000 public code improvements, Gemini 2.5 Pro underneath, and 5x to 20x higher limits, it looks a lot more like an async engineering labor layer.

The high-click version writes itself: once an asynchronous coding agent starts shipping six-figure code improvements and higher-scale paid tiers, the old “agents are still early” comfort blanket starts looking thin.

Google’s Jules story got more serious in 2026 than a lot of people are admitting.

The easy headline is “Jules is out of beta.” The more important signal is the combination of:

real public task volume
higher usage tiers
stronger underlying reasoning
a cleaner async workflow story

That combination matters because it starts changing how teams think about background engineering work.

The number that should make people stop smirking

Google says that during Jules’ beta period:

thousands of developers used it
they tackled tens of thousands of tasks
that resulted in more than 140,000 code improvements shared publicly

That is not a toy number.

It does not mean every improvement was profound.

It does mean the product crossed out of “cute agent demo” territory and into “this is handling real volume” territory.

Volume changes how you evaluate these systems.

At low volume, you can dismiss them as PR.

At higher volume, you have to ask which kinds of engineering work are quietly becoming delegatable.

Why the tiering is the real market signal

Google is also introducing higher usage tiers:

introductory access for basic usage
Google AI Pro: 5x higher limits
Google AI Ultra: 20x higher limits

That tells you something very simple:

Google thinks some users are ready to make this part of their daily workflow, and some are ready to use it at serious scale.

This is how a product moves from “interesting capability” to “revenue-bearing habit.”

The moment vendors start packaging higher limits for intensive multi-agent workflows, they are no longer only selling curiosity. They are selling throughput.

Throughput is what threatens headcount assumptions around repetitive engineering work.
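To make the multipliers concrete, here is a minimal sketch of what the tiering implies in raw task capacity. Only the 5x and 20x figures come from Google's announcement; the baseline of 10 tasks per day is a made-up placeholder, and the function name is hypothetical. It also assumes "5x higher" means five times the introductory limit.

```python
# Multipliers are from the tier list above; everything else is illustrative.
TIER_MULTIPLIERS = {
    "introductory": 1,
    "Google AI Pro": 5,
    "Google AI Ultra": 20,
}


def daily_task_limits(baseline_tasks_per_day: int) -> dict[str, int]:
    """Scale a hypothetical baseline limit by each tier's multiplier."""
    if baseline_tasks_per_day < 0:
        raise ValueError("baseline must be non-negative")
    return {
        tier: baseline_tasks_per_day * multiplier
        for tier, multiplier in TIER_MULTIPLIERS.items()
    }


if __name__ == "__main__":
    # Assumed baseline of 10 tasks/day, NOT a published number.
    for tier, limit in daily_task_limits(10).items():
        print(f"{tier}: {limit} tasks/day")
```

Whatever the real baseline is, the shape is the point: the gap between the bottom and top tier is a 20-fold difference in how much background work one account can queue.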

Why Gemini 2.5 Pro underneath matters

Google says Jules now uses the advanced thinking capabilities of Gemini 2.5 Pro to develop coding plans, producing higher-quality outputs.

That is important because coding agents do not fail only on raw code generation. They fail when they misunderstand the work, scope the task badly, or lose the thread while crossing files and tooling.

Better planning changes:

how often the task is scoped correctly
how often the changes are reviewable
how much supervision the user has to spend just to get the task back on track

If the planning layer gets better, the whole async-agent idea gets more plausible.

Why asynchronous work is the scary part

Autocomplete changes how you type.

Async coding agents change how you allocate attention.

Jules is explicitly built around:

cloning the repo into a secure cloud VM
understanding project context
working in the background
returning with a plan, reasoning, and a diff

That matters because it attacks a class of engineering work people hate but tolerate:

version bumps
small bug fixes
writing tests
repetitive maintenance
ticket cleanup

When that work can run off to the side while you do something else, the whole unit of “developer effort” starts looking different.
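One rough way to picture that shift is to split a backlog by the categories listed above. The sketch below is not how Jules triages anything; the `Ticket` type, the category labels, and `split_backlog` are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass

# Category labels mirror the list above; the exact strings are assumptions.
DELEGATABLE_CATEGORIES = {
    "version-bump",
    "small-bug-fix",
    "tests",
    "maintenance",
    "ticket-cleanup",
}


@dataclass(frozen=True)
class Ticket:
    title: str
    category: str


def split_backlog(tickets: list[Ticket]) -> tuple[list[Ticket], list[Ticket]]:
    """Return (candidates for an async agent, work that stays with humans)."""
    delegate = [t for t in tickets if t.category in DELEGATABLE_CATEGORIES]
    keep = [t for t in tickets if t.category not in DELEGATABLE_CATEGORIES]
    return delegate, keep


if __name__ == "__main__":
    backlog = [
        Ticket("Bump lodash to latest patch", "version-bump"),
        Ticket("Redesign billing data model", "architecture"),
        Ticket("Add unit tests for date parser", "tests"),
    ]
    delegate, keep = split_backlog(backlog)
    print("Delegate:", [t.title for t in delegate])
    print("Keep:", [t.title for t in keep])
```

The filter itself is trivial. What it depends on is not: tickets have to be labeled and scoped well enough for the split to mean anything.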

Why this is bad news for weak coding wrappers

Jules pressures a specific kind of product:

coding assistants that mainly sit in the editor
products with thin repo-level reasoning
wrappers whose moat is “Google does not have this natively”

That moat gets shakier when Google has:

the model
the cloud runtime
the consumer distribution layer
increasingly credible agent behavior

That does not mean Jules wins everything.

It does mean the bar for competing got higher again.

The hidden risk teams should take seriously

The more these systems improve, the less the bottleneck is typing and the more it is:

task framing
permissions
code review quality
integration discipline
organizational trust

That is why stronger agents do not eliminate good developers.

They punish sloppy teams.

If your repo discipline is weak, your issue hygiene is bad, and your review culture is lazy, async agents do not magically save you. They amplify the mess faster.

The blunt takeaway

Jules matters because it is no longer selling the fantasy of autonomous coding in the abstract. It has six-figure visible output volume, higher-tier usage packaging, and stronger reasoning underneath. That is exactly the kind of combination that turns “interesting beta” into a real workflow contender.

The backlog work many teams still treat as human default is starting to look more exposed than comfortable people want to admit.
