May 9, 2026 · 6 min read

What the year of vibes actually changed: vibe coding is a real productivity shift, not a discourse moment.

TL;DR

Karpathy's February 2025 'shower of thoughts' about vibe coding kicked off a year of trade-press oscillation between 'transformative' and 'cope.' The honest read sits between the two. Vibe coding produced a real productivity shift on three measurable axes (developer expectations of agent output; prototyping cycle time; the modal output of new junior developers). Vibe coding produced no shift on three other axes that matter (production-grade output quality; multi-engineer codebase coordination; the long-cycle decisions that determine what gets built). What follows is a futurist read on what survives the discourse; ADR-0031 applies.

By Thomas Jankowski, aided by AI. *Productivity on three axes, stasis on three others*

The trade-press cycle on vibe coding followed the shape every category-of-the-year cycle follows. Karpathy posted his shower-of-thoughts riff in early February 2025 and the four-act discourse arc compressed into eight weeks. Act one: vibe coding is the future of software, the IDE is dead, the senior engineer is obsolete. Act two: vibe coding is irresponsible, the technical debt is going to be enormous, juniors will never learn the fundamentals. Act three: walkback. The serious operators rename it agentic engineering, dial down the rhetoric, and move on. Act four: the trade press writes the post-mortem about how the discourse got out of hand.

By March, with two months to evaluate the actual production-codebase consequences, the read that survives sits between the two early extremes. Vibe coding is a real productivity shift on three measurable axes. Vibe coding is not a productivity shift on three other axes that matter as much or more. The futurist read is which axes are which, and what that implies about where the next 12-18 months of operator-class engineering practice lands.

This piece is a futurist essay in the carve-out sense of [ADR-0031](decisions/0031-futurist-register-carveout.md). Forward-looking, conditional, register-loaded. Three things changed. Three things did not. Read the columns flat against each other.

What changed: developer expectations of agent output

The first measurable shift is in what a senior engineer expects an LLM coding assistant to produce on a single turn. The 2024 expectation was a 5-20 line snippet, contextually correct, requiring 1-3 follow-up edits before it merged. The 2025 expectation, post-vibe-coding-cycle, is a 100-500 line file, structurally correct, requiring 1-3 review-class edits before it merges. The shift is roughly an order of magnitude in the unit of useful agent output.

This is durable. The expectation does not retreat. Engineers who calibrated their working model to the 2025 expectation produce more software per unit of engineering time than engineers calibrated to the 2024 expectation. The productivity delta is real and roughly 2-3x on the kinds of work where the agent-output unit of useful work is the operator's bottleneck (greenfield prototyping, internal tooling, scaffolding, write-once-read-many infrastructure).

The shift is not equal across all engineering work, but it is real on the slice it covers, and that slice is non-trivial.

What changed: prototyping cycle time

The second measurable shift is the cycle time on a working prototype. The 2024 cycle for a non-trivial prototype (a working CRUD app with auth, a small ML pipeline with eval, a data-tool with a UI) ran 3-7 days for a senior engineer working alone. The 2025 cycle for the same prototype runs 3-7 hours.

This is the kind of compression that changes the question of what to prototype. When a prototype takes days, the senior engineer prototypes only the few ideas that look promising. When a prototype takes hours, the senior engineer prototypes a broader fan-out of ideas, including the ones that look unpromising but might surprise. The exploration tree gets wider.

Operationally, the cycle-time compression most benefits teams whose work is exploratory (early-stage product builds; internal-tooling discovery; research-engineering at AI labs; one-off analysis that does not need to live in a maintained codebase). Teams whose work is execution against a known spec (a feature in a maintained product; an integration with a known partner) see less compression because the spec-and-review work, not the implementation work, was already the bottleneck.
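
The bottleneck argument above can be put into rough numbers with an Amdahl's-law style calculation. This is a sketch under stated assumptions, not a measurement: the function name `overall_speedup`, the implementation shares (0.9 and 0.3), and the 10x implementation speedup are all hypothetical figures picked to make the arithmetic visible.

```python
def overall_speedup(implementation_share: float, implementation_speedup: float) -> float:
    """Amdahl's-law style estimate: only the implementation share gets faster.

    implementation_share   -- fraction of total effort spent implementing (0..1)
    implementation_speedup -- how much faster that fraction becomes (> 0)
    """
    if not 0.0 <= implementation_share <= 1.0:
        raise ValueError("implementation_share must be between 0 and 1")
    if implementation_speedup <= 0:
        raise ValueError("implementation_speedup must be positive")
    # Spec-and-review work is assumed to take exactly as long as before.
    unchanged = 1.0 - implementation_share
    return 1.0 / (unchanged + implementation_share / implementation_speedup)


# Hypothetical teams; the shares below are assumptions for illustration only.
exploratory = overall_speedup(implementation_share=0.9, implementation_speedup=10.0)
spec_driven = overall_speedup(implementation_share=0.3, implementation_speedup=10.0)

print(f"exploratory team: {exploratory:.2f}x")  # 5.26x
print(f"spec-driven team: {spec_driven:.2f}x")  # 1.37x
```

Under these assumed shares, the same tool gives the exploratory team roughly a 5x end-to-end gain and the spec-driven team well under 1.5x, which is the asymmetry the paragraph describes.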

This shift is also durable, and it changes how operators think about the value of an idea that might not work.

What changed: the modal output of new junior developers

The third measurable shift is in what new junior developers in 2025 write as their default output. The 2024 junior wrote handwritten code, increasingly with autocomplete-class assistance, with the senior reviewer correcting style, structure, and idiom. The 2025 junior writes a prompt-and-spec, runs the agent, reviews the agent's output, and submits the agent's output (with edits) for review. The senior reviewer's job has shifted from style-and-structure to spec-quality and review-quality.

This is a real shift, and the trade-press worry that juniors won't learn fundamentals is partially right and partially missing the point. Juniors learn fewer of the syntactic and idiom-class fundamentals because the agent does that work. Juniors learn more of the spec-writing, review-quality, and decomposition fundamentals because those are now the operator differentiators. The junior's value proposition shifts. The senior reviewer adapts the mentoring approach.

The shift will calibrate over the next 12-24 months. The question of whether the new junior modal output produces senior engineers as good as the old junior modal output produced is empirically unsettled. The pessimist read is that it does not. The optimist read is that it produces senior engineers with different and arguably more valuable skills. The honest read is that we will know in 5-7 years and the operator class should plan for both possibilities.

What did not change: production-grade output quality

The first non-shift is on the production-grade slice of the work. The agent-output unit of useful work is real on prototyping, scaffolding, internal tooling, write-once-read-many infrastructure. It is not real, in 2025, on production-grade output quality.

By production-grade I mean: code that ships to paying users, runs at non-trivial scale, has uptime and latency commitments, integrates with multi-team coordination, and lives for years in a maintained codebase. The agent-output unit on that slice is closer to the 2024 expectation than the 2025 vibe-coding expectation. The senior engineer reviewing agent output for production-grade work still does the structural review the senior engineer did in 2024. The agent saves implementation time on subroutines and well-scoped tasks within the work; it does not save the structural-review-and-integration time that determines whether the production-grade output is actually production-grade.

This is durable through 2026 and likely well beyond. The bottleneck on production-grade output is engineering judgment under coordination pressure, not implementation throughput. The vibe-coding cycle did not move the engineering-judgment bottleneck.

What did not change: multi-engineer codebase coordination

The second non-shift is on the coordination layer between engineers in a non-trivial codebase. A team of 5-15 engineers working in a shared codebase has coordination cost that scales with the number of engineers, the complexity of the shared abstractions, and the rate of change in the codebase. The vibe-coding shift increases the rate of change (engineers ship more code per unit time) without changing the coordination layer's capacity. The result is more coordination cost, not less.
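
A toy model makes the mechanism concrete. It is a minimal sketch, assuming that coordination surface can be approximated by pairwise communication channels and that review capacity is a fixed weekly number; both are simplifications introduced here, and every name and figure (10 engineers, 3 versus 6 PRs per engineer per week, 40 reviews per week) is hypothetical.

```python
def communication_channels(engineers: int) -> int:
    """Pairwise channels between engineers: n * (n - 1) / 2."""
    if engineers < 0:
        raise ValueError("engineers must be non-negative")
    return engineers * (engineers - 1) // 2


def review_backlog(prs_per_week: float, reviews_per_week: float, weeks: int) -> float:
    """Unreviewed PRs after `weeks`, with fixed review capacity and no carry-over credit."""
    backlog = 0.0
    for _ in range(weeks):
        # The queue grows by the weekly surplus and never drops below zero.
        backlog = max(0.0, backlog + prs_per_week - reviews_per_week)
    return backlog


team_size = 10            # hypothetical team
review_capacity = 40.0    # hypothetical reviews the team can complete per week

before = review_backlog(prs_per_week=team_size * 3, reviews_per_week=review_capacity, weeks=8)
after = review_backlog(prs_per_week=team_size * 6, reviews_per_week=review_capacity, weeks=8)

print(communication_channels(team_size))  # 45
print(before)                             # 0.0
print(after)                              # 160.0
```

In this model, doubling per-engineer output while review capacity stays flat turns an empty queue into a backlog that grows every week, with the same 45 channels left to absorb it.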

In practice this shows up as: more PR review queue depth, more merge-conflict surface area, more architectural drift between engineers working on adjacent areas, and more integration-test failures driven by underspecified inter-component contracts. The teams that handled this well in 2024 are handling it about as well in 2025, with the volume of coordination work scaled up. The teams that handled it badly in 2024 are now drowning.

This non-shift is the dominant 2026-2027 operator-grade problem in any team beyond the small-cohort scale. The productivity gains from vibe coding are real per-engineer, but the team-scale productivity is constrained by coordination, and coordination has not been vibe-coded.

What did not change: the long-cycle decisions

The third non-shift is the most consequential. The long-cycle decisions in software (what to build, who to build it for, how to differentiate, what the operator's strategic moat is) were not made faster, more accurately, or differently because of vibe coding. The decisions are made by the same operators against the same evidence base, with marginal-class assistance from agent-driven research and analysis tooling.

This matters because the long-cycle decisions determine the value of the productivity gains. A team that vibe-codes its way to ten times as many prototypes is not operating coherently if those prototypes are calibrated to the wrong product strategy. The question of what to build did not get easier in 2025. The question of how to build it did. The two questions are not the same.

The implication is that the operator advantage in 2026-2027 sits with the operators whose long-cycle judgment is good and whose execution speed is now compounded by the vibe-coding shift. The operators whose long-cycle judgment is poor will produce more output faster against the wrong strategy, which is structurally a worse position than producing less output slower against the wrong strategy. The faster wrong-direction operators waste capital faster.

The futurist read

Through 2026-2028 the operator-class engineering practice settles into a stable shape that integrates the three shifts and routes around the three non-shifts. Senior engineers run agent-driven prototyping for the exploration phase of work, switch into structural-review and coordination mode for the production-grade phase, and apply the same long-cycle judgment they always did to the question of what to build. Junior engineers calibrate around the new modal output and the senior-mentoring shape that the new modal output requires. Coordination tooling improves slowly, with the gains coming from better PR-review tools, more rigorous spec-writing practice, and stronger architectural-discipline enforcement. The long-cycle decisions stay with humans for the foreseeable future, and the operators whose long-cycle judgment is good run ahead of the operators whose long-cycle judgment is not.

The discourse called the year of vibes a transformative shift. The walkback called it cope. The futurist read is that it was real on three axes that matter and irrelevant on three other axes that matter as much or more. The operators who ran with the discourse moved fast in directions that the long-cycle judgment did not validate, and that is the operator-grade lesson the next cycle will have to absorb.

The vibes shift is real. It is also not the operator story the trade press told it as. The actual category-of-2025-2027 lesson is the one neither extreme of the discourse made room for.

—TJ