    March 31, 2025 · updated May 9, 2026 · 6 min read

    The demo video died when Google AI Overviews recommended eating rocks. Here is what replaced it.

    By Thomas Jankowski, aided by AI
    Demo-to-production gap made visible— TJ x AI

    The cinematic-cut AI demo video became the default vendor-positioning artifact through 2023-2024. The aesthetic was consistent across vendors: a polished narrator, a well-lit interface shot, a small set of carefully-curated example queries, response generation cut to feel real-time, and a closing call-to-action that emphasized capability without committing to operational reliability claims. The trade-press coverage often embedded the demo videos as primary evaluation evidence, with the demos serving as the implicit proof of vendor capability.

    The aesthetic started losing credibility through 2024-2025 as the gap between demo capability and production reliability became impossible to ignore across multiple visible incidents. In mid-2024, Google's AI Overviews recommended that users eat rocks and put glue on pizza, among other unsafe or incorrect outputs. That was the visible inflection point, because the company's reputation and the scale of the production deployment made the gap impossible to dismiss as a one-off. The McDonald's drive-thru AI walkback, the Klarna customer-service rebuild, and the various OTA-concierge-AI repositioning events through 2024-2025 reinforced the pattern. By late 2025, operator-class buyers had substantially stopped trusting cinematic-cut demo videos as evidence of production capability.

    The replacement that emerged is the live-transcript-with-timestamps standard. This essay walks through what changed, why it changed, and what the new standard looks like for vendors that want to be taken seriously by the post-2025 operator-level buyer.

    Why the cinematic cut stopped working

    The cinematic cut worked through 2022-2023 because the underlying model capability was genuinely improving fast enough that the curated demo represented a plausible projection of where the production deployment would land within months. The buyer evaluating a demo could reasonably bridge the demo-to-production gap with the assumption that the model trajectory was approaching the demo capability.

    The trajectory through 2024 was different. The model capability continued improving but the gap between the curated-demo capability and the production-deployment capability widened rather than narrowed. The cinematic-cut demo was no longer a plausible projection of production reliability; it was an aspirational presentation that the actual deployment would not match. Buyers who relied on the demos as evaluation evidence were systematically over-paying for capability that the production deployment did not deliver.

    The eating-rocks incident at Google was the inflection point because it surfaced the gap publicly at a vendor whose engineering reputation should have been able to close it. If Google's AI Overviews could ship outputs of that shape at production scale, the buyer-class read on the demo-vs-production gap had to be revised across the broader vendor landscape. Buyers had been giving cinematic-cut demos benefit-of-the-doubt; the inflection point removed the benefit.

    What replaced it

    The live-transcript-with-timestamps standard that emerged through late 2024 and into 2025 has several visible characteristics.

    The vendor presents production-deployment evidence rather than curated-demo evidence. The presentation includes timestamped transcripts of real production interactions, with the interactions selected from a recent window (typically the last 30-90 days), with a sufficient sample size to demonstrate behavior across the realistic input distribution, and with the failure cases included in the sample alongside the success cases.
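    One way to make the selection criteria above concrete is to draw the transcript sample uniformly at random from the recent window, so that failures show up in roughly the proportion they occur in production. The sketch below is illustrative only: the Transcript record, its field names, and the default window and sample size are assumptions, not any vendor's actual schema.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Transcript:
    # Hypothetical record shape; the field names are illustrative assumptions.
    interaction_id: str
    timestamp: datetime
    failed: bool  # True if the interaction was flagged as a failure


def sample_evidence(transcripts, now, window_days=90, sample_size=200, seed=0):
    """Draw a uniform random sample from the last `window_days` days.

    Sampling uniformly, rather than hand-picking, keeps failure cases in
    the evidence alongside the successes.
    """
    cutoff = now - timedelta(days=window_days)
    recent = [t for t in transcripts if cutoff <= t.timestamp <= now]
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    k = min(sample_size, len(recent))
    sample = rng.sample(recent, k)
    return sorted(sample, key=lambda t: t.timestamp)


def failure_rate(sample):
    """Share of sampled interactions flagged as failures (0.0 if empty)."""
    return sum(t.failed for t in sample) / len(sample) if sample else 0.0
```

    Publishing the seed and the window boundaries alongside the sample would let a buyer check that the transcripts were not curated after the fact.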

    The vendor presents the operational-metrics infrastructure that supports the deployment. The infrastructure includes the eval-deltas process, the regression-detection mechanism, the rollback-and-remediation workflow, the audit trail for the production interactions, and the human-review-and-correction workflow that addresses the failure cases. The buyer can evaluate the vendor's operational maturity through the depth and discipline of the supporting infrastructure rather than through the polished surface of the demo.
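    As a rough illustration of what an eval-delta and regression check might look like, the sketch below compares pass rates per eval suite between a baseline and a candidate release. The suite names, the pass-rate dictionaries, and the 0.02 tolerance are all hypothetical; a real process would be considerably more involved.

```python
def eval_delta(baseline, candidate):
    """Per-suite change in pass rate (candidate minus baseline).

    Only suites present in both runs are compared.
    """
    return {
        name: candidate[name] - baseline[name]
        for name in baseline
        if name in candidate
    }


def regressions(baseline, candidate, tolerance=0.02):
    """Names of suites whose pass rate dropped by more than `tolerance`."""
    deltas = eval_delta(baseline, candidate)
    return sorted(name for name, delta in deltas.items() if delta < -tolerance)


# Hypothetical pass rates per eval suite, for illustration only.
baseline = {"refunds": 0.94, "billing": 0.90, "routing": 0.88}
candidate = {"refunds": 0.95, "billing": 0.85, "routing": 0.87}
flagged = regressions(baseline, candidate)
```

    In this made-up example only the billing suite is flagged, since its pass rate falls by more than the tolerance; a flagged suite would be the natural trigger for the rollback-and-remediation workflow.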

    The vendor commits to performance reporting against the buyer's specific use case after deployment. The commitment includes the performance metrics that will be tracked, the cadence of reporting, the threshold at which the vendor will trigger remediation work, and the contractual implications of sustained underperformance against the threshold. The post-deployment performance commitment replaces the pre-deployment capability claim as the primary basis of evaluation.
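    The commitment described above can be sketched as a small check run at each reporting period. This is a minimal sketch under stated assumptions: the Commitment fields, the metric name, and the rule that "sustained" means a fixed number of consecutive periods below threshold are illustrative choices, not terms from any actual contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commitment:
    # Hypothetical contract terms; names and semantics are assumptions.
    metric: str
    threshold: float        # minimum acceptable score per reporting period
    sustained_periods: int  # consecutive misses that trigger contract terms


def evaluate(commitment, period_scores):
    """Return (remediate, escalate) for a chronological list of scores.

    remediate: the latest period fell below the threshold.
    escalate:  the last `sustained_periods` periods all fell below it.
    """
    if not period_scores:
        return False, False
    remediate = period_scores[-1] < commitment.threshold
    n = commitment.sustained_periods
    recent = period_scores[-n:]
    escalate = len(recent) == n and all(s < commitment.threshold for s in recent)
    return remediate, escalate


# Illustrative usage with made-up numbers.
terms = Commitment(metric="resolution_rate", threshold=0.90, sustained_periods=3)
one_miss = evaluate(terms, [0.95, 0.92, 0.88])
three_misses = evaluate(terms, [0.89, 0.85, 0.88])
```

    Separating the single-period trigger from the sustained one mirrors the distinction in the text between remediation work and the contractual implications of sustained underperformance.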

    What this implies for vendor positioning

    For AI vendors selling into the post-2025 operator buyer, the practical advice is to retire the cinematic-cut demo as the primary positioning artifact and replace it with the live-transcript-with-timestamps standard. The retirement is structurally hard for several reasons. The cinematic cut was a comfortable artifact for the marketing-and-sales class, with predictable production cost and well-understood narrative beats. The live-transcript standard is harder to produce, less predictable in narrative shape, and requires the vendor to be substantially more honest about the operational reality of the deployment.

    The vendors that made the transition early are reporting better post-meeting buyer engagement, longer sales cycles with more sophisticated diligence, and higher win rates on the deals that make it through the diligence cycle. The vendors that have continued to lead with cinematic-cut demos are reporting flatter buyer engagement and a harder time getting through diligence. The shift is operational and visible in the conversion data the vendor class tracks.

    For the trade-press class, the implication is that demo videos should be treated with substantially less weight as evaluation evidence in vendor coverage. The trade press that continues to embed demo videos as primary evidence is presenting evaluation that the post-2025 buyer-class has stopped accepting. The shift suggests the trade press should request live-transcript evidence as the basis of vendor coverage, with the demo video as supplementary material rather than primary evidence.

    For the buyer-class, the implication is to ask for live-transcript evidence on every vendor evaluation and to evaluate the vendor's response to the request as part of the diligence process. Vendors who can produce the evidence cleanly demonstrate operational maturity that the diligence process should reward. Vendors who cannot or who try to substitute curated examples are signaling deployment-risk that the diligence process should price.

    The eating-rocks moment at Google did not single-handedly kill the demo video, but it surfaced the structural gap that the cinematic cut had been obscuring. The post-2025 standard is harder for vendors to produce and substantially better for the buyer-class to evaluate against. The shift is durable. Vendors who recognize the shift and adapt early will close the deals the lagging vendors will lose.

    —TJ