February 25, 2025 · updated May 9, 2026 · 3 min read

The 'multi-agent' marketing flag appears whenever Shopify-style memos run ahead of actual deployments.

The multi-agent marketing flag started appearing in vendor decks through late 2024 and accelerated into early 2025 after the Tobi Lütke memo at Shopify made the AI-usage question a CEO-level talking point at every public company. The flag is the term "multi-agent" deployed in the product positioning to suggest a more sophisticated AI architecture than the underlying engineering actually represents. The diagnostic test for whether the flag is real is the orchestration primitive. If the vendor cannot describe one, the deck is selling a single agent with subroutines under multi-agent positioning, and the buyer is being sold.

The orchestration primitive is the engineering-class abstraction that defines how the multiple agents coordinate. The orchestration primitive answers questions like: how does agent-A determine that agent-B should be invoked, what state is shared between them, what mechanism resolves conflicts when their outputs disagree, what handoff infrastructure carries context between them, what error-recovery happens when one of them fails. Real multi-agent systems have explicit, named answers to these questions. Marketing-multi-agent systems have hand-waved answers that fall apart on careful inquiry.

Several common patterns surface when the diagnostic test is applied. The vendor describes a workflow where one model calls another model with a different prompt as if that constitutes multi-agent. It does not. The vendor describes a workflow where one agent decomposes a task into sub-tasks and assigns them to specialty agents the vendor created, with no actual coordination infrastructure between the specialty agents. That is task-decomposition, not multi-agent. The vendor describes a workflow where the user can choose between several agents the vendor offers. That is multi-tenant single-agent, not multi-agent.

The cases where the multi-agent term legitimately applies are narrower than the marketing usage suggests. They include systems where multiple agents have genuinely independent state and coordinate through an explicit communication protocol. Systems where multiple agents represent different stakeholder perspectives in a negotiation or evaluation context, with the orchestration primitive arbitrating between them. Systems where multiple agents operate across organizational boundaries and the orchestration primitive handles the cross-organization coordination. These cases exist and are operationally meaningful. They are also rarer than the marketing usage implies.

The implication for the buyer-class evaluating multi-agent vendor pitches is straightforward. Ask the vendor to describe the orchestration primitive. The vendor that has a clear answer (with named components, documented behavior, audit-and-trace infrastructure for the orchestration layer) is in the small subset of vendors actually shipping multi-agent systems. The vendor that does not have a clear answer is using the multi-agent term as marketing positioning, and the buyer should evaluate the underlying system as a single-agent-with-subroutines architecture rather than as the more sophisticated multi-agent architecture the marketing implies.

The diligence value of the diagnostic test is the same shape as the retraining-cycle question discussed elsewhere. The test filters the vendor population fast. Vendors who pass demonstrate engineering-class maturity that translates into operational reliability. Vendors who fail demonstrate marketing-class positioning that translates into deployment risk.

The flag became prominent because the discourse pressure from CEO-level AI memos created demand for vendor positioning that felt sufficiently advanced. Multi-agent was one of the terms the marketing class reached for to fill the demand. The term's actual meaning in engineering practice is narrower than the marketing usage. The diligence test surfaces the gap. Apply it. Allocate evaluation time accordingly.

The Shopify memo and its peers will continue to produce CEO-level pressure on AI capability claims. Vendors will continue to reach for whatever terminology fills the pressure. Multi-agent is one term; agentic AI is another; autonomous AI is a third. Each one has the same diagnostic shape: ask for the engineering primitive that the term describes, and evaluate the vendor against the answer rather than against the marketing presentation.

—TJ