Shadow Agents Are an Architecture Problem
You are not building for the agents you have today. You are building for the agents you will have in two years. The numbers are not on your side.
The failure rate is not random. The 20% did the architectural work. The 80% did not.
One in five enterprise AI programmes is in production at scale. Four in five are not. RAND puts the failure rate at 80%. IDC has 88% of proofs-of-concept failing to graduate. MIT's NANDA report says 95%, on a methodology that has been challenged. Pick the most generous read. The ratio holds.
I will say something I have said before, because it matters more now than the last time I said it. Enterprise architecture and solution architecture have been in and out of fashion for fifteen years. Right now they are the work that decides whether AI lands on the P&L or joins the decommissioned-pilot pile. Building any structure without a blueprint, a foundation and structural review produces something that crumbles. The 20% built scaffolding, rigour and measurement. They iterated. They are not in production by accident.
Three signals from the last six weeks make this concrete.
Larry Ellison on Oracle's earnings call: every frontier model is trained on the same public internet. Same Wikipedia, same Reddit, same archive. Converging in quality, falling in price, eroding in differentiation. His word for them was commodities. Oracle moved FY26 capex from $35bn to $50bn behind that thesis. Backlog $523bn. The bet is not on which model wins. The bet is on the infrastructure that lets enterprises run secure reasoning across data nobody else can access.
This is the data architecture argument said out loud by the most senior public-markets voice in enterprise software.
The 20% have already concluded what Ellison just said publicly. They stopped running model selection as the strategic decision two cycles ago. Their AI strategy is not "which frontier model do we standardise on." It is "which of our proprietary data assets is most defensible, and how do we build a portable reasoning layer over it that lets us swap models out as they evolve." Model endpoints are abstracted behind an integration layer. Reasoning over governed data is the IP.
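A minimal sketch of what "model endpoints abstracted behind an integration layer" can look like in code. Everything here is hypothetical and for illustration only: `ReasoningLayer`, `vendor_a` and `vendor_b` are invented names, and the endpoints are stand-in functions rather than real vendor SDK calls. The point is only that the workflow talks to the layer, so swapping the model is a one-line change.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical type: a model endpoint is anything that maps a prompt to text.
ModelEndpoint = Callable[[str], str]


@dataclass
class ReasoningLayer:
    """Illustrative integration layer: workflows depend on this, not on a vendor."""

    endpoints: Dict[str, ModelEndpoint]
    active: str

    def swap(self, name: str) -> None:
        # Changing the model is a configuration change, not a re-engineering exercise.
        if name not in self.endpoints:
            raise KeyError(f"unknown endpoint: {name}")
        self.active = name

    def answer(self, question: str, governed_records: List[str]) -> str:
        # The governed data is assembled here, independently of the model in use.
        context = "\n".join(governed_records)
        prompt = f"Context:\n{context}\n\nQuestion: {question}"
        return self.endpoints[self.active](prompt)


# Stand-in endpoints (assumptions, not real vendor APIs).
def vendor_a(prompt: str) -> str:
    return f"[vendor_a] {len(prompt)} chars"


def vendor_b(prompt: str) -> str:
    return f"[vendor_b] {len(prompt)} chars"


layer = ReasoningLayer(
    endpoints={"vendor_a": vendor_a, "vendor_b": vendor_b},
    active="vendor_a",
)
records = ["policy 17: claims over 10k need review"]

first = layer.answer("Does this claim need review?", records)
layer.swap("vendor_b")
second = layer.answer("Does this claim need review?", records)
```

Both calls build the identical prompt from the same governed records; only the endpoint behind the layer changes.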
The 80% are still running model selection as the strategic decision. They are coupling specific models to specific workflows. Every model release forces an internal re-engineering exercise. They have built their AI estate on a layer that the vendors themselves now describe as a commodity.
The architectural difference is the difference between "the model is the IP" (the 80% position) and "the data and the integration are the IP" (the 20% position). Ellison just made that distinction inarguable.
4 May: Anthropic launched an enterprise services joint venture with Blackstone, Hellman & Friedman, and Goldman Sachs. Combined commitment $1.5bn. Built to do one thing: help mid-sized businesses deploy Claude across their core operations. 21 May: the venture acquired Fractional AI, pulling it out of an eleven-month partnership with OpenAI.
12 May: OpenAI launched the Deployment Company with more than $4bn in initial capital. TPG, Advent, Bain Capital, Brookfield, fifteen other partners. Same day, it acquired Tomoro, an applied AI consulting firm bringing 150 forward-deployed engineers.
Two frontier labs, in the same fortnight, both spent serious capital on the layer between their models and the businesses trying to use them.
This is not coincidence. It is the labs walking themselves to the same conclusion every honest operator has been reaching for the last twelve months. The model is not where the work happens. The integration is. The supervisory perimeter is. The governance is. The labs would rather own that work than rent it out to system integrators who do not move at their pace.
This is the solution architecture argument settling on a $5.5bn answer.
The 20% built that layer themselves. They have the operating layer between the frontier model and the regulated business, and they own it. The labs are now paying nine figures to acquire the same capability the 20% have in-house. The 80% outsourced it to a system integrator on a multi-year engagement, or are waiting for the lab to ship it as a product. Both positions get worse as the labs accelerate. In twelve months the 80% will be renting the integration from the lab they bought their tokens from, on the lab's terms, and the lab will own the workflow context, the data exposure and the budget.
Anthropic launched Project Glasswing on 7 April 2026, built around an unreleased frontier model called Claude Mythos Preview. Thirty days of operation: more than 10,000 high or critical zero-day vulnerabilities flagged across every major operating system and web browser. The model generates working exploits on first attempt in more than 83% of cases. A 27-year-old vulnerability in OpenBSD. A 16-year-old vulnerability in FFmpeg in a line of code that five million automated test passes never caught. A 17-year-old remote code execution flaw in FreeBSD (CVE-2026-4747) that allows root on any machine running NFS.
The Glasswing partner list reads like a who's who of the industry. AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorgan, the Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks. IBM joined on 19 May. ENISA, the EU cybersecurity agency, joined on 1 June, the first EU institution and the first partner outside the US-UK axis. The Federal Reserve and Treasury convened bank CEOs on this in April. The IMF has flagged Mythos-class AI as a systemic risk.
The implication is simple. The moat Ellison described, proprietary data and defensible enterprise context, is only a moat if it has been assessed. If a frontier model can find a 27-year-old vulnerability in OpenBSD in a fortnight, and generate a working exploit on first attempt 83% of the time, the supervisory perimeter around your AI estate has to be a serious piece of engineering. Not an afterthought. Not a 2023 control framework with 2026 agents on top.
The 20% have mapped where AI is making or shaping decisions in their business, sanctioned and unsanctioned. They have identity at the agent level, not just at the user level. They have policy enforcement at the tool-call boundary. This is what MCP gateways like Natoma exist to provide, which is exactly why Snowflake acquired Natoma on 27 May. They have audit trails the regulator can read. They have kill-switches that work.
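The controls listed above (agent-level identity, policy enforcement at the tool-call boundary, a readable audit trail, a working kill-switch) can be sketched in a few lines. This is an illustration under stated assumptions, not the API of Natoma, MCP or any other product: `ToolGateway`, the agent name `claims-agent` and both tools are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set


@dataclass
class ToolGateway:
    """Hypothetical gateway: every tool call passes through call()."""

    policy: Dict[str, Set[str]]            # agent id -> tools it may call
    tools: Dict[str, Callable[..., Any]]   # tool name -> implementation
    killed: Set[str] = field(default_factory=set)
    audit_log: List[Dict[str, Any]] = field(default_factory=list)

    def kill(self, agent_id: str) -> None:
        # Kill-switch: all further calls from this agent are refused.
        self.killed.add(agent_id)

    def call(self, agent_id: str, tool: str, **kwargs: Any) -> Any:
        if agent_id in self.killed:
            decision = "denied:killed"
        elif tool not in self.policy.get(agent_id, set()):
            decision = "denied:policy"
        else:
            decision = "allowed"
        # Every attempt is recorded, whether or not it was allowed.
        self.audit_log.append(
            {"agent": agent_id, "tool": tool, "args": kwargs, "decision": decision}
        )
        if decision != "allowed":
            raise PermissionError(f"{agent_id} -> {tool}: {decision}")
        return self.tools[tool](**kwargs)


gateway = ToolGateway(
    policy={"claims-agent": {"lookup_policy"}},
    tools={
        "lookup_policy": lambda policy_id: f"policy {policy_id}",
        "issue_refund": lambda amount: f"refunded {amount}",
    },
)

result = gateway.call("claims-agent", "lookup_policy", policy_id="P-17")

try:
    gateway.call("claims-agent", "issue_refund", amount=50)
except PermissionError as exc:
    blocked = str(exc)

gateway.kill("claims-agent")
```

The identity being checked is the agent's, not the user's, and the denied refund attempt still lands in `audit_log`. After `kill`, even the previously allowed lookup is refused.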
The 80% have a 2023 security architecture and a 2026 attack surface. Writer's 2026 enterprise AI survey has 67% of executives admitting their company has already suffered a data leak via an unapproved AI tool. 35% admit they could not immediately "pull the plug" on a rogue agent. Those are not edge cases. They are the default condition.
The PRA, the Bank of England and the FCA will move on Mythos-class supervisory questions before the end of Q3, and now with EU supervisory peers that have the same tooling. The supervisory shift is no longer hypothetical, no longer transatlantic-asymmetric, no longer two years away.
The fourth signal does not come from a press release. It comes from the pattern that shows up every time you look closely at an AI programme that landed.
The 20% redesigned the workflow before they deployed the AI. They picked the workflows where outcome was measurable. They built the supervision layer before they shipped. They aligned the KPI to a business result, not a deployment milestone. The agent does not sit on top of the existing process. The process was reshaped around what the agent makes possible.
The 80% deployed AI into a 2019 org chart, ran the cost lever, and called the result transformation. Gartner has forecast that by 2027, 40% of enterprise AI agents will be demoted or decommissioned over governance gaps identified only after production incidents. The 80% are most of that 40%.
This is the process architecture argument. "Redesigning the workflow" is not a slogan. It is a sequence of decisions. Where the agent sits. What it owns. What it asks a human. What KPI moves when it works. What KPI moves when it fails. Who supervises. Who gets paged. The 20% have those questions answered before the model goes in. The 80% are answering them after the incident.
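One way to make that sequence of decisions checkable is to write it down as a record and refuse to deploy while any field is blank. The sketch below is an assumption about how a team might do this, not a description of any particular method; `WorkflowDesign`, its field names and the example values are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class WorkflowDesign:
    """Hypothetical record of the decisions listed above, one field per question."""

    agent_position: str      # where the agent sits
    agent_owns: str          # what it owns
    escalates_to_human: str  # what it asks a human
    success_kpi: str         # what KPI moves when it works
    failure_kpi: str         # what KPI moves when it fails
    supervisor: str          # who supervises
    on_call: str             # who gets paged


def unanswered(design: WorkflowDesign) -> List[str]:
    # Any blank field is a question that will otherwise be answered after the incident.
    return [f.name for f in fields(design) if not getattr(design, f.name).strip()]


design = WorkflowDesign(
    agent_position="first-line claims triage",
    agent_owns="classification and routing",
    escalates_to_human="any claim it cannot classify",
    success_kpi="median time to first decision",
    failure_kpi="",
    supervisor="head of claims operations",
    on_call="",
)

gaps = unanswered(design)
```

In this example the design is not ready: nobody has decided which KPI signals failure or who gets paged.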
This is the assessment. Four questions, four yes answers with evidence, and you are in the 20%. Anything less and you are not.
(Data architecture.) If your AI strategy still rests on standardising on a specific frontier vendor, you are exposed to the wrong variable. The 20% built a portable reasoning layer over governed data. The model is replaceable. The data and the integration are not.
(Solution architecture.) The integration layer is the operating layer between the frontier model and the regulated business. It enforces context, governance, identity, audit. The 20% own it. The labs are now paying nine figures to acquire it. The 80% are about to rent it back from the lab they bought their tokens from, on the lab's terms.
(Security architecture.) The Mythos numbers and the Writer 67% are the floor. If you cannot map your AI estate today (agents, models, tools, permissions) you cannot defend it tomorrow. The 20% have the map. The 80% have a 2023 security policy and a hope.
(Process architecture.) If you put a frontier model into a 2019 workflow and ran the cost lever, you are in the 80%. The redesign is the work. The model is the easy part.
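The estate map named under security architecture above (agents, models, tools, permissions) is, at its simplest, a table you can query. A minimal sketch, assuming a hand-maintained inventory; `AgentRecord`, the agent names and the permission strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical inventory row: one agent, its model, tools and permissions."""

    name: str
    model: str
    tools: Tuple[str, ...]
    permissions: Tuple[str, ...]
    sanctioned: bool


def unsanctioned(estate: List[AgentRecord]) -> List[str]:
    # Shadow agents: in use, but never approved.
    return [a.name for a in estate if not a.sanctioned]


def with_permission(estate: List[AgentRecord], permission: str) -> List[str]:
    # Which agents could touch a given class of data?
    return [a.name for a in estate if permission in a.permissions]


estate = [
    AgentRecord(
        name="claims-agent",
        model="vendor_a",
        tools=("lookup_policy",),
        permissions=("customer_pii:read",),
        sanctioned=True,
    ),
    AgentRecord(
        name="browser-plugin-summariser",
        model="unknown",
        tools=("read_page",),
        permissions=("customer_pii:read",),
        sanctioned=False,
    ),
]

shadow = unsanctioned(estate)
pii_readers = with_permission(estate, "customer_pii:read")
```

The useful output is the intersection: an unsanctioned agent that can read customer data is the kind of exposure the Writer survey figures describe.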
We do this work in production environments under ISO 27001, ISO 42001 and SOC 2. AI in the loop, humans in the loop, both clearly defined, neither dominant. Real audit trails, real governance, real KPI movement that survives an audit committee meeting.
We know what the 20% looks like because that is the work we have been doing inside live businesses for the past eighteen months. The four questions above are not theoretical. They are the four questions we work on every week with clients who are already in production at scale and others who are sequencing themselves into it.
The honest path is an assessment, not another pilot. Two to three weeks of defined, scoped work. A map of where true value lives in your business: the data, the domain, the operational knowledge that is defensible because nobody else has it. A map of the AI estate currently making or shaping decisions, sanctioned and unsanctioned. A map of the supervisory gap. A sequenced recommendation of what to do, in what order, to close it.
The deliverable is something the board can act on this quarter, not a deck. Production-grade. Governance-grade. Sequenced for the FY27 budget review.
The Bank of England, the PRA and the FCA will move on Mythos-class supervisory questions before the end of Q3. The EU AI Act's full GPAI enforcement powers go live on 2 August. ENISA now has access to the same supervisory tooling that the labs and the largest US banks have. The FY27 budget reviews start in October. The window to do this work under your own steam, rather than under regulatory pressure or after a production incident, is the one we are in right now.
The labs are no longer waiting, and neither are the largest enterprises. JPMorgan, Microsoft and Salesforce, among others, have already moved agentic AI from R&D into core operating cost. That reclassification is the public declaration that the work is no longer experimental. It is now governed, funded, audited and reported on like any other operating activity.
Being in the 20% is not a destination. It is a discipline. The work is specific, the sequence matters, and the window to do it under your own steam is open right now.
If you cannot say with evidence that your business is in the 20%, the next conversation is the assessment, not another pilot. Be the One. We are running a small number of these over the coming quarter. If that is the conversation you want to have, come alongside.
Justin Gane · CEO, 1Digit
Founder and CEO of 1Digit. Builds enterprise AI architecture and data platforms for regulated industries across the UK and Europe.