$5.5 billion in 30 days. That is what OpenAI and Anthropic raised in early May 2026 to start consulting companies, and if your firm deploys their models into client work, those companies compete directly with you. OpenAI pulled in more than $4 billion from 19 investors (TPG, Brookfield, Advent, Bain, SoftBank) to launch a majority-owned venture called The Deployment Company, and guaranteed those backers a 17.5% annual return over five years. Anthropic broke the same week with roughly $1.5 billion from Blackstone, Goldman Sachs, and Hellman & Friedman. The pitch for both: embed engineers inside client organizations and redesign their workflows around agents. That is not a model roadmap. That is the Deloitte playbook, funded by the firm you are paying to run your inference.
The exposure is not theoretical, and it is not slow-moving. The receipts are public, the capital is committed, and the mechanism by which a services firm funds its own replacement is already running every time it calls an API.
The investor list is the tell
Read the cap tables and the strategy writes itself. TPG and Brookfield do not write venture checks hoping a model gets better; they own distribution and route it. Blackstone and Hellman & Friedman do not fund chatbots; they sit on portfolios of mid-market companies that need exactly the kind of workflow redesign these ventures promise to sell. Anthropic's venture is explicitly aimed at the private-equity firms' own portfolio companies, a captive book of business piped straight into the lab's services arm.
OpenAI made the intent unambiguous by buying the capability rather than building it slowly: the venture is acquiring Tomoro, a London consultancy with a client roster that already includes Mattel, Red Bull, and Virgin Atlantic. The labs are not extending their models toward services. They are acquiring services firms and pointing private-equity distribution at them. This is implementation capacity chasing a market, not capital chasing better inference.
The labs have to move down-market because the model isn't the moat
If model quality were the durable advantage, the labs would not need a consulting arm at all. But the enterprise market already proved that quality alone does not hold share. OpenAI's enterprise LLM API share fell from 50% in late 2023 to 25% by mid-2025, according to Menlo Ventures' market update; Anthropic now leads at 32%, with Google's Gemini climbing to 20%. Brand and first-mover status evaporated the moment real production workloads started shopping on performance and price.
When the model commoditizes, margin has to come from somewhere else, and the somewhere else is the messy, high-touch work of making AI actually function inside a real organization. That work is genuinely hard: 88% of organizations reported confirmed or suspected AI agent security incidents in the past year, per Gravitee's security survey. Deployments stall, agents exceed their permissions, integrations break. Someone has to go inside the client and fix it. That someone has always been the consultancy. The labs have now decided it should be them, which is why they raised $5.5 billion to send engineers through the same door your firm walks through.
Token routing solves your bill, not your problem
The reflexive defense, once a firm sees this, is to put a control plane in front of the models: route prompts across vendors, diversify away from any single lab, negotiate on price. This is worth doing. It is also nowhere near enough.
A token-routing layer solves billing and redundancy. It does not solve the thing that actually kills a services firm, which is that every prompt sent to a lab carries the firm's methodology with it. The structure of the prompt, the sequence of steps, the way a problem gets decomposed, the domain judgment encoded in how the question is framed: all of it passes through the model. The lab does not need to steal your playbook. It watches you run it, thousands of times, across every client engagement, and the patterns are right there in the traffic.
This is the trap the large consultancies have walked into most directly: the firms piping client work straight through a model whose vendor is funding a competitor around it. A control plane that routes tokens but still ships the full workflow to the lab is charging you for the privilege of training your replacement. You diversified your billing and left your methodology fully exposed.
What a lab can sell, and what it can't inherit
The way through is to be precise about what is actually leaving the firm and what isn't. A lab can sell tokens. A lab can sell a model that scores well on a benchmark. What a lab cannot inherit is the layer that sits on top of whichever model wins this quarter:
- Workflow telemetry: the record of how work actually got done, where it broke, what the firm learned and fixed across hundreds of engagements.
- The audit trail: the reproducible account of what was decided, by whom, on what evidence, that a regulated client can stand behind.
- Accountability: when an engagement goes wrong, the client sues the firm, not the model provider. The blame, and therefore the trust, lives with the firm.
- Institutional relationship: the years of context about a specific client's business, politics, and constraints that no model has access to.
A lab can process your input. It cannot absorb your client's blame, your evidence chain, or the trust you have spent a decade building. That stack of telemetry, audit, accountability, and relationship is the moat. Keep it and the firm survives the labs moving down-market. Hand it over, prompt by prompt, and it does not.
The endgame is operator, not casualty
The conclusion is not that services firms are dead. The doom reading is tempting; the structural read is more useful. Consultancies do not get replaced by labs. The ones that survive turn into AI-native operators, where the model is an interchangeable input and the advantage comes from distribution, proprietary data, and execution at scale.
What separates the two outcomes is control. The firm that decides where the work goes, which model runs which engagement, what gets logged, and how the workflow improves over time still owns the customer. The firm that hands those four decisions to the lab is no longer a competitor to the lab. It is doing unpaid change management for it. Same models, same clients, opposite fate. The difference is who holds the layer above the model.
Holding that layer requires it to live somewhere the firm controls, separate from the runtime that executes it. In practice that means the durable part of the work (instructions, methodology, memory, the audit trail, the agent's identity) gets published to a vendor-neutral substrate, while the model is mounted at runtime from whichever lab fits the engagement. The pattern has a name: mounted agents, the architecture Tokenrip is built around. When the methodology lives on the substrate rather than inside the prompt stream, swapping OpenAI for Anthropic for a local model becomes a configuration change, and the firm's playbook stops leaking with every call. The model becomes the commodity it already is; the moat stays where it belongs.
The vendor-as-competitor audit
Before approving any AI vendor for client-facing work, a managing partner can run five questions. They are ordered from easiest to verify to hardest to fix, and any single failure means the relationship is not what it appears to be.
- Competitor risk. Does this vendor compete with us today, or has it announced intent to enter our market? For OpenAI and Anthropic in May 2026, the answer is now publicly yes.
- Pattern leakage. When we run a client workflow through this vendor, does our methodology leave the firm, and can we say exactly how much of it does?
- Telemetry ownership. Who owns the workflow logs, the audit trail, and the accountability record: us, or the vendor?
- Model portability. Can we swap the underlying model without rebuilding the workflow, or are we locked to one lab's surface?
- Client blame. If this engagement fails, can the vendor be held accountable, or is it only ever us? If only us, we are carrying all the risk and the vendor is harvesting all the learning.
A vendor that fails these is not a vendor. It is a competitor the firm is funding.
The audit maps directly onto the argument. Question 1 is the $5.5 billion in raises made concrete. Questions 2 and 3 are the methodology-leakage problem that token routing leaves untouched. Questions 4 and 5 are the line between becoming an AI-native operator and becoming someone else's change-management department. The audit will not, by itself, save a firm. It surfaces exposure without closing it, and closing it means actually moving the methodology off the prompt stream and onto infrastructure the firm controls. But it does something a quarterly vendor review usually fails to do: it forces the partnership to be named for what the cap table already shows it to be. You can keep using the models. You cannot keep pretending the firm selling them to you is only a supplier.