It is Wednesday afternoon. Your GC has two proposals on her desk for the same contract-review project: 50 commercial agreements, five-week turnaround, playbook-driven redlines, partner sign-off on anything material. Both firms are competent. Both have represented you before. Both say they will use AI end to end.

Firm A’s proposal comes in at $312,000. Firm B’s comes in at $127,000.

Your GC has been around long enough to know that a gap that wide is not a seniority-mix difference. She does what most GCs do not do. She asks each firm, in writing, to describe the AI stack they actually use to run the engagement.1

Firm A runs the work inside a legaltech platform that sits on top of a frontier model from one of the three major labs. The platform charges Firm A an annual per-seat license that bundles access to a top-shelf reasoning model (Claude Opus, GPT-5 Pro, the tier where the per-query reasoning cost is meaningful) with retrieval, workflow, and compliance layers. Firm A allocates that license to overhead and stacks associate hours on top.

Firm B runs the work direct on the lab’s API, through tooling the firm built and owns. No platform license. No vendor subscription in the middle. Same reasoning model. Same lab. Direct-API rates, not the 10x premium the lab charges the platform.

The license premium is the visible edge of a structural choice that shows up everywhere else in the invoice. [S: same starting point, different outcomes.]

This is the story of every general counsel in 2026 who has not asked how many middlemen sit between their work and the model that does it. It is happening right now at yours.

The stack between you and the model

Four layers. The frontier lab at the bottom, selling compute. The legaltech platform above it, paying the lab a 10x premium for top-shelf reasoning model access and bundling in retrieval, workflow, compliance, and a seat-license price list. The law firm above the platform, buying the platform subscription, putting it in front of associates, and delivering the engagement. You at the top, paying one invoice for all four layers with no line-item visibility into how the cost was assembled. [M: four layers, four margins, one bill.]

That stack is the default configuration for AI-augmented legal work in 2026. It is also the most expensive configuration available to you, and the one least likely to be disclosed in your engagement letter.

Where the markup hides

Three places, all inside the firm-plus-platform structure. The license premium the platform passes through (the lab’s 10x cost basis to the platform plus whatever the platform layers on top). The firm’s allocation of that license to your matter as overhead. The hours the AI was supposed to save, which in hourly engagements stay billable at the firm’s cost basis and in fee-capped engagements compress in favor of the firm. Most engagement letters do not disclose any of it. [S: nobody’s negotiating that down.]

Many firms now run several stacks at once

The dominant pattern in 2026 is not one firm, one stack. It is one firm running three or four. The senior partner has Harvey for transactional work. The associates have Claude Opus on direct API for drafting. Litigation has Lexis Protégé. Diligence has Spellbook. Each stack arrived through a different champion on a different procurement cycle. Each is competing for the attorney’s attention.

Which stack runs your matter is a function of which workflow the staffing partner picked, on which day, for which client. The economics above apply per-stack, multiplied across however many your firm is running. The invoice you receive is the composite. The label is “we use AI.”

What you are actually choosing between

Four structural choices. Most clients think they have one.

Platform-augmented firm. Default in 2026. Four layers, four margins. License premium plus allocation plus bill rate plus the lab’s 10x to the platform.

API-direct firm. Three layers. The firm absorbs the engineering cost of the build but skips the platform license. Direct-API rates flow through to lower cost of service.

In-house with direct API. Two layers. Viable for repeatable, volume-heavy work (NDA triage, vendor screening, clause extraction). Not viable for judgment-heavy work that requires a licensed and malpractice-insured signature you do not have on staff.

No AI. The 2019 firm. Costs you what 2019 cost you.

The honest question is not “AI or no AI.” It is “how many margins do you want in the stack, and which ones are buying you something you actually need?”

If you’re evaluating a legal AI platform’s pricing against doing the same work in a firm or directly via API, the math is rarely in the platform’s favor.

Talk to a Talairis attorney →

Why platform-augmented firms price where they do

Platform-augmented firms are not over-charging. They are pricing against a cost structure that includes the license fee they pay every month. The partner who signed the platform agreement in 2023 did the deal the market offered at the time.

The platform’s own compute cost basis is higher than what a direct-API user pays, not lower. The lab charges the platform roughly 10x what it charges any small business with a credit card on the public API, on top-shelf reasoning models. The platform pays that premium because it is the connection. The license the platform charges your firm has to cover the 10x cost basis plus everything the platform layers on top. The platform’s own margin is opaque, frequently VC-subsidized, occasionally negative per-customer.

That license contract is auto-renewing at 8 to 15 percent annually while the lab’s public-API price drops 50 to 75 percent annually. The firm is two or three years into a spread that compounds against them. The economics flow through to the engagement letter.

Why API-direct firms can price where they do

An API-direct firm decided the platform layer was priced out of line with the value it delivered. That build has a real cost: engineers, evaluators, prompt architects, compliance review, security posture, ongoing maintenance against a model that updates every quarter. The cost is capex, spread across matters.

At steady state, the firm pays the lab the direct-API rate. Not the 10x the platform pays for the same model. The firm captures efficiencies as they arrive from the lab. The firm’s cost of serving any given engagement is structurally lower.

Whether the client sees that lower cost depends on whether the firm passes it through. Structure enables the savings; structure does not deliver them. [M: structure isn’t pass-through.]

What the platform-augmented stack does buy you

The platform is not selling nothing. A mature legaltech platform provides retrieval over a proprietary closed library, domain-specific fine-tunes, a vendor-carried compliance posture (SOC 2, HIPAA, jurisdictional data residency), a malpractice-tested workflow with audit trails, and a customer-success layer the firm uses when the tooling breaks on deadline.

For volume contract review against your own playbook, most of that does not matter. For securities litigation discovery across 40 million documents, much of it does. The defensible version of the platform markup is 2 to 4x. Real value for real work. The 10x version is rent on a market that has not yet priced the alternatives in.

What to do

The cemented answer: ask your firm, in writing, what AI stack they use on your matter. Specifically: which frontier lab’s model, which legaltech platform (if any), and whether the firm’s workflow runs direct on the lab’s API or through a third-party wrapper. Most firms have never been asked. The answers tell you how many middlemen you are paying.

Beyond that, options.

Treat “we use AI” as the start of a diligence conversation, not the end. Two firms both using AI can be running structurally different economics; the marketing language is identical, the invoices are not. Price the stack, not the firm: in RFPs, ask for a breakdown of which layer of the stack each dollar is buying. Renegotiate engagement letters for matters that went AI-augmented after signing.2

Get counsel before the next engagement

The stack between you and the model is not disclosed in your engagement letter, and your firm has no obligation to disclose it unless you ask. The firms that will tell you, cleanly, with numbers, are the ones that have already done the work to run their own economics. They exist. They are not most of the market.

Have the conversation before the next material matter gets staffed. The stack that runs your work gets chosen at the start of the engagement, and almost nothing changes it once it is running.

A closing thought

The legal market split in 2026. On one side, firms that buy AI from a platform and pass the license premium through to the client on top of bill rates. On the other, firms that run direct on the lab’s API and deliver the same work at a structurally lower cost.

Both sides use the same words on the pitch deck. The invoices are not the same.

In 2026, the legal buyers who win are the ones who count the middlemen.

Footnotes
  1. Most firms have never been asked, in writing, what AI stack they’re running. The answers separate cleanly. Some firms answer in two sentences with the lab name and the workflow. Some firms refer the question to procurement and never come back. The first kind is who you want on the matter. — Sam
  2. Engagement letters in 2026 don’t address the AI stack. The model rules of professional conduct don’t compel disclosure. The malpractice carriers don’t require it. Until one of those changes, the stack remains a procurement question, not a legal-ethics question. [M]