Three architectures, one awkward question
UK procurement teams evaluating “AI DLP” in 2026 usually receive three pitches. The first is a cloud service that sits between your users and OpenAI. The second is an on-premise AI firewall (or AI gateway) that redacts before prompts leave your building. The third is to abandon public models and run local-only LLMs on hardware you own. All three claim to solve confidentiality. They do not solve the same problem, at the same cost, with the same trade-offs.
This guide is comparison-led and UK-specific — written for IT directors, MSPs, compliance officers, and founders who need a board-ready answer without vendor euphemisms. We sell an on-premise gateway, so apply appropriate scepticism; we will flag where our product fits and where it does not.
Quick comparison
| | SaaS AI DLP | On-prem AI firewall / gateway | Local-only models |
|---|---|---|---|
| Where inspection runs | Vendor cloud (often US/EU) | Your office / your VPC | Your office / your VPC |
| Who sees the cleartext prompt | Vendor + model provider | You (appliance); provider sees only the redacted version | You only |
| Frontier models (GPT-4 class) | Yes, via vendor proxy | Yes, after redaction | Smaller models only unless you add huge GPUs |
| Audit log custody | Vendor SaaS retention | Your appliance / your export | Your infrastructure |
| Typical buyer | Global enterprise with legal team for DPAs | UK SMB / regulated professional firm | Air-gapped or extreme sensitivity |
| “AI DLP UK” fit | Strong SEO, variable residency story | Strong if data must stay in-country | Strong for no egress, weak on capability |
Option A: SaaS AI DLP (cloud proxy)
How it works
Staff install an agent, point DNS or a browser extension at a vendor endpoint, or route API keys through a multi-tenant cloud. The vendor applies classifiers — regex, ML, sometimes human review queues — then forwards an allowed prompt to OpenAI, Anthropic, or Google. Responses may be scanned on return.
Strengths
- Fast to pilot. No hardware lead time; often credit-card signup.
- Central policy. One dashboard for all users if you can get them to use the path.
- Mature integrations with CASB/SSE stacks in large enterprises.
Weaknesses for UK SMBs
- Another processor in another jurisdiction. Your prompts, even redacted, still flow through vendor infrastructure (often in the US) that you do not control. UK GDPR and client DPAs ask where processing happens; “we are SOC 2” is not the same as “data stayed in the UK.”
- Cleartext exists in the vendor cloud. Redaction after ingest means the vendor saw the original unless architecture proves otherwise. Read the sub-processor list carefully.
- Staff bypass. If the proxy is slower or blocks legitimate work, users open chat.openai.com directly — you are back to shadow IT without LAN-side logs.
- Per-seat SaaS creep. Recurring fees compound: for firms under 200 seats, the three-year total can rival the cost of buying an appliance outright.
When SaaS DLP wins
You are a multinational with a privacy office that negotiates DPAs for a living, you have already standardised on an SSE vendor, and UK residency is one region among many rather than the default constraint.
Option B: On-premise AI firewall / gateway
How it works
A small appliance (or hardened VM) on your LAN terminates HTTPS from a browser extension or local chat UI. Detection runs locally: UK regex packs, NER, your client list. The user approves the sanitised prompt. Only then does it leave for the public model — placeholders in flight, restoration on return. Audit logs stay on the box unless you export them.
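The detect, substitute, and restore loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the appliance's actual implementation: the `PATTERNS` table, the `redact` and `restore` helpers, and the `[LABEL_n]` placeholder format are all hypothetical, and the regexes are deliberately simplified shapes for a National Insurance number, a UK postcode, and an email address. NER and the user-approval step are left out.

```python
import re

# Hypothetical, simplified detection rules for illustration only.
# A real rule pack would be broader and tested against real documents.
PATTERNS = {
    "NINO": re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b"),
    "POSTCODE": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact(prompt, client_names=()):
    """Swap detected values for placeholders.

    Returns the sanitised text plus the placeholder-to-original mapping.
    The mapping is what stays on the local box.
    """
    mapping = {}
    counters = {}

    def placeholder_for(label, value):
        # Reuse the same placeholder when a value appears more than once.
        for token, original in mapping.items():
            if original == value:
                return token
        counters[label] = counters.get(label, 0) + 1
        token = f"[{label}_{counters[label]}]"
        mapping[token] = value
        return token

    text = prompt
    # Custom rule: the firm's own client list, matched case-insensitively.
    for name in client_names:
        text = re.sub(
            re.escape(name),
            lambda m: placeholder_for("CLIENT", m.group(0)),
            text,
            flags=re.IGNORECASE,
        )
    # Generic rules: the regex pack.
    for label, pattern in PATTERNS.items():
        text = pattern.sub(
            lambda m, label=label: placeholder_for(label, m.group(0)), text
        )
    return text, mapping


def restore(response, mapping):
    """Put the original values back into the model's response."""
    for token, original in mapping.items():
        response = response.replace(token, original)
    return response


prompt = (
    "Draft a letter to Acme Holdings at SW1A 1AA about NI number "
    "QQ123456C; reply to jo@example.co.uk."
)
sanitised, mapping = redact(prompt, client_names=["Acme Holdings"])
```

In this sketch only `sanitised` would be shown to the user for approval and then sent to the public model; `mapping` never leaves the process, and `restore` is applied to whatever comes back. Anything the patterns miss goes out in cleartext, which is why the approval step and custom rules matter.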
Strengths
- Inspection in-country. The only place original prompt and original response meet is hardware you own — aligned with how many UK professional firms describe their obligation to clients.
- Frontier capability. Staff keep ChatGPT-class quality for the tasks local 7B models still bungle.
- Logs you can hand to an FCA, SRA, or ICO reviewer as CSV from your environment, not a vendor ticket.
- Works with your own API keys — you are not locked into a reseller margin on tokens.
Weaknesses
- Capital and rack space. You buy or rent hardware; someone must patch it.
- Redaction is probabilistic. User-in-the-loop and custom rules are mandatory, not optional.
- Redacted prompts still go to US model providers — you have minimised content, not eliminated cross-border transfer. Document that in your DPIA.
When an AI gateway wins
You need staff on public frontier models this quarter, your clients care where inspection happens, and you want one product in one country rather than another US SaaS dependency. Search terms like AI DLP UK often land here once buyers realise cloud DLP still exports prompts.
Option C: Local-only models
How it works
You run Llama, Qwen, Mistral, or similar on GPUs in your office or a UK cloud tenant. No call to OpenAI. Some firms air-gap entirely; others allow outbound only for updates.
Strengths
- Strongest story for “nothing left the building.”
- No per-token bill to San Francisco at scale — after hardware is sunk.
- Regulators and insurers understand “on prem” faster than “redacted egress.”
Weaknesses
- Capability gap. Local 7B–13B models are fine for drafts and extraction; they are not interchangeable with frontier models on complex reasoning, long context, or nuanced professional writing.
- Ops burden. GPU drivers, model updates, quantisation choices — your MSP becomes an ML shop.
- Staff still want the public tab unless you give them a governed path to frontier quality — which brings you back to Option A or B.
When local-only wins
Air-gapped sites, classified-adjacent work, or policies that forbid US inference entirely — and leadership accepts lower quality or smaller tasks only.
Decision matrix for UK buyers
Use this as a workshop handout: score each question 1–5 for how much it matters to your firm, then see which column's answers fit.
| Question | SaaS DLP | AI gateway | Local only |
|---|---|---|---|
| Must inspection happen in the UK? | Often no | Yes | Yes |
| Need GPT-4 / Claude Sonnet class weekly? | Yes | Yes | Rarely |
| Does a US-based vendor see the cleartext prompt? | Usually yes | No (you see it) | N/A |
| Can you ship hardware to office? | N/A | Yes | Yes |
| IT team size < 5 FTE? | Maybe | Yes | Hard |
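One way to turn the handout into numbers is a weighted sum. The sketch below is an assumption-heavy illustration, not a scoring method the matrix prescribes: the `FIT` values are one coarse reading of the table's answers (1.0 for a good fit, 0.0 for a poor one, 0.5 for “Maybe”), the question keys and the `rank_options` helper are hypothetical, and the weights are the 1–5 importance scores your own workshop produces.

```python
# Illustrative fit values: one reading of the matrix, to be adjusted
# in your own workshop. 0.0 = poor fit, 1.0 = good fit.
FIT = {
    "uk_inspection":   {"saas": 0.0, "gateway": 1.0, "local": 1.0},
    "frontier_weekly": {"saas": 1.0, "gateway": 1.0, "local": 0.0},
    "small_it_team":   {"saas": 0.5, "gateway": 1.0, "local": 0.0},
    "avoid_hardware":  {"saas": 1.0, "gateway": 0.0, "local": 0.0},
}


def rank_options(weights, fit=FIT):
    """Return (option, weighted score) pairs, best first.

    weights maps each question key to its importance for the firm (1 to 5).
    """
    totals = {}
    for question, weight in weights.items():
        if not 1 <= weight <= 5:
            raise ValueError(f"weight for {question!r} must be between 1 and 5")
        for option, value in fit[question].items():
            totals[option] = totals.get(option, 0.0) + weight * value
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Example: a firm that cares most about in-country inspection.
residency_first = rank_options(
    {"uk_inspection": 5, "frontier_weekly": 4, "small_it_team": 3, "avoid_hardware": 2}
)

# Example: a firm that cannot take hardware and cares little about residency.
no_hardware = rank_options(
    {"uk_inspection": 1, "frontier_weekly": 5, "small_it_team": 1, "avoid_hardware": 5}
)
```

The two examples rank differently, which is the point of the exercise: the weights, not the vendor, should decide the outcome.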
Regulatory lens: DMCC, DPA, and the EU AI Act
None of these products “make you compliant” in a box. They change evidence.
UK GDPR / Data Protection Act 2018: you still need a lawful basis, data minimisation, and transfer tools if US inference remains. An on-prem gateway supports minimisation; it does not replace a Transfer Risk Assessment.
DMCC (Digital Markets, Competition and Consumers Act 2024) and consumer-facing fairness: mostly B2C, but professional firms feel downstream pressure on how client data is handled in automation.
EU AI Act: relevant if you serve EU clients or have EU establishments; logging and human oversight expectations align with gateways that produce audit trails, regardless of hosting model.
Post-launch, we publish shorter notes when guidance shifts — the architecture choice should survive a regulatory headline.
Cost and TCO — a rough UK framing
Exact numbers depend on seat count and token volume, but directionally:
- SaaS DLP — recurring per user per month, plus implementation services. Three-year cost often exceeds a mid-market appliance once you pass fifty seats, and you still pay model tokens separately unless the vendor resells at a markup.
- AI gateway appliance — capital or rental hardware, annual support, your own API keys. Predictable for finance; no surprise true-up when usage spikes during month-end.
- Local-only cluster — GPU rental or purchase dominates. Cheapest inference per token after sunk cost; most expensive if utilisation is low because only legal uses it twice a week.
For a fifty-person firm already spending on M365 and a decent firewall, the gateway line item is usually compared to one SaaS DLP SKU — not to building a private OpenAI. Run the comparison on three-year paper with realistic bypass risk, not on list price alone.
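The three-year comparison can be put on paper with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical and every figure in the example is a placeholder, not a quote or a market price. Model token spend is left out on the assumption that it is paid separately in both cases, and bypass risk is not modelled.

```python
import math

MONTHS = 36  # three-year comparison window


def saas_three_year_cost(seats, per_seat_per_month, implementation_fee=0.0):
    """Recurring per-seat fee over 36 months plus one-off implementation."""
    return seats * per_seat_per_month * MONTHS + implementation_fee


def appliance_three_year_cost(hardware_cost, annual_support):
    """Hardware bought up front plus three years of support."""
    return hardware_cost + annual_support * 3


def break_even_seats(per_seat_per_month, implementation_fee,
                     hardware_cost, annual_support):
    """Smallest seat count at which SaaS costs more than the appliance."""
    if per_seat_per_month <= 0:
        raise ValueError("per_seat_per_month must be positive")
    appliance = appliance_three_year_cost(hardware_cost, annual_support)
    gap = appliance - implementation_fee
    return max(1, math.floor(gap / (per_seat_per_month * MONTHS)) + 1)


# Placeholder inputs in GBP. Replace them with your own quotes.
seats_needed = break_even_seats(
    per_seat_per_month=10,
    implementation_fee=2_000,
    hardware_cost=12_000,
    annual_support=2_000,
)
```

With these placeholder inputs the appliance totals 18,000 over three years and SaaS overtakes it at 45 seats. The crossover moves a long way with small changes in the per-seat price, so run it with real quotes before drawing a conclusion.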
Hybrid architectures that work in production
Most UK firms we speak to land on a hybrid, not a purity contest:
- Gateway + public models for daily professional work with redaction.
- Local model on the same appliance for summaries that must never leave — HR tickets, privileged notes, early drafts.
- SaaS DLP only if they are a division of a US parent that already standardised on one cloud inspector.
The mistake is buying two overlapping inspectors and wondering why staff hate both.
Procurement questions to ask every vendor
- Where is the cleartext prompt processed? List regions and sub-processors.
- Can we export hash-chained audit logs without a support ticket?
- What happens when staff bypass your extension — do you detect it?
- Do you resell API access or can we use our own OpenAI keys?
- How do you handle false negatives: user approval, custom rules, an SLA?
- For UK customers, who answers the phone when the ICO or FCA asks for evidence?
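For readers unfamiliar with the term in the audit-log question, “hash-chained” means each log entry stores a hash of the entry before it, so an edit or deletion anywhere in the history breaks every later link. The sketch below is a generic illustration using SHA-256 from Python's standard library, not any vendor's log format; the entry fields and the `append_entry` and `verify_chain` helpers are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"index": len(log), "event": event, "prev_hash": prev_hash}
    log.append({**body, "hash": _digest(body)})


def verify_chain(log):
    """Recompute every hash and link; False if anything was altered."""
    prev_hash = GENESIS
    for position, entry in enumerate(log):
        body = {
            "index": entry["index"],
            "event": entry["event"],
            "prev_hash": entry["prev_hash"],
        }
        if (
            entry["index"] != position
            or entry["prev_hash"] != prev_hash
            or entry["hash"] != _digest(body)
        ):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"user": "alice", "action": "prompt_approved"})
append_entry(audit_log, {"user": "bob", "action": "prompt_blocked"})
append_entry(audit_log, {"user": "alice", "action": "log_exported"})
```

A reviewer holding an exported copy can run the verification themselves. What the chain cannot show on its own is that the whole log was not regenerated from scratch, so it is worth asking vendors how the latest hash is anchored or witnessed outside the box.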
Bottom line
SaaS AI DLP is the fastest slide into a US-controlled trust zone — fine at scale, awkward for UK SMBs selling trust on data residency. Local-only models are the strongest air-gap story with the weakest general capability. An on-prem AI firewall / gateway is the middle path: frontier models, domestic inspection, logs you own — with honest disclosure that redacted traffic may still hit US inference.
If you are searching for “AI DLP UK,” compare on where cleartext lives, not on how many logos fit on a slide. Then book a demo that shows redaction on a real client name from your sector, not a generic “hello world” prompt.