FAQ — Trinito AI Gateway

Q: What is the Trinito AI Gateway?

An on-premise appliance that redacts UK personal identifiers and contextual business references before prompts reach public LLMs.

Q: Why an appliance rather than software or SaaS?

Redaction and audit must run in your building; SaaS cannot make the same claims.

Q: How does this compare to Microsoft Purview?

Purview needs E5 or add-ons and endpoint agents; Trinito uses the on-appliance chat with redaction before egress and on-appliance audit logs.

Product

What the appliance is, what it connects to, and what happens if we're not here in five years.

What is the Trinito AI Gateway?

A small fanless appliance on your office network that sits between your team and the public LLMs they already use. Every prompt is inspected before it leaves the building. The appliance redacts UK personal identifiers (National Insurance numbers, postcodes, NHS numbers, etc.) and contextual business references (claim numbers, case refs, NHS client IDs) that could identify a client or matter. The model still gets a useful prompt; your confidential fragments do not cross the wire in cleartext. On the way back, placeholders are restored so the answer reads naturally.

Staff use the Trinito chat on the appliance — the governed surface where every prompt is inspected before it leaves the building. Configure your OpenAI, Anthropic, or Google keys (or use Trinito Cloud / local models); redacted prompts go to the provider you chose, responses come back rehydrated on the box. You get an audit trail on hardware you own. It is an AI firewall, not a replacement for ChatGPT. See what the Gateway does and how the pipeline works.
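The redact-then-restore round trip described above can be illustrated with a minimal sketch. It is not the appliance's implementation: the single National Insurance pattern, the `[NI_n]` placeholder format, and the function names `redact` and `rehydrate` are all assumptions made for the example.

```python
import re
from typing import Dict, Tuple

# Hypothetical, simplified pattern covering National Insurance numbers only.
NI_PATTERN = re.compile(r"\b[A-Z]{2}\d{6}[A-D]\b")


def redact(prompt: str) -> Tuple[str, Dict[str, str]]:
    """Swap each identifier for a placeholder and remember the mapping."""
    mapping: Dict[str, str] = {}   # placeholder -> original value
    seen: Dict[str, str] = {}      # original value -> placeholder

    def _swap(match: "re.Match[str]") -> str:
        value = match.group(0)
        if value not in seen:
            # Reuse one placeholder per distinct value so the model can
            # still tell that two mentions refer to the same thing.
            seen[value] = f"[NI_{len(seen) + 1}]"
            mapping[seen[value]] = value
        return seen[value]

    return NI_PATTERN.sub(_swap, prompt), mapping


def rehydrate(response: str, mapping: Dict[str, str]) -> str:
    """Restore the original values in the model's response."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response
```

In this sketch the mapping never leaves the caller: only the sanitised string would be sent to a provider, and `rehydrate` runs locally on the response.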

Why an appliance rather than software or SaaS?

Architecturally, the on-premise box is what makes the rest of the product's claims literally true. "Your prompts never leave the office unredacted" only works if the sanitiser runs in the office: move it to our cloud and your data passes through our cloud to be sanitised. The audit log is hash-chained on your hardware so it survives us going out of business or being subpoenaed. Staff use the Trinito chat as their AI workspace: one place for sanitisation, routing, and audit. None of these work as SaaS.

Practically, the appliance is the shape that fits the SMB sale: a known, tested platform with NPU-accelerated inference from £2,199 — a purchase order and a 30-minute demo, not a six-figure annual contract. There is no version of this product where your confidential data leaves your office in the clear. That is the point. Read why an appliance on the product page.

Will it slow my team down?

In practice, no — not in the way that makes people bypass it. Redaction and approval add under 300ms on a typical Standard deployment; most of the wait is still the LLM you would have called anyway. Responses stream as they arrive in the chat UI.

What slows teams down is a separate "compliance portal" they refuse to open. We optimised for the path they already use. After the first week, most users stop noticing the extra step — unless the system catches something they would have regretted sending. If latency matters for a specific site, we benchmark on your network during the demo.

Which LLMs does it work with?

Out of the box: ChatGPT (OpenAI), Claude (Anthropic), and Gemini (Google), via your keys or our managed allowance. On the same appliance you can run local models — Qwen 2.5, Llama 3, Mistral — for work that must not leave the LAN at all.

Admins add models from a curated catalogue or bring their own API keys. Group-based rules control who may use which model. You are not locked into a reseller markup on tokens unless you choose our bundled allowance. Full list and routing behaviour are on the AI Gateway page.

What happens if you go out of business?

The appliance keeps working. The sanitiser, chat UI, audit log, and local models run with no Trinito infrastructure dependency. You will not receive new software updates or Project Packs until you arrange support elsewhere.

If you use Trinito Cloud and we are gone, switch to BYO keys (OpenAI, Anthropic, your own accounts) — the model picker already supports this. Escrowed source-available builds exist for active customers. More in our data path section.

Can I use my own OpenAI key?

Yes — first-class supported. Open the models admin page, paste your provider key, choose which users can use that model. Your key is stored encrypted on the appliance (libsodium secret-box) and used directly from the box; Trinito's servers never see it.

Redaction still runs before egress. Mix BYO with Trinito Cloud on Compact and Standard, or run local-only on Sovereign. See pricing for allowance and overage rates.

Document library

Org knowledge, chat attachments, classification, and what stays on the appliance.

What happens when I upload a document — does it leave the appliance?

No. The original files stay on the appliance. Text extraction runs locally: Apache Tika for documents and spreadsheets, Tesseract (via Tika) for images and scanned PDF pages. Sanitiser detection runs locally. Embeddings are generated locally. The only content that ever leaves the appliance is the sanitised version of a prompt, together with any retrieved document chunks, and only when a user sends a chat to a cloud LLM.

Classification-aware blocking of restricted documents from cloud-routed chats is not part of the product today. Outbound sends pass through the sanitiser and Pre-Send Preview; documents are labelled, cited, and logged. See document library and how uploads work.

What if a document is supposed to contain real names? My brand guide names the founder.

The chip-review modal at upload time shows you every name the sanitiser detected. You can click any chip to release it: the name stays in the document and is available to the chat as intentional context. Every release is recorded in the appliance audit log against your admin account, so you can show the trail to your DPO or to a regulator if needed.

How does the appliance know if a document is confidential?

Two ways. First, if your document already carries a classification marking — CONFIDENTIAL in the footer, INTERNAL USE ONLY on the cover, UK Government protective markings, NATO markings — the appliance detects the marking and proposes the matching sensitivity level automatically. Second, you can set or override the sensitivity in the admin pages. The four levels (Public, Internal, Confidential, Restricted) are labelled everywhere the document appears.
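A minimal sketch of the two routes described above: marking detection proposes a level, and an admin setting overrides it. The two marking rules and the choice to propose the most restrictive match are assumptions for illustration. The appliance's real rules, including its handling of UK Government and NATO markings, are not shown here.

```python
import re
from typing import Optional

# The four levels named above, ordered from least to most restrictive.
LEVELS = ["Public", "Internal", "Confidential", "Restricted"]

# Hypothetical rules: marking text -> proposed level.
MARKING_RULES = [
    (re.compile(r"\bINTERNAL USE ONLY\b"), "Internal"),
    (re.compile(r"\bCONFIDENTIAL\b"), "Confidential"),
]


def propose_level(text: str) -> Optional[str]:
    """Return the most restrictive level whose marking appears, if any."""
    hits = [level for pattern, level in MARKING_RULES if pattern.search(text)]
    return max(hits, key=LEVELS.index) if hits else None


def effective_level(text: str, admin_override: Optional[str] = None) -> Optional[str]:
    """An admin setting wins; otherwise fall back to the detected proposal."""
    if admin_override is not None:
        if admin_override not in LEVELS:
            raise ValueError(f"unknown level: {admin_override}")
        return admin_override
    return propose_level(text)
```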

Can users see documents from other users' personal libraries?

No. The scope filter is enforced at the database query level. The retrieval query physically cannot return rows from another user's personal library — the SQL restricts results to the organisation's shared documents plus the requesting user's own documents plus the current conversation's attachments. Cross-user retrieval is architecturally impossible, not a policy choice.
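The scoping rule above can be sketched as a single query whose WHERE clause admits only the three permitted sources. The table layout, column names, and `scope` values below are hypothetical; the point is only that the filter lives in the SQL itself rather than in application code that runs afterwards.

```python
import sqlite3

# Hypothetical schema: one row per document, tagged with its scope.
SCOPED_QUERY = """
SELECT title FROM documents
WHERE scope = 'org'
   OR (scope = 'personal'   AND owner_id = :user_id)
   OR (scope = 'attachment' AND conversation_id = :conversation_id)
ORDER BY id
"""


def visible_titles(conn: sqlite3.Connection, user_id: int, conversation_id: int) -> list:
    """Return only the titles the requesting user is allowed to retrieve."""
    params = {"user_id": user_id, "conversation_id": conversation_id}
    return [row[0] for row in conn.execute(SCOPED_QUERY, params)]


def demo_db() -> sqlite3.Connection:
    """Build a small in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, "
        "scope TEXT, owner_id INTEGER, conversation_id INTEGER)"
    )
    conn.executemany(
        "INSERT INTO documents (title, scope, owner_id, conversation_id) "
        "VALUES (?, ?, ?, ?)",
        [
            ("Staff handbook", "org", None, None),
            ("Alice notes", "personal", 1, None),
            ("Bob notes", "personal", 2, None),
            ("Attached brief", "attachment", 1, 10),
            ("Other attachment", "attachment", 2, 11),
        ],
    )
    return conn
```

With this data, user 1 in conversation 10 gets the handbook, their own notes, and the attached brief. Bob's notes and the other conversation's attachment never match the WHERE clause.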

Security

How redaction works, what we cannot promise, and where the logs live.

How does the redaction work?

Layered — by design, not one magic model. A regex pass catches UK personal identifiers: postcodes, National Insurance and NHS numbers, VAT numbers, IBANs, Luhn-validated payment cards, emails. A local NER model catches person names, organisations, places, and money references. Pattern packs flag contextual business references — claim numbers, case refs, NHS client IDs, and similar — that could identify a client or matter.
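As an illustration of the regex layer, the sketch below pairs a simplified postcode pattern with a Luhn check on card-like digit runs. Both patterns are deliberately loose assumptions written for this example, not the appliance's rule set, and `find_identifiers` is a hypothetical name.

```python
import re
from typing import List, Tuple

# Simplified, illustrative patterns (assumptions, not production rules).
POSTCODE = re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}\b")
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_identifiers(text: str) -> List[Tuple[str, str]]:
    """Return (label, matched text) pairs for postcodes and valid card numbers."""
    findings = [("POSTCODE", m.group(0)) for m in POSTCODE.finditer(text)]
    for m in CARD_CANDIDATE.finditer(text):
        # The checksum filters out digit runs that merely look like cards.
        if luhn_valid(m.group(0)):
            findings.append(("CARD", m.group(0)))
    return findings
```

The Luhn step is what "Luhn-validated" buys you: a 16-digit order reference that fails the checksum is not flagged as a payment card.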

Findings merge into a single Pre-Send Preview. The user approves the sanitised prompt before anything leaves. We explain the pipeline openly on How it works — because "trust us, it's AI" is not a control.
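One way findings from several layers could merge into a single preview is to union any overlapping spans, so that no part of a detected region is left unredacted. The sketch below assumes that policy. The `Finding` structure, the placeholder format, and keeping the label of the earliest, longest finding are hypothetical choices for the example.

```python
from typing import List, NamedTuple


class Finding(NamedTuple):
    start: int   # index of the first character
    end: int     # index one past the last character
    label: str   # e.g. "EMAIL", "PERSON", "CLAIM"
    source: str  # which layer produced it: "regex", "ner", or "pack"


def merge_findings(findings: List[Finding]) -> List[Finding]:
    """Union overlapping spans; the earliest, longest finding keeps its label."""
    merged: List[Finding] = []
    for f in sorted(findings, key=lambda f: (f.start, -f.end)):
        if merged and f.start < merged[-1].end:
            # Overlap: extend the kept span rather than dropping coverage.
            if f.end > merged[-1].end:
                merged[-1] = merged[-1]._replace(end=f.end)
            continue
        merged.append(f)
    return merged


def preview(text: str, findings: List[Finding]) -> str:
    """Render the sanitised prompt a user would review before sending."""
    parts, cursor = [], 0
    for number, f in enumerate(merge_findings(findings), start=1):
        parts.append(text[cursor:f.start])
        parts.append(f"[{f.label}_{number}]")
        cursor = f.end
    parts.append(text[cursor:])
    return "".join(parts)
```

Taking the union errs on the side of over-redaction, which the user can then correct in the preview.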

Can I trust it?

More than the alternative — which is trusting every employee to remember the policy at 6pm on a Friday. We are honest about false negatives: no classifier is perfect. The safety net is user review in Pre-Send Preview before send.

What you can trust is the process: every session logged, redaction decisions visible, export for your auditor. Compare that to shadow ChatGPT on a personal phone where you have zero telemetry. Our security page and printable procurement PDF spell out architecture without marketing adjectives.

What happens if a redaction misses something?

The user sees the sanitised prompt before it sends. If they spot a miss — a client name the NER did not know, a codename the pattern pack missed — they edit in place before approving send.

You run a short retrospective with us after go-live: review the first week's audit log and tune where your sector demands it. A miss is a process improvement, not a silent breach — unless someone deliberately bypasses the Gateway entirely. Read what UK firms actually paste into ChatGPT.

Where are the audit logs stored?

On the appliance — append-only, hash-chained SQLite, encrypted at rest. They do not sync to our cloud by default. The log stores cryptographic hashes of prompt and response content — not the content itself — plus metadata (timestamp, user, provider, action). Export on demand for regulators, insurers, or your own SIEM.

That custody model is deliberate: when the ICO or your professional body asks who sent what to which model, you hand them your export — not a ticket to a US vendor's retention policy. Air-gapped Sovereign deployments keep the same store; updates arrive by USB, not tunnel. Technical detail on the audit log section.
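The hash-chaining idea can be sketched in a few lines. This example keeps entries in an in-memory list rather than SQLite. The use of SHA-256, the field names, and the all-zero genesis value are assumptions, not a description of the appliance's storage format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed starting value for the first entry's prev_hash


def _digest(body: Dict[str, str], prev_hash: str) -> str:
    """Hash the entry body together with the previous entry's hash."""
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: List[dict], timestamp: str, user: str,
                 provider: str, action: str, prompt: str) -> dict:
    """Append metadata plus a hash of the prompt, never the prompt itself."""
    body = {
        "timestamp": timestamp,
        "user": user,
        "provider": provider,
        "action": action,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    prev_hash = log[-1]["entry_hash"] if log else GENESIS
    entry = {**body, "prev_hash": prev_hash, "entry_hash": _digest(body, prev_hash)}
    log.append(entry)
    return entry


def verify_chain(log: List[dict]) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: v for k, v in entry.items()
                if k not in ("prev_hash", "entry_hash")}
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(body, prev_hash):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

Because each entry's hash covers the previous one, an auditor who is handed the export can rerun `verify_chain` and detect after-the-fact edits without needing the prompt text.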

Is this for FCA / SRA / NHS?

We built Sovereign for regulated professional firms — FCA-supervised advisers, SRA-regulated practices, healthcare admin working to DSPT — who need air-gap options and an assurance pack an auditor can follow. We do not sell "compliance in a box"; we sell inspection, logging, and UK support when your reviewer has questions.

Whether it satisfies your specific obligation still depends on your DPIA and client contracts. We will walk your compliance lead through the first review at no extra cost on Sovereign. Start with the security overview and talk to us if your regulator's name is in the subject line.

What happens to files I upload — do they leave the appliance?

Files you drag into chat are uploaded to the appliance and stored locally. Tika extracts text on the box, then classification and sanitiser review run locally before the sanitised text is embedded in the prompt sent to the LLM. The original file is never sent to any LLM provider.

For organisation-wide knowledge (handbooks, policies), admins upload via the Documents page: same extraction stack, indexed for retrieval across the org. Images with no extractable text fail processing with an "extraction failed" status rather than being sent to a model. More in Document library and AI Gateway.

Deployment

What install day looks like and what you do not have to change.

What does installation involve?

Rack or shelf the appliance, connect power and Ethernet, reach the local web UI from a management workstation. Create user accounts (email and password on the appliance; optional TOTP for admins). Pick default models and run a test prompt typical of your sector.

We run a 60-minute onboarding with your IT team: accounts, model routing, and a walkthrough of Pre-Send Preview. Most firms are live the same day. No cloud tenant to provision on our side. Step-by-step context on How it works and deployment notes on the Gateway page.

Do I need to change my network?

No redesign. The appliance makes outbound HTTPS to the LLM providers you allow — same as a desktop today, but via inspection. Inbound, it serves the local chat UI on the LAN. No inbound port forwards, no public IP, no hairpin through a US proxy you do not control.

Remote management uses a signed WireGuard tunnel — opt-in, off by default on sensitive sites. If you segment VLANs, we document which paths need to reach the box; most SMBs run it on the office LAN staff already use for work.

How do users sign in?

Each user has an account on the appliance — email and password, with optional TOTP for administrators. Admins create and manage users from the admin console; group-based rules control who can use which models.

Enterprise SSO (SAML, OIDC, LDAP) is not part of the product today. If you need directory integration, talk to us — it may be available as a bespoke engagement.

Does it work with Active Directory / Entra ID?

Not as a built-in sign-in integration today. Users authenticate to the appliance directly. Your existing Entra conditional access policies still apply to the workstations people use to reach the chat UI on the LAN.

Can it be air-gapped?

Yes — on Sovereign. Local models ship on the appliance; signed updates apply by USB; the remote-management tunnel stays disabled. Staff still get capable AI for drafts and summaries; nothing leaves the building unless you explicitly allow a redacted egress path later.

Air-gap is not free convenience: you trade frontier-model flexibility for custody. The sanitiser still has a role even when the LLM is local — see why you still need the sanitiser with a local LLM and the Sovereign write-up.

How are updates delivered?

Default: signed updates through an optional WireGuard tunnel, applied in a maintenance window you control (evenings or weekends, your timezone). Security patches ship as they are ready; model catalogue updates follow the same channel.

Air-gapped sites receive quarterly USB bundles plus out-of-band security fixes when needed. Firmware, redaction engine, and chat application are versioned together so you are not chasing multiple vendors. Policy details in the security pack.

Sovereign and air-gapped deployments

Why the sanitiser stays in the path when the LLM runs locally on the appliance.

Do I still need the sanitiser if my LLM is local?

Yes — for four reasons that the local LLM does not address.

Every prompt, whether it is headed to OpenAI or to a local Llama running on the appliance, has to enter the model's context window during inference. The context window is process memory in the inference engine, and engines such as Ollama, llama.cpp and vLLM can additionally persist caches or conversation history to disk for performance. That means anyone with appropriate access on the appliance (an administrative shell, a backup tape, another user account on a shared box) can read what was processed. Running the model locally takes the LLM provider's access surface to zero. It does not reduce the appliance-side access surface, and most of the value of the sanitiser is in addressing exactly that.

The four reasons:

Multi-user access governance. A shared appliance has multiple users. Without the sanitiser and the audit log, the inference engine's caches and conversation history become a shared exposure surface across users with shell access. With the sanitiser, every prompt is tied to a user, every redaction decision is logged, and per-user placeholder maps stay scoped to their owner.
Records of processing. UK GDPR Article 30 and ISO/IEC 27701:2025 records-of-processing controls (A.7.5.3, A.7.5.4, B.8.6, B.9.2) require evidence of every disclosure to a processor. The local LLM is still a processor. The audit log is the structured evidence; without it, the customer has no answer when an auditor or the ICO asks what was processed and when.
Fine-tuning protection. Sovereign customers commonly fine-tune the local model on their corpus. Fine-tuning without sanitisation bakes PII into the model weights, where training-data extraction attacks can recover it. If the fine-tuned model is ever backed up, returned for warranty, or shared between offices, that PII goes with it. Sanitising the corpus before training keeps the weights placeholder-only.
Air-gap drift. Most "air-gapped" deployments are not, after twelve months. Temporary management networks left on, USB sticks bridging environments, accidentally-enabled WiFi on printers — these are the documented failure modes. The sanitiser is a second control that limits blast radius if the air-gap holds less perfectly than planned.

For deployments where a local model should receive the original text for drafting quality while still recording detections, admins can enable detection-only for local in /admin/security: the sanitiser still runs on every prompt and writes the redacted copy to the audit trail; only the outbound call to the on-box local model uses the original text. Cloud-routed sends always use the sanitised version. Documented risk acceptance applies.
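The routing rule in the paragraph above reduces to a small decision. The sketch below is an assumption-laden illustration: the `Dispatch` type, the `route` values, and the flag name `detection_only_for_local` are hypothetical stand-ins for whatever the setting in /admin/security controls.

```python
from dataclasses import dataclass


@dataclass
class Dispatch:
    model_input: str  # text handed to the model
    audit_copy: str   # text recorded in the audit trail


def dispatch(original: str, sanitised: str, route: str,
             detection_only_for_local: bool = False) -> Dispatch:
    """Choose which text reaches the model; the audit copy is always redacted."""
    if route not in ("cloud", "local"):
        raise ValueError(f"unknown route: {route}")
    # Only an on-box local model may receive the original text, and only
    # when the admin has explicitly enabled detection-only mode.
    use_original = route == "local" and detection_only_for_local
    return Dispatch(
        model_input=original if use_original else sanitised,
        audit_copy=sanitised,
    )
```

In this sketch the flag has no effect on cloud routes, which mirrors the statement that cloud-routed sends always use the sanitised version.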

Read the longer write-up on the ISO 27701 compliance page.

Commercial

Money, trials, and what you actually own.

How much does it cost?

Three tiers, ex VAT: Trinito Compact from £2,199 hardware plus £39/month (5M tokens included). Standard from £2,499 plus £79/month (15M tokens). Sovereign from £3,499 hardware with custom monthly for air-gapped and regulated deployments.

Hardware is a one-time purchase you own. Monthly covers allowance, updates, and UK support — or bring your own API keys and treat the fee as software/support only. Toggle inc VAT on the pricing page; charity and education discounts available on request.

Is there a free trial?

Start with our curated examples on the website — realistic UK prompts sanitised the way the appliance would. For production, the appliance is configured for your network; we can arrange a two-week pilot on your LAN for larger or regulated opportunities.

Questions after trying it? Ask us directly — we respond within one working day, often the same hour.

Can I lease instead of buying?

Yes — 24- or 36-month leases through UK finance partners on Standard and Sovereign. Monthly lease figures appear on your quote alongside the ex-VAT purchase price so finance can compare both.

Compact is purchase-only today because of ticket size. At end of lease you return, renew, or buy out per the lessor's terms — the software licence and your audit data stay yours regardless. Ask on the quote form if you need a pro forma for the board.

What's the warranty?

Three years on hardware — next-business-day replacement in the UK. If a fanless unit fails, we ship a replacement; you return the faulty unit in the prepaid packaging.

After year three, extend warranty annually or run to end-of-life — security and software updates continue either way because they are tied to the appliance, not to a support subscription that expires silently. Full terms on your order confirmation; ask if you need insurer-friendly wording.

What's included in the monthly fee?

On Compact and Standard: LLM token allowance (5M or 15M per month), security and model updates, UK email and phone support, and access to the managed catalogue. Overages bill at the rates on the pricing page — or switch to BYO keys and stop paying us for tokens.

Not included: your own OpenAI/Anthropic invoices when you BYO, on-site cabling, or unlimited professional services. Cancel monthly on 30 days' notice; the box keeps working on your keys or local models. Sovereign monthly is POA and includes air-gap packaging and assurance support — quoted per deployment.

Comparison

How we sit next to the suites you already pay for.

How does this compare to Microsoft Purview?

Purview's AI and third-party LLM features sit inside Microsoft 365 E5, or as a paid add-on on E3 and Business Premium. Most UK SMBs on Business Standard or below do not have it without a substantial per-user uplift.

Even when licensed, Purview's protection for third-party LLMs typically requires Edge for Business with Entra sign-in, in effect an endpoint agent on every device you want covered. Trinito installs nothing on staff laptops or phones: staff use the Trinito chat on the appliance, and prompts are redacted before they reach OpenAI, Anthropic, or Google. Audit logs stay on your appliance, not Microsoft's cloud. On headline maths, we are roughly one tenth the per-user cost of an E5 uplift for comparable third-party LLM egress control at fifty seats.

Purview wins if you are all-in on Microsoft already. Trinito wins on mixed browsers, BYOD, and logs you export yourself. Full table: Trinito vs Microsoft Purview.

What if we use Google Workspace?

Trinito is office-suite agnostic. Staff use the Trinito chat whether you run Microsoft 365, Google Workspace, or neither. Google's Workspace DLP helps with classified content inside Google apps; it does not inspect what someone pastes into a public LLM from their browser. Trinito redacts AI prompts before they leave your building.
