Architecture
Why an appliance, not SaaS or software-only
The on-premise architecture is the product. Move the sanitiser to the cloud and your data has to travel through that cloud to be redacted — the exact problem this product exists to solve. The appliance is what makes every other promise on this page literally true.
- 1. Prompts never leave unredacted Redaction runs in your building. SaaS prompt-firewalls handle raw content in transit to do their job. We do not — because the box is on your LAN.
- 2. The audit log is yours If Trinito vanishes, the log remains. A regulator gets a USB export from hardware you own — not a tenant in someone else's cloud.
- 3. Network-level coverage Every device on the office network — laptops, phones, iPads, BYOD — not only browsers with an agent installed.
- 4. Hardware acceleration included The integrated NPU runs sanitisation and local inference fast without bolting on extra kit.
- 5. One platform to support Tested updates on a known box — not "any Linux VM you happen to have."
Redaction pipeline
Five stages. Layered defence.
Each stage has a job. Together they catch UK personal identifiers, contextual business references that could identify a client or matter, and your internal IDs.
-
01
Regex pass
UK personal identifiers — postcodes, NI and NHS numbers, VAT numbers, IBANs, sort codes, Luhn-validated cards, email, phone — plus contextual business references (claim numbers, case refs, NHS client IDs) where pattern packs apply. Fast and precise.
-
02
Named entity recognition
A local spaCy model finds person names, organisations, places, and money references that no regex can reliably catch.
-
03
Custom rule pack
Custom rules for your internal IDs, project codenames, internal product names, and supplier names. Loaded per organisation by your admin.
-
04
Optional LLM cross-check
A small local model gets a second look at the prompt; configurable per organisation, off by default for latency, on by default in regulated tiers.
-
05
Deduplication and approval
Findings are merged, the user sees the sanitised version, and one click sends it.
Before
Draft an offer letter for Sarah Patel for the 3-bed flat at 14 Cromwell Road, SW7 4XL. Her solicitor is at Henderson & Co.
After
Draft an offer letter for [PERSON_1] for the 3-bed flat at [ADDRESS_1], [POSTCODE_1]. Her solicitor is at [ORG_1].
On the way back, placeholders are restored so the letter reads naturally.
Attachments
Files stay on the appliance. Only sanitised text is sent.
When someone drags a document onto the chat, the file is uploaded to the appliance and held locally. On-device extraction pulls text from PDFs and Word docs, cells from spreadsheets, and characters from scanned images. That text runs through the same five-stage sanitiser as a typed prompt.
What reaches the LLM is sanitised text embedded in the prompt. The original file is never sent to any provider's attachment API. This keeps the architecture provider-neutral and the data-residency claim absolute.
Supported at launch: PDF, DOCX, XLSX, CSV, PNG/JPG (OCR), and TXT. Spreadsheets are particularly useful — sensitive cells can be redacted while the model still analyses structure. Images with no readable text cannot be sanitised automatically; the user sees an override flow with a reason field and a full audit entry.
LLM router
Use any model. Control who uses what.
The Gateway can route to:
-
Local models on the appliance
Qwen 2.5, Llama, and others — included with the appliance.
-
Trinito Cloud
Our managed subscription — monthly token allowance on Compact and Standard, customer-cancellable. Free starter allowance bundled so the box is useful from day one.
-
Your own keys
BYO OpenAI, Anthropic, and Google. Keys stored encrypted on the appliance (libsodium secret-box) and used directly from the box.
In every case, the appliance talks to the LLM provider directly. Trinito's servers are not in the prompt path — we never see the prompt, response, or your API key. Our licensing server only issues signed config (subscriptions, caps) on a daily check-in. The admin chooses per-model access, credentials, and catalogue additions.
Audit log
Every prompt processed. Every redaction. On the appliance.
An append-only, hash-chained audit log records every prompt processed, every redaction decision, and every external send. The log stores cryptographic hashes of prompt and response content — not the content itself — so we can evidence what happened without retaining the underlying personal data. Tampering with the chain is detectable on export; the database enforces append-only behaviour via a row-level trigger. Compliance can export from the appliance on demand.
Conversation history — which does retain prompt and response text for user reference — lives in a separate per-user store on the appliance, with per-conversation deletion controls and full erasure on user request.
Hardware specs
Three appliances. Capability scales with tier.
Specs-first overview. See pricing for list prices.
|
Trinito Compact |
Trinito Standard |
Trinito Sovereign |
| CPU / NPU |
8-core CPU with integrated 50 TOPS NPU |
8-core CPU with integrated 50 TOPS NPU |
12-core CPU with integrated 80 TOPS NPU |
| Unified memory |
32 GB |
64 GB |
96–128 GB |
| Storage |
1 TB NVMe |
3 TB NVMe |
4 TB NVMe |
| Inference throughput |
~28 tok/s on Qwen 2.5 7B |
~45 tok/s on Qwen 2.5 7B |
~80 tok/s on Qwen 2.5 7B |
| Noise level |
Near-silent (fanless) |
Near-silent (fanless) |
Near-silent |
| Power draw |
~28 W typical |
~32 W typical |
~90 W typical |
| Dimensions |
192 × 192 × 48 mm |
192 × 192 × 48 mm |
262 × 197 × 80 mm |
| Warranty |
3 years |
3 years + priority support |
3 years |
Browser extension
Phase 2
Coming next: ChatGPT, Claude, and Gemini in the browser.
The Trinito extension for Chrome and Edge will route prompts from
chat.openai.com,
claude.ai, and
gemini.google.com
through your office Gateway — Prompt Shield in the toolbar when enabled. Until it ships, staff use the Trinito chat UI and API; network-level controls still apply to traffic through the appliance.
Get notified when the extension ships
Integrations
Shipping in v1. Planned next.
Shipping in v1
REST + streaming API
Drop-in alternative to the OpenAI API shape — prompts redacted on the appliance before they leave your network.
Planned
Microsoft Teams bot
Mention the bot in a channel; responses return rehydrated. Audit-logged on the appliance.
Planned
Slack bot
Same model as Teams. Per-channel policy and per-user authentication via SSO.
Deployment
What installation looks like.
- Plugs into your office network
- First-boot configuration via local web UI
- Active Directory / Entra ID single sign-on
- MFA out of the box
- Remote management via Trinito's secure tunnel (opt-in, signed WireGuard, off by default for air-gapped customers)
Comparison
Four options, one that actually works.
The homepage table, expanded for buyers who need the detail.
|
Do nothing |
Block AI tools |
SaaS DLP |
Trinito AI Gateway |
| Staff use AI |
Yes |
Only on phones |
Yes |
Yes |
| Data stays in your office |
No |
Yes |
No — via vendor |
Yes |
| Audit trail |
None |
Partial |
Vendor-hosted |
On-appliance, hash-chained |
| Works with ChatGPT / Claude / Gemini |
Yes |
No |
Some |
All three, plus more |
| Capex, not per-seat |
— |
— |
Per-seat |
One box, monthly LLM |
| Custom rules you control |
— |
— |
Vendor-controlled |
Per-organisation rule pack |
| Air-gapped deployment |
— |
— |
No |
Sovereign tier |
| UK-built |
— |
— |
Mostly US |
Yes |