Per-engineer cost attribution, hard budgets, egress governance, and context-rot observability — running entirely inside your VPC. Drop it on top of LiteLLM, Portkey, or direct Anthropic / OpenAI / Bedrock / Azure. Your code, keys, and prompts never leave your perimeter.
one env var to adopt · zero client changes · open source · runs offline
The problem
Engineers want Claude Code and Cursor. Security wants to know what's leaving the building. The standard answers don't work:
Engineers don't stop — they switch to personal accounts and paste production code into consumer chatbots. You didn't remove the risk; you made it invisible. Practitioners across regulated orgs report exactly this.
A shared API key, no per-engineer attribution, no budget caps, no record of what was sent. A vendor contract covers training-data use — it gives you zero internal visibility or control.
The real choice isn't AI vs. no AI. It's governed AI vs. shadow AI.
How it works
Point your tools' base URL at Conduit. Every AI request flows through a control point you own — then on to the providers you already use.
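As a minimal sketch of what "point your base URL at Conduit" can look like from a launcher script: the Conduit address below is a made-up placeholder, and `ANTHROPIC_BASE_URL` is the base-URL override read by Anthropic's SDK and Claude Code. Other tools may use a different variable name, so treat this as an illustration rather than the documented setup.

```python
import os

# Hypothetical address where your Conduit deployment listens inside the VPC.
CONDUIT_URL = "http://conduit.internal:8080"


def governed_env(base_env, conduit_url=CONDUIT_URL):
    """Return a copy of base_env with the Anthropic base URL pointed at Conduit."""
    env = dict(base_env)  # copy, so the caller's environment is not mutated
    env["ANTHROPIC_BASE_URL"] = conduit_url
    return env


env = governed_env(os.environ)
print(env["ANTHROPIC_BASE_URL"])
```

Passing `env` to `subprocess.run(..., env=env)` would launch a tool with the override applied and nothing else changed.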
Notice what's missing: us. Conduit ships as signed Docker images you run — no vendor cloud in the request path, no phone-home, fully air-gappable.
What you get
Bedrock private endpoints and no-train agreements secure the channel to the vendor. Conduit is the other half — what your own org sends, spends, and can prove.
Spend by key, model, team, and day — the breakdown a shared API key structurally can't produce. When finance asks "who's the $40k," you have an answer instead of one giant invoice.
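To make the breakdown concrete, here is a rough sketch of rolling per-request metadata up by any combination of dimensions. The record fields and key names are hypothetical and do not reflect Conduit's actual export schema.

```python
from collections import defaultdict

# Hypothetical metadata records; field names are illustrative only.
records = [
    {"key": "standin-alice", "team": "payments", "model": "model-a",
     "day": "2025-01-06", "cost_usd": 12.5},
    {"key": "standin-alice", "team": "payments", "model": "model-b",
     "day": "2025-01-07", "cost_usd": 30.0},
    {"key": "standin-bob", "team": "search", "model": "model-a",
     "day": "2025-01-06", "cost_usd": 7.25},
]


def spend_by(records, *dims):
    """Sum cost_usd grouped by the given dimensions, e.g. ("team", "day")."""
    totals = defaultdict(float)
    for r in records:
        totals[tuple(r[d] for d in dims)] += r["cost_usd"]
    return dict(totals)


print(spend_by(records, "team"))
print(spend_by(records, "key", "model"))
```

With one shared API key, every record would carry the same `key` value and the per-engineer grouping would collapse into a single total.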
Secrets caught before they leave the perimeter, plus a per-org entity allowlist for your customer names, internal codenames, and deal codes. Start every category in alert mode, watch its false-positive rate, then promote the high-confidence categories to block, one at a time. The log records the category of a match, never the matched value.
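The alert-then-block rollout might be sketched like this. The detectors, entity names, and mode table are assumptions for illustration. The AWS pattern is the widely known access key ID shape, not Conduit's actual rule set.

```python
import re

# Illustrative detectors only. The entity pattern stands in for a per-org
# list of sensitive names (hypothetical values).
DETECTORS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "org_entity": re.compile(r"\b(Project Falcon|Acme Corp)\b", re.IGNORECASE),
}

# Per-category mode: everything starts in "alert"; categories are promoted
# to "block" one at a time once their false-positive rate looks acceptable.
MODES = {"aws_access_key_id": "block", "org_entity": "alert"}


def evaluate(text, detectors=DETECTORS, modes=MODES):
    """Return the action for a request body plus the categories that matched."""
    hits = sorted(cat for cat, rx in detectors.items() if rx.search(text))
    if any(modes.get(cat) == "block" for cat in hits):
        action = "block"
    elif hits:
        action = "alert"
    else:
        action = "allow"
    # The finding carries category names only, never the matched value.
    return {"action": action, "categories": hits}


print(evaluate("creds: AKIAIOSFODNN7EXAMPLE"))
print(evaluate("summarize the Acme Corp renewal"))
```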
Engineers hold revocable stand-in keys; the real credential is AES-256-GCM-encrypted in the gateway and never touches a laptop.
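A simplified sketch of the stand-in key idea follows. The class, the `cdt-` prefix, and the choice to store only a hash of each stand-in key are assumptions. Encrypting the real provider credential (AES-256-GCM in Conduit) is deliberately left out, since it needs a cryptography library.

```python
import hashlib
import secrets


class StandInKeys:
    """Maps revocable stand-in keys to engineers; holds no provider credential."""

    def __init__(self):
        self._owners = {}  # sha256(stand-in key) -> engineer id

    @staticmethod
    def _digest(key):
        return hashlib.sha256(key.encode()).hexdigest()

    def issue(self, engineer):
        key = "cdt-" + secrets.token_urlsafe(24)  # hypothetical key format
        self._owners[self._digest(key)] = engineer
        return key

    def resolve(self, key):
        """Return the engineer for a valid key, or None if unknown or revoked."""
        return self._owners.get(self._digest(key))

    def revoke(self, key):
        self._owners.pop(self._digest(key), None)


keys = StandInKeys()
alice_key = keys.issue("alice")
print(keys.resolve(alice_key))
keys.revoke(alice_key)
print(keys.resolve(alice_key))
```

Revoking a stand-in key cuts off one laptop without rotating the real provider credential.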
Fail-closed caps, per-key rate limits, model allow-lists. A disallowed model is a clear 403 — never a silent downgrade.
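One way to picture fail-closed enforcement is shown below. Only the 403 for a disallowed model comes from the description above. The 429 and 503 codes, the `KeyPolicy` type, and the `spend_lookup` callback are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class KeyPolicy:
    allowed_models: frozenset
    budget_usd: float


def authorize(policy, model, spend_lookup):
    """Return (status, reason) for a request; any doubt results in a denial."""
    if model not in policy.allowed_models:
        # Explicit rejection; the request is never rerouted to another model.
        return 403, "model_not_allowed"
    try:
        spent = spend_lookup()
    except Exception:
        # Fail closed: if spend cannot be determined, deny rather than allow.
        return 503, "spend_unavailable"
    if spent >= policy.budget_usd:
        return 429, "budget_exceeded"
    return 200, "ok"


policy = KeyPolicy(allowed_models=frozenset({"model-a"}), budget_usd=100.0)
print(authorize(policy, "model-b", lambda: 10.0))
print(authorize(policy, "model-a", lambda: 10.0))
```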
Every request as a metadata-only record — CSV/JSON export that makes a security review short.
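A metadata-only export can be sketched with an explicit column whitelist, so anything not on the list (a prompt, for example) cannot reach the file. The column names here are hypothetical, not Conduit's actual export schema.

```python
import csv
import io

# Hypothetical audit columns; note there is no prompt or completion field.
AUDIT_FIELDS = ["timestamp", "key_id", "model", "input_tokens",
                "output_tokens", "status", "egress_categories"]


def to_csv(records):
    """Serialize records to CSV, silently dropping any non-whitelisted field."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=AUDIT_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


sample = [{
    "timestamp": "2025-01-06T10:15:00Z", "key_id": "standin-alice",
    "model": "model-a", "input_tokens": 1200, "output_tokens": 300,
    "status": 200, "egress_categories": "",
    "prompt": "never exported",  # ignored because it is not in AUDIT_FIELDS
}]
print(to_csv(sample))
```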
Conduit buckets every request by input-token size and shows the cost-and-error curve. Past 32K tokens, error rate and latency climb faster than cost; that gap is what your team is paying for context rot. We never modify the prompt (that would break Anthropic's prefix cache and the byte-transparent promise). Conduit surfaces the bill; the client or agent does the mitigation.
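The bucketing itself is simple to sketch. Apart from the 32K boundary mentioned above, the bucket edges, labels, and record fields are assumptions chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical bucket edges in input tokens; only 32K comes from the text.
EDGES = [4_000, 8_000, 16_000, 32_000, 64_000]
LABELS = ["<4K", "4-8K", "8-16K", "16-32K", "32-64K", "64K+"]


def bucket(input_tokens):
    """Map an input-token count to its size bucket (lower edge inclusive)."""
    return LABELS[bisect_right(EDGES, input_tokens)]


def curve(records):
    """Per bucket: request count, error rate, and average cost."""
    stats = {}
    for r in records:
        s = stats.setdefault(bucket(r["input_tokens"]),
                             {"n": 0, "errors": 0, "cost_usd": 0.0})
        s["n"] += 1
        s["errors"] += int(r["status"] >= 400)
        s["cost_usd"] += r["cost_usd"]
    return {
        b: {"requests": s["n"],
            "error_rate": s["errors"] / s["n"],
            "avg_cost_usd": s["cost_usd"] / s["n"]}
        for b, s in stats.items()
    }


requests_log = [
    {"input_tokens": 2_000, "status": 200, "cost_usd": 0.5},
    {"input_tokens": 40_000, "status": 200, "cost_usd": 1.0},
    {"input_tokens": 50_000, "status": 500, "cost_usd": 3.0},
]
print(curve(requests_log))
```

Comparing `error_rate` and `avg_cost_usd` across buckets gives the curve. Nothing in this calculation touches the prompt itself.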
Direct Anthropic/OpenAI — or your AWS Bedrock / Azure OpenAI private endpoint. Configure that credential and Conduit routes through it, SigV4 and all, with no client change.
Drop-in compatibility
Conduit doesn't ask you to rip out your gateway. Point your tools at Conduit, point Conduit's upstream at whatever serves your provider traffic today — one env var.
Already proxying 100+ models through LiteLLM? Set UPSTREAM_ANTHROPIC_URL=http://litellm:4000 and Conduit becomes the governance + attribution + audit layer above it. Full walkthrough: docs/run-with-litellm.md.
Portkey-shaped upstream works the same way — point Conduit at your Portkey URL, keep all your routing and fallback config, add the org-level control plane on top.
No gateway today? Conduit talks straight to Anthropic, OpenAI, AWS Bedrock (SigV4), or Azure OpenAI. Pick whichever credential your security team already approved.
Security posture
We assume your security team will ask hard questions — so the answers are ready before you ask.
→ security whitepaper · data-flow diagram · CAIQ-lite questionnaire, all available on request.
Honest answers
No, and never. Conduit is self-hosted software — you run the containers in your own cloud account. We are not in the request path, we never see your prompts, code, or keys, and there's no telemetry or license callback. The subscription buys the license, updates, and support — not hosting.
On top, as the layer your security and finance teams asked for. LiteLLM is excellent at model aliasing, routing, fallbacks, and the broad provider matrix. It doesn't give you per-engineer attribution, hard budgets, egress governance with a per-org entity allowlist, an auditor-ready CSV export, or context-rot observability — Conduit does. Point your tools at Conduit, point Conduit's upstream at LiteLLM (one env var: UPSTREAM_ANTHROPIC_URL=http://litellm:4000), no client changes, no LiteLLM changes. Full 5-minute walkthrough in the repo: docs/run-with-litellm.md.
No. Conduit is a transparent proxy: request bodies aren't modified, streaming (SSE) passes through untouched with metering off the hot path, and a disallowed model is a clear 403 — never a silent downgrade. Overhead is under 5ms. The context-rot panel measures where you're paying for rot; we deliberately do not rewrite prompts (that would bust Anthropic's prefix cache and break the byte-transparent promise).
Honest answer: not yet, and we won't claim a certificate we don't have. Conduit is on-prem software you run inside your perimeter, so most compliance properties (SOC 2 control coverage, BAAs, residency) are inherited from your environment: your cloud account, your secret manager, your IdP. What we do provide for your security review: this security posture summary, a CAIQ-lite / SIG-lite questionnaire, cosign-signed images plus a CycloneDX SBOM per release, the privacy guarantees described above (no prompt/completion storage, AES-256-GCM-sealed provider keys, metadata-only audit log), and direct engineering access. SOC 2 Type II is on the roadmap for when design-partner traffic and revenue justify a formal audit, and we'll announce it the day the report exists, not before.
You run Conduit free for 90 days in your own VPC (Docker Compose or K8s — clean install doc included). You get the full product and direct access to the builder; we get feedback and the chance to shape contextual governance against real traffic. No data leaves your account. Cancel anytime, keep the learnings.
The ask
For regulated, code-sensitive engineering teams (50–500 engineers) who want AI coding agents with a defensible "yes" — fintech, healthcare, GCCs.