ByteSpike's roadmap: one key, all the world's models

Where ByteSpike is headed over the next four quarters. Four phases, each a concrete addition rather than a vague north star.

May 8, 2026 · KL · 6 min read

Today's launch ships the catalog: 28 chat models, 9 image, 8 video, all behind one Anthropic-compatible key with public per-model rates. That's the foundation. Here's what comes next.

Phase 1 (Q3 2026): organisational sub-keys + per-member quotas

Master keys can already derive sub-keys, each with a scope and an IP allowlist. The next step is per-member spend caps that reset monthly, plus a console dashboard that charts spend by sub-key over time. Org admins shouldn't need to write SQL to know which service is the budget hog.
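
As a rough illustration of what a monthly-resetting cap could look like, here is a minimal sketch. It is not ByteSpike's implementation: the `SubKeyBudget` class, its fields, and the `spend_by_sub_key` helper are hypothetical names made up for this example, and the reset is assumed to happen on the calendar-month boundary.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class SubKeyBudget:
    """Hypothetical per-sub-key budget; names and fields are illustrative only."""
    sub_key: str
    monthly_cap: Decimal
    spent: Decimal = Decimal("0")
    period: tuple = (0, 0)  # (year, month) that the running total belongs to

    def _roll_period(self, today: date) -> None:
        # Assumption: caps reset when the calendar month changes.
        current = (today.year, today.month)
        if current != self.period:
            self.period = current
            self.spent = Decimal("0")

    def try_charge(self, amount: Decimal, today: date) -> bool:
        """Record the charge and return True only if it fits under the cap."""
        self._roll_period(today)
        if self.spent + amount > self.monthly_cap:
            return False
        self.spent += amount
        return True


def spend_by_sub_key(budgets):
    """Rank sub-keys by recorded spend, largest first (the 'budget hog' view)."""
    pairs = [(b.sub_key, b.spent) for b in budgets]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)
```

A sub-key capped at 10 would accept a charge of 6 late in August, reject a second charge of 6 in the same month, and accept it again on September 1 once the period rolls over.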

Phase 2 (Q4 2026): self-hosted SKU + audit log + SSO

Enterprise tier ships a self-hosted deployment SKU (Docker / Kubernetes), a queryable audit log of every gateway call (model / sub-key / user / outcome), and SSO via OIDC. The pitch: same wire protocol as bytespike.ai, run in your VPC, audit log retained for the contract length.
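
To make "queryable audit log" concrete, here is a small sketch of a record carrying the four fields named above and a filter over them. The `AuditRecord` type, the `query` function, the timestamp field, and the outcome values are assumptions for illustration; the shipped schema and query interface may look quite different.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Iterator, Optional


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit entry for one gateway call."""
    timestamp: datetime
    model: str
    sub_key: str
    user: str
    outcome: str  # assumed values such as "ok" or "error"


def query(
    records: Iterable[AuditRecord],
    *,
    model: Optional[str] = None,
    sub_key: Optional[str] = None,
    user: Optional[str] = None,
    outcome: Optional[str] = None,
) -> Iterator[AuditRecord]:
    """Yield records matching every filter that was supplied."""
    wanted = {"model": model, "sub_key": sub_key, "user": user, "outcome": outcome}
    active = {name: value for name, value in wanted.items() if value is not None}
    for record in records:
        if all(getattr(record, name) == value for name, value in active.items()):
            yield record
```

Calling `query(records, sub_key="sk-team1", outcome="error")` would then return only the failed calls made with that sub-key.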

Phase 3 (Q1 2027): embedding + rerank + audio under one key

We launch with text / image / video. The next family additions are embedding (OpenAI / Cohere / Voyage), rerank (Cohere / Jina), and audio in/out (Whisper / ElevenLabs / OpenAI realtime). Same key, same billing, same `failures-don't-bill` policy.
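
One way to read the `failures-don't-bill` policy across model families is that a call's charge is zero unless the call succeeded. The sketch below assumes exactly that; `CallResult`, `billable_amount`, and the per-unit rate are hypothetical and do not describe ByteSpike's billing code or its definition of a failure.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CallResult:
    """Hypothetical outcome of one gateway call."""
    family: str      # e.g. "embedding", "rerank", "audio"
    succeeded: bool
    units: int       # billable units; what counts as a unit is an assumption


def billable_amount(result: CallResult, rate_per_unit: Decimal) -> Decimal:
    """Return the charge for a call; failed calls cost nothing."""
    if not result.succeeded:
        return Decimal("0")
    return rate_per_unit * result.units
```

Because the rule does not depend on `family`, the same function would cover embedding, rerank, and audio calls without special cases.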

Phase 4 (Q2 2027): provider-direct compliance routing

Today we route each model through one or more upstream providers. Phase 4 ships customer-selectable provider-direct routing for compliance-sensitive workloads; `gpt-image-2-official` is the prototype. Same model weights, audit-trail attribution to the named provider, and a higher price tier that reflects the cost of direct routing.

What we're not building

We're not building a chat UI: DOSIA is the desktop client, and console.bytespike.ai is the org admin surface. We're not building model fine-tuning; that work belongs at the model vendor, and ByteSpike's job is to route to whatever you fine-tune. We're not building a marketplace for third-party models; every model in the catalog goes through the same vetting and SLA.

Try ByteSpike at bytespike.ai. The full rate card is at docs.bytespike.ai/pricing. If you'd like to talk roadmap with us — especially Phase 2 self-hosted SKU — drop a note via /enterprise.