Two presets isn't a feature. It's a default.
ByteSpike ships with two curated model presets — Global and China — and every account opens to one of them already applied. We get asked why we didn't just let admins pick from 30 models on day one. Here's the rationale.
The first version of permission templates we built didn't have presets. The admin opened a blank editor, saw a list of 30+ models with capability tags, and was supposed to know which combination they wanted. We watched four pilot teams open that screen, scroll, and close the tab.
Not because the editor was bad. Because picking a stack from scratch is a real engineering decision — what's the agent core, what's the cheap router, do you need image generation, do you trust DALL·E or Seedream more — and the admin opening the screen was a Finance lead onboarding their team. They were the wrong person to make that call on a Tuesday afternoon.
Two presets, not twenty
We landed on two, Global and China, for the same reason most product-config UIs land on two: anything more becomes a different decision problem. With two you're picking which side of a line you're on; with five you're back to ranking the options against each other, and the blank-editor cold start returns.
The split isn't political. It's that models routinely deployed together tend to be from a shared origin. Teams running English-first SaaS workloads converge on Claude + GPT + Gemini + DeepSeek; teams operating under PRC compliance or with PII-sensitive data converge on DeepSeek + Doubao + GLM + Kimi + MiniMax. The presets just write down what people were already doing.
Four buckets, not nine
Inside each preset the models bucket into four product-language categories. Not the eight or nine fine-grained capabilities the gateway reports — those are useful for accounting but they're not how anyone explains what an LLM does to a non-technical teammate.
- 主脑 / Agent core — the model your agent loop drives. Chat, tool use, reasoning. The one that costs real money when it's the wrong pick.
- 识图 / Vision — models that read images. Captioning, OCR, screenshot questions. Pair with an agent for full multimodal.
- 图像生成 / Image generation — models that write images. Asset preview, ad creative variations, brief illustrations.
- 外脑 / Auxiliary — video gen, embeddings, TTS, STT. The "needed sometimes, not the spine" parts of a multimodal stack.
Three buckets felt too coarse (merging vision into agent lost the "I want OCR but not chat" use case). Five felt arbitrary (the extra bucket tended to split TTS or video off onto its own line, which then sat empty on most accounts). Four turned out to be the smallest set where every bucket had something in it for every team we onboarded.
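To make the bucketing concrete, here is a minimal sketch of how fine-grained capability tags could roll up into the four buckets. The tag names (chat, tool_use, tts and so on) and the bucket keys are hypothetical; this post does not list the gateway's actual tags, so treat the table as an illustration rather than the real mapping.

```python
# Hypothetical capability tags -> product-language buckets.
# The gateway's real tag names are not given in this post.
BUCKET_OF = {
    "chat": "agent_core",
    "tool_use": "agent_core",
    "reasoning": "agent_core",
    "vision": "vision",
    "image_generation": "image_generation",
    "video_generation": "auxiliary",
    "embeddings": "auxiliary",
    "tts": "auxiliary",
    "stt": "auxiliary",
}

# Display order of the four buckets.
BUCKET_ORDER = ["agent_core", "vision", "image_generation", "auxiliary"]


def buckets_for(capabilities):
    """Return the buckets a model falls into, in display order.

    Unknown tags are ignored rather than raising, so a new gateway
    tag does not break the grouping.
    """
    found = {BUCKET_OF[c] for c in capabilities if c in BUCKET_OF}
    return [b for b in BUCKET_ORDER if b in found]
```

Under these assumptions a model tagged with chat, tool_use and vision would appear under both Agent core and Vision, while a TTS-only model would appear only under Auxiliary.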
Presets compose; they don't lock you in
A preset is the starting point, not the contract. Apply Global, then swap in DeepSeek Flash for cheap chat because your team is China-based but ships English-first products. Apply China, then add Claude Sonnet for the legal review function that has its own compliance approval. Custom templates save as "engineering-pack" / "marketing-pack" / whatever your team calls it, and per-member overrides layer on top of the template.
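As a rough sketch of what "compose, don't lock in" means, a custom template can be thought of as a copy of a preset with models added or removed. The model identifiers below are placeholders based on the families named earlier in this post, and the customize function is hypothetical, not the console's actual API.

```python
# Placeholder model families per preset; the real preset lists live in the console.
PRESETS = {
    "global": frozenset({"claude", "gpt", "gemini", "deepseek"}),
    "china": frozenset({"deepseek", "doubao", "glm", "kimi", "minimax"}),
}


def customize(preset, add=(), remove=()):
    """Start from a preset and return a new, independent set of models.

    The preset itself is never mutated, so applying it again later
    still yields the curated starting point.
    """
    if preset not in PRESETS:
        raise KeyError(f"unknown preset: {preset}")
    return (set(PRESETS[preset]) | set(add)) - set(remove)


# Example: the China preset plus one separately approved model for legal review.
legal_pack = customize("china", add=["claude-sonnet"])
```

Saving the result under a name such as "engineering-pack" would correspond to a custom template; per-member overrides would then be applied on top of it.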
The gateway intersects three things at request time: the org's allowed model pool, the assigned template's models, and the per-member override (if any). A model is callable only if every layer allows it, which means the effective set is never wider than the narrowest of the three. So presets give you the first 80% of the decision; the remaining 20% is real refinement against your actual workflow, not yak-shaving in a config screen.
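The request-time check can be sketched as a plain set intersection. This is an illustration under stated assumptions, not the gateway's code: the function name, the parameter names, and the model identifiers are all hypothetical, and a missing override is modeled as None (no extra restriction).

```python
from __future__ import annotations

from typing import Optional


def effective_models(
    org_pool: set[str],
    template_models: set[str],
    member_override: Optional[set[str]] = None,
) -> set[str]:
    """Return the models a member may call: the intersection of all layers.

    A model must be present in the org pool, in the assigned template,
    and (when an override exists) in the member's override.
    """
    allowed = org_pool & template_models  # new set; inputs are not mutated
    if member_override is not None:
        allowed &= member_override
    return allowed


# Hypothetical identifiers, for illustration only.
org_pool = {"claude-sonnet", "gpt", "gemini", "deepseek"}
template = {"claude-sonnet", "deepseek", "doubao"}

without_override = effective_models(org_pool, template)
with_override = effective_models(org_pool, template, {"deepseek"})
```

In this example "doubao" is in the template but not in the org pool, so it drops out; with the override in place only "deepseek" remains. An empty override set would leave the member with no callable models, which is why "no override" is modeled as None and not as an empty set.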
What's next
We're tracking three follow-on preset surfaces: a "Cheap mix" sorted by per-token cost (for high-volume classification / routing), a "Fast mix" sorted by p50 latency (for interactive agents), and per-account personal presets — your own named mix you can apply across projects without going through templates. None of those land before the first wave of customer feedback on the two we have. Send us yours.
If you want to look at the actual model lists: console.bytespike.ai/dosia/models shows your end-user view; console.bytespike.ai/org/templates shows the admin editor.