The Anthropic-protocol bet
We picked the Anthropic Messages API as our default protocol. Here's why we think tool_use, cache_control, and thinking are the API surface most worth keeping pristine — and how we shim everything else while being honest about what gets lost.
Every gateway has to pick a default protocol, and that choice quietly determines what your users can and can't express. We picked Anthropic Messages.
Why this protocol
- tool_use blocks are first-class — not bolted on as a function-call adapter. The model gets to interleave reasoning, tools, and prose in a single stream.
- cache_control is explicit. You decide what gets cached and for how long. No hidden auto-cache making your bill look weird.
- thinking blocks are separate from output. You can show or hide them. You can bill them or not. Either way the wire format tells you what's what.
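All three surfaces show up directly in the request body. Here's a minimal sketch of a Messages request as a plain dict — the model name, tool, and prompts are illustrative assumptions, not fixed values:

```python
# A minimal Messages-API request body. Model name, tool, and prompt
# text are illustrative assumptions, not part of any fixed contract.
request = {
    "model": "claude-sonnet-4-5",  # assumed model name
    "max_tokens": 1024,
    # Extended thinking: the reasoning budget is explicit on the wire.
    "thinking": {"type": "enabled", "budget_tokens": 2048},
    "system": [
        {
            "type": "text",
            "text": "You are a terse research assistant.",
            # Explicit cache breakpoint: everything up to and including
            # this block is cacheable. No hidden auto-cache.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "tools": [
        {
            "name": "search_docs",  # hypothetical tool, for illustration
            "description": "Full-text search over internal docs.",
            "input_schema": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        }
    ],
    "messages": [
        {"role": "user", "content": "Find the retry policy for the billing API."}
    ],
}
```

Note that caching and thinking are both opt-in fields you set per request, not gateway-side behavior you have to reverse-engineer from your bill.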
What about OpenAI users?
We expose Chat Completions on the same base URL. The shim is lossy by definition — you can't model thinking blocks faithfully in a function-call schema — but for code that already speaks Chat Completions, the upgrade path is one line: change the base URL and the API key.
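For intuition, here's a toy sketch of the request-side mapping such a shim performs. This is not our production shim — field handling is simplified and only the OpenAI "function" tool shape is assumed:

```python
# Toy sketch of a Chat Completions -> Messages request shim.
# Simplified for illustration; not production code.
def shim_to_messages(chat_req: dict) -> dict:
    msgs = chat_req["messages"]
    # The system prompt moves from a message role to a top-level field.
    system = [m["content"] for m in msgs if m["role"] == "system"]
    out = {
        "model": chat_req["model"],
        "max_tokens": chat_req.get("max_tokens", 1024),
        "messages": [m for m in msgs if m["role"] != "system"],
    }
    if system:
        out["system"] = "\n".join(system)
    # "function" tools map onto tool definitions almost one-to-one...
    if "tools" in chat_req:
        out["tools"] = [
            {
                "name": t["function"]["name"],
                "description": t["function"].get("description", ""),
                "input_schema": t["function"]["parameters"],
            }
            for t in chat_req["tools"]
        ]
    # ...but no Chat Completions field carries a thinking budget or
    # returns thinking blocks, which is exactly where the shim is lossy.
    return out
```

The tool mapping is nearly mechanical; what has no home on the Chat Completions side is the thinking channel, which is why we call the shim lossy rather than pretending otherwise.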
If you're starting fresh, talk to us in Anthropic-shape. Your future self will thank you when the model can finally tell you it's halfway through three tool calls.