How "failures don't bill" actually works
Every ByteSpike call carries a pre-flight credit reservation and a settle-on-success commit. Reservations expire if the upstream errors out, the user cancels, or the gateway times out. The account ledger is debited only when an asset or a token stream was actually delivered.
"Failures don't bill" is the line on every ByteSpike pricing page. People ask reasonably often: how is that not just marketing? The honest answer is that the gateway has to do real bookkeeping work to make it true, and the mechanism is a two-phase commit on the credit ledger that most other AI gateways skip.
Phase 1 — pre-flight reservation
When a request arrives, the gateway estimates a worst-case cost (max tokens × per-token rate for text, per-call rate for image/video) and reserves that amount against the caller's wallet. The reservation is a balance hold, not a debit. The hold is visible on the account, but the ledger doesn't yet have a charge line item. If the wallet doesn't have enough to cover the reservation, the request is rejected before any upstream call is made.
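A minimal sketch of the reservation step, to make the hold-versus-debit distinction concrete. The names here (Wallet, reserve, worst_case_text_cost) and the per-token rate are hypothetical, not the gateway's actual code, and checking the hold against funds not already held by other in-flight requests is an assumption.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict


class InsufficientFunds(Exception):
    """Raised when the wallet cannot cover the worst-case estimate."""


@dataclass
class Wallet:
    balance: Decimal  # settled funds; a hold never changes this
    holds: Dict[str, Decimal] = field(default_factory=dict)  # request_id -> held amount

    @property
    def available(self) -> Decimal:
        # Assumption: funds held by other in-flight requests cannot be reserved twice.
        return self.balance - sum(self.holds.values(), Decimal("0"))


def worst_case_text_cost(max_tokens: int, rate_per_token: Decimal) -> Decimal:
    # Worst case for text: the caller uses every token it asked for.
    return max_tokens * rate_per_token


def reserve(wallet: Wallet, request_id: str, worst_case: Decimal) -> None:
    # Reject before any upstream call if the hold cannot be covered.
    if worst_case > wallet.available:
        raise InsufficientFunds(request_id)
    wallet.holds[request_id] = worst_case  # a hold, not a debit


wallet = Wallet(balance=Decimal("10.00"))
hold = worst_case_text_cost(4096, Decimal("0.000002"))  # hypothetical rate
reserve(wallet, "req-1", hold)
```

After the call, wallet.balance is unchanged; only the hold table has a new entry.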
Phase 2 — settle on success only
The gateway then forwards the call upstream and watches what comes back. On a clean success (a 2xx with usable content delivered to the caller), the gateway computes the actual cost from the real token count or the real generated asset, debits that amount against the reservation, and releases the rest of the hold. The account ledger now has exactly one debit row, for what actually happened.
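The settle arithmetic for a text call can be sketched as follows. The function name and the rate are hypothetical, and capping the debit at the held amount is an assumption rather than documented behavior.

```python
from decimal import Decimal
from typing import Tuple


def settle(held: Decimal, actual_tokens: int, rate_per_token: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (debit, released) for a successful text call."""
    actual = actual_tokens * rate_per_token
    debit = min(actual, held)  # assumption: never debit more than was held
    released = held - debit    # the unused part of the hold goes back
    return debit, released


held = Decimal("0.008192")  # worst case: 4096 tokens at a hypothetical 0.000002 per token
debit, released = settle(held, 1500, Decimal("0.000002"))
```

Here the call used 1500 of the 4096 reserved tokens, so the debit covers 1500 tokens and the remainder of the hold is released.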
What counts as a failure
- Upstream returns 5xx or a connection drops mid-stream → reservation expires, no debit
- Content moderation rejects the prompt (NSFW, copyright filter, jailbreak detector) → reservation expires, no debit
- Image / video generation completes upstream but the asset URL fails to deliver back to the caller (CDN issue, signed-URL expired) → reservation expires, no debit
- Caller cancels the streaming connection before the first content token arrives → reservation expires, no debit
- Gateway internal timeout (default 8s on text, longer on video) without any upstream response → reservation expires, no debit
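The list above reduces to one decision per request: settle or expire. A rough sketch of that decision, with a hypothetical Outcome record; treating every non-2xx status as a failure is an assumption, since the list only names 5xx explicitly.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Resolution(Enum):
    SETTLE = auto()  # debit actual cost, release the rest of the hold
    EXPIRE = auto()  # release the whole hold, no debit


@dataclass
class Outcome:
    upstream_status: Optional[int]  # None: timeout or no upstream response
    moderation_rejected: bool = False
    cancelled_before_first_token: bool = False
    delivered_to_caller: bool = False  # usable content or asset reached the caller


def resolve(o: Outcome) -> Resolution:
    if o.upstream_status is None:
        return Resolution.EXPIRE  # gateway timeout
    if not 200 <= o.upstream_status < 300:
        return Resolution.EXPIRE  # assumption: any non-2xx counts as a failure
    if o.moderation_rejected or o.cancelled_before_first_token:
        return Resolution.EXPIRE
    if not o.delivered_to_caller:
        return Resolution.EXPIRE  # e.g. asset generated but the URL never arrived
    return Resolution.SETTLE
```

Only a 2xx with content actually delivered reaches SETTLE; every other path releases the hold.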
Where this hurts (and why we still do it)
Two-phase commit on every call is not free. We absorb the cost of upstream calls that fail; that cost sits on our side of the books. We have to keep a reservation ledger keyed by request ID for the duration of the longest tolerated upstream call (~5 min for the slowest video models). When the upstream eventually answers successfully but the customer's connection dropped before the final byte arrived, the reservation has to be reconciled asynchronously. None of this is technically interesting, but all of it is engineering work other gateways skip.
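For a sense of what that reservation ledger involves, here is a toy in-memory version, assuming a fixed 300-second ceiling taken from the ~5 min figure above. The class and method names are hypothetical; a production version would need durable storage and the async reconciliation path, neither of which is shown.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict, List, Optional

MAX_HOLD_SECONDS = 300.0  # assumption: mirrors the ~5 min ceiling for slow video models


@dataclass
class Reservation:
    amount: Decimal
    opened_at: float  # seconds on whatever clock the caller passes in


class ReservationLedger:
    def __init__(self, ttl_seconds: float = MAX_HOLD_SECONDS) -> None:
        self.ttl_seconds = ttl_seconds
        self._open: Dict[str, Reservation] = {}

    def open(self, request_id: str, amount: Decimal, now: float) -> None:
        self._open[request_id] = Reservation(amount, now)

    def close(self, request_id: str) -> Optional[Reservation]:
        # Called on settle or explicit failure. None means the hold is already gone.
        return self._open.pop(request_id, None)

    def sweep(self, now: float) -> List[str]:
        # Expire every hold that has outlived the longest tolerated upstream call.
        expired = [rid for rid, r in self._open.items()
                   if now - r.opened_at >= self.ttl_seconds]
        for rid in expired:
            del self._open[rid]
        return expired


ledger = ReservationLedger()
ledger.open("req-1", Decimal("0.50"), now=0.0)
ledger.open("req-2", Decimal("0.50"), now=200.0)
ledger.close("req-1")               # settled normally
expired = ledger.sweep(now=600.0)   # req-2 never resolved, so it expires
```

An upstream answer that shows up after the sweep finds no open reservation (close returns None), which is the kind of case that has to go through reconciliation instead.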
We do it because the alternative, charging on submit and crediting back on failure, is a worse customer experience and a worse audit-log story. If your invoice has a credit-back line item for every 5xx you got that month, you're going to ask why we charged you in the first place. With the two-phase commit, that line never appears.
“The cheapest place to handle a failure is before the credit moved.”
How to see this in your account
Console → Usage tab shows three columns per row: requested, reserved (held during pre-flight), settled. For a successful call the three match, apart from the gap between the worst-case estimate and the actual cost. For a failed call the reserved column has a value but the settled column shows zero, and that row never carries a debit. The /v1/balance endpoint reflects only settled amounts; a held reservation doesn't reduce your visible balance unless it settles into a debit.
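If you export those rows, a quick sanity check might look like the sketch below. The UsageRow shape is hypothetical (the real export format may differ), and the requested column is left out because only reserved and settled matter for this check.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Tuple


@dataclass
class UsageRow:
    request_id: str
    reserved: Decimal  # held during pre-flight
    settled: Decimal   # zero for a failed call


def summarize(rows: List[UsageRow]) -> Tuple[Decimal, Decimal, List[str]]:
    """Return (total billed, total released, IDs of calls that never billed)."""
    billed = sum((r.settled for r in rows), Decimal("0"))
    released = sum((r.reserved - r.settled for r in rows), Decimal("0"))
    unbilled = [r.request_id for r in rows if r.settled == 0]
    return billed, released, unbilled


rows = [
    UsageRow("req-1", Decimal("0.008192"), Decimal("0.003000")),  # success
    UsageRow("req-2", Decimal("0.040000"), Decimal("0")),         # failed, hold released
]
billed, released, unbilled = summarize(rows)
```

The billed total should line up with what the balance actually moved by; failed rows only ever add to the released figure.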
So if you're running an experiment that hammers a flaky upstream — say, evaluating a new image model with a strict moderation policy — your balance won't drift. You will see reserved spikes and immediate releases, but the bottom line stays put unless something actually shipped.