Error catalog
All error responses follow a stable envelope. The same shape is used by both
/v1/* and /c1/* endpoints.
{
  "error": {
    "type": "<category>",
    "code": "<stable-code>",
    "message": "<human-readable>",
    "param": "<json-pointer>"
  }
}
- type: one of seven top-level categories that mirror HTTP status semantics: invalid_request_error, authentication_error, permission_denied, not_found, rate_limit_exceeded, upstream_error, internal_error.
- code: stable machine-readable identifier. Once published, codes are never repurposed. Adding new codes is non-breaking.
- message: human-readable English text.
- param (optional): JSON-Pointer-style path to the offending field, e.g. messages[0].content[1].image_url.url.
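As a rough illustration of how a client might model this envelope, the sketch below parses a response body into a small dataclass. The names GatewayErrorInfo and parse_error_envelope are hypothetical and not part of any published SDK; the field handling assumes only the four envelope keys described above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayErrorInfo:
    """Hypothetical client-side view of the error envelope."""
    type: str
    code: str
    message: str
    param: Optional[str] = None  # optional in the envelope


def parse_error_envelope(raw: str) -> Optional[GatewayErrorInfo]:
    """Return the parsed error, or None if the body is not an error envelope."""
    try:
        body = json.loads(raw)
    except json.JSONDecodeError:
        return None
    error = body.get("error") if isinstance(body, dict) else None
    if not isinstance(error, dict):
        return None
    return GatewayErrorInfo(
        type=error.get("type", "unknown"),
        code=error.get("code", "unknown"),
        message=error.get("message", ""),
        param=error.get("param"),
    )
```

Branching on code rather than message keeps client logic stable, since only the code is documented as never being repurposed.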
Machine-readable catalog
Pull the full catalog as JSON from GET /errors (no auth):
curl -s https://dgx.spass.fun/errors | jq '.entries[0]'
{
  "code": "missing_authorization",
  "type": "authentication_error",
  "http_status": 401,
  "title": "Authorization header is missing",
  "description": "All `/v1/*` and `/c1/*` endpoints require a Bearer token.",
  "remediation": "Add `Authorization: Bearer <RUST_API_BEARER>` to the request.",
  "typical_param": "headers.authorization"
}
Use this to generate stable error-handling code in your client without hand-typing constants.
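One way to avoid hand-typed constants is to build them from the catalog at start-up. The sketch below assumes only what the sample above shows: a top-level entries array whose items carry code and http_status. The helper names are hypothetical, and the inline sample_catalog stands in for the JSON a real client would fetch from GET /errors.

```python
from enum import Enum


def build_error_code_enum(catalog: dict) -> type:
    """Build a str-valued Enum with one member per catalog entry."""
    members = {entry["code"].upper(): entry["code"] for entry in catalog["entries"]}
    return Enum("ErrorCode", members, type=str)


def status_by_code(catalog: dict) -> dict:
    """Map each error code to its documented HTTP status."""
    return {entry["code"]: entry["http_status"] for entry in catalog["entries"]}


# Stand-in for the body of GET /errors, trimmed to the fields used here.
sample_catalog = {
    "entries": [
        {
            "code": "missing_authorization",
            "type": "authentication_error",
            "http_status": 401,
        }
    ]
}

ErrorCode = build_error_code_enum(sample_catalog)
```

Because the enum mixes in str, ErrorCode("missing_authorization") looks up a member directly from the code string found in an error response.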
Cross-link from JSON to docs
Each entry below has a stable HTML-anchor matching its code. The
machine-readable JSON at /errors can be cross-linked into
this page via /docs/errors#<code>. Example: a failed call returning
{"code":"image_base64_invalid"} jumps to
/docs/errors#image_base64_invalid. Anchors do not change once published.
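A client can turn any code into a documentation link with a one-line helper. This is a minimal sketch: docs_link is a hypothetical name, and it assumes the docs page is served from the same host as the API, which the text above does not state.

```python
from urllib.parse import quote

DOCS_BASE = "https://dgx.spass.fun"  # assumption: docs share the API host


def docs_link(code: str, base: str = DOCS_BASE) -> str:
    """Return the anchor URL for an error code on the docs page."""
    return f"{base}/docs/errors#{quote(code, safe='')}"
```

Including this link in log lines or user-facing error messages gives readers a direct path to the matching entry on this page.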
Authentication errors (401)
| Code | Cause | Fix |
|---|---|---|
| missing_authorization | No Authorization header | Add Authorization: Bearer <token> |
| invalid_authorization | Token mismatch | Verify RUST_API_BEARER matches server |
missing_authorization
Add Authorization: Bearer <RUST_API_BEARER> to the request.
invalid_authorization
The Bearer token did not match any token in the multi-tenant catalog. The
comparison is constant-time. The same code is also returned when an
X-Tenant-Id header is presented that the matched token is not bound to.
Rate-limit / Quota (429)
| Code | Cause | Fix |
|---|---|---|
| rate_limit_exceeded | Per-token bucket empty | Back off and retry; tune RATE_LIMIT_* server-side |
| model_quota_exhausted | Model temporarily out of quota / credits / daily budget | Pick a different model alias or wait for the budget window to reset |
| openrouter_daily_quota_exhausted | OpenRouter-specific daily limit hit on a tenant alias | Use a fallback alias (Claude/Gemini); response carries Retry-After until next 00:00 UTC |
rate_limit_exceeded
Rate limits are enforced per token (default 1 request/sec, burst of 30). The bucket is keyed on the SHA-256 of the bearer token, not on the client IP.
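To make the bucket behaviour concrete, here is a minimal token-bucket sketch using the documented defaults (1 token per second, burst of 30) and a SHA-256 key derived from the bearer. It is an illustration of the general technique, not the server's actual implementation; the class name and the injectable clock are assumptions made for the example.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class TokenBuckets:
    """Per-key token buckets: refill at `rate` tokens/sec, capped at `burst`."""

    def __init__(self, rate: float = 1.0, burst: int = 30,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self.clock = clock
        # key -> (tokens remaining, time of last update)
        self._state: Dict[str, Tuple[float, float]] = {}

    @staticmethod
    def key_for(bearer: str) -> str:
        """Key the bucket on a hash of the token, never on the raw token or IP."""
        return hashlib.sha256(bearer.encode("utf-8")).hexdigest()

    def allow(self, bearer: str) -> bool:
        """Consume one token if available; return False when the bucket is empty."""
        key = self.key_for(bearer)
        now = self.clock()
        tokens, last = self._state.get(key, (float(self.burst), now))
        tokens = min(float(self.burst), tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self._state[key] = (tokens, now)
            return False
        self._state[key] = (tokens - 1.0, now)
        return True
```

With these defaults a client can send 30 requests back to back, after which it is limited to roughly one request per second until the bucket refills.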
model_quota_exhausted
Cut 2.23d (2026-05-04). The chosen model is temporarily unavailable for this tenant because some upstream budget — token quota, daily credit limit, or rate window — has been used up. The body is intentionally provider-agnostic per ADR 0006 v2: no upstream-name, no token counts, no billing URLs. Identical caller-facing shape to other rate-limit responses.
Fix options for the caller:
- Pick a different model alias (e.g. swap openai-gpt-latest → anthropic-claude-opus-latest). The cross-vendor fallback chain (router_settings.fallbacks in litellm/config.yaml) tries this automatically before emitting model_quota_exhausted; you see this code only if the entire cascade is exhausted.
- Wait for the budget window. Daily-credit windows typically reset at the start of the next UTC day.
Operator action: raise the per-tenant budget in tokens.yaml if affordable, or top up with the upstream provider directly.
openrouter_daily_quota_exhausted
Cut 2.32 (2026-05-16, CR-0005). A tenant alias (e.g. godelmann-gocreate-premium-gpt-text-premium) has exhausted its OpenRouter daily limit. Unlike the generic model_quota_exhausted (Cut 2.23d), the provider is identified explicitly here because the alias is pinned directly to OpenRouter (there is no cross-vendor fallback in the stack path).
Response shape:
HTTP/2 429
Retry-After: 32400
content-type: application/json

{
  "error": {
    "type": "rate_limit_error",
    "code": "openrouter_daily_quota_exhausted",
    "message": "GPT-Tageslimit über OpenRouter erschöpft. Reset um 00:00 UTC. Verfügbare Fallback-Modelle: …"
  },
  "dgx_code": "openrouter_daily_quota_exhausted",
  "available_fallbacks": ["godelmann-gocreate-premium-claude-text-premium", "godelmann-gocreate-premium-gemini-text-premium"]
}
Caller pattern: match on dgx_code: "openrouter_daily_quota_exhausted", show the user a message such as "GPT daily limit reached, try Claude or Gemini", and use Retry-After to schedule an automatic retry after midnight UTC.
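The caller pattern can be sketched as a small pure function that inspects the status, headers, and body and returns what to do next. QuotaPlan and plan_for_daily_quota are hypothetical names; picking the first entry of available_fallbacks is an assumption, since the response does not say whether the list is ordered by preference.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class QuotaPlan:
    fallback_alias: Optional[str]       # alias to retry with immediately, if any
    retry_after_seconds: Optional[int]  # wait before retrying the original alias


def plan_for_daily_quota(status: int, headers: Mapping[str, str],
                         body: dict) -> Optional[QuotaPlan]:
    """Return a plan for openrouter_daily_quota_exhausted, else None."""
    if status != 429 or body.get("dgx_code") != "openrouter_daily_quota_exhausted":
        return None
    fallbacks = body.get("available_fallbacks") or []
    # Plain dicts are case-sensitive, so check both common spellings.
    retry_after = headers.get("Retry-After") or headers.get("retry-after")
    seconds = int(retry_after) if retry_after and retry_after.isdigit() else None
    return QuotaPlan(fallbacks[0] if fallbacks else None, seconds)
```

A UI layer could use fallback_alias to offer an immediate switch and retry_after_seconds to schedule the automatic retry.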
Tool-loop status codes (HTTP 200)
These codes appear in the body as a dgx_code field (NOT as an HTTP error) when the server-side tool loop had to take a special synthesis path. The HTTP status is 200: the caller receives a complete narrative answer, and dgx_code is only a diagnostic marker.
| Code | Meaning | finish_reason | content |
|---|---|---|---|
| tool_loop_max_iterations | Tool loop reached MAX_TOOL_ITERATIONS=10 | stop | Narrative synthesis from the tool results collected so far |
| tool_loop_anti_loop_synthesised | Identical tool-call signature detected twice in a row | stop | Narrative synthesis instead of continuing the loop |
In stream mode (stream: true) the server additionally emits a named SSE event, event: spass.tool-cap or event: spass.tool-anti-loop respectively, with the JSON payload {code, iterations, synth_called} before the final content and [DONE].
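For streaming clients, the named events can be picked out with a small SSE line parser. The sketch below follows the standard SSE framing (event: and data: fields, blank line as terminator); the function names are hypothetical, and the value types inside the payload (integer iterations, boolean synth_called) are assumptions, since only the field names are documented.

```python
import json
from typing import Iterable, Iterator, List, Tuple

TOOL_LOOP_EVENTS = {"spass.tool-cap", "spass.tool-anti-loop"}


def _field_value(line: str, name: str) -> str:
    """Return the text after 'name:', dropping one optional leading space."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs; unnamed events use the default 'message'."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the event collected so far.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith("event:"):
            event = _field_value(line, "event")
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))
    if data:
        yield event, "\n".join(data)


def tool_loop_markers(lines: Iterable[str]) -> List[dict]:
    """Collect the JSON payloads of the tool-loop diagnostic events."""
    return [json.loads(data) for event, data in iter_sse_events(lines)
            if event in TOOL_LOOP_EVENTS]
```

Clients that ignore named events still receive the normal content chunks, so handling these markers is optional and purely diagnostic.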
tool_loop_max_iterations
Cut 2.32 (2026-05-16, CR-0001). A multi-tool-heavy request reached the per-request tool-iteration cap of 10 without a final narrative stop. Instead of returning a raw tool_calls response (which historically left frontend UIs hanging because no content arrived), the server makes one extra synthesis LLM call with tools: [] and a prompt containing the tool results collected so far. The result is a guaranteed complete narrative answer.
tool_loop_anti_loop_synthesised
Cut 2.32 (2026-05-16, CR-0001). The server detected that the LLM executed the same tool call with an identical signature (name + args) twice in a row, which is typical Llama loop behaviour when a tool requirement is misunderstood. A synthesis LLM call with tools: [] breaks the loop and forces an answer from the tool outputs already available.
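The detection rule (same name and same arguments twice in a row) can be sketched as follows. This is an illustration, not the server's code: the class and function names are hypothetical, and normalising the arguments by key-sorted JSON is an assumption, since the text does not say whether the server compares raw strings or parsed values.

```python
import json
from typing import Optional


def call_signature(name: str, arguments: str) -> str:
    """Canonical signature: tool name plus key-sorted, compact JSON arguments."""
    try:
        canonical = json.dumps(json.loads(arguments), sort_keys=True,
                               separators=(",", ":"))
    except json.JSONDecodeError:
        canonical = arguments  # fall back to the raw string if it is not JSON
    return f"{name}:{canonical}"


class AntiLoopDetector:
    """Flags a tool call whose signature equals the immediately preceding one."""

    def __init__(self) -> None:
        self._previous: Optional[str] = None

    def is_repeat(self, name: str, arguments: str) -> bool:
        signature = call_signature(name, arguments)
        repeated = signature == self._previous
        self._previous = signature
        return repeated
```

Only consecutive duplicates are flagged here, so a model that alternates between two different calls would be stopped by the iteration cap instead.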
Invalid request (4xx)
| Code | Status | Cause |
|---|---|---|
| body_too_large | 413 | Body exceeds MAX_BODY_BYTES (default 32 MB) |
| invalid_json | 400 | Body unparseable / schema mismatch |
| missing_field | 400 | Required field absent |
| invalid_field | 400 | Field value bad type / range / enum |
| model_not_in_allowlist | 400 | model slug not whitelisted |
| max_tokens_below_minimum | 400 | Even after auto-floor, value still rejected upstream |
| image_url_not_supported | 400 | image_url.url is http(s)://; must be a base64 data URI |
| image_decode_error | 400 | Data URI malformed |
| image_base64_invalid | 400 | Inline base64 payload not parseable (S-series Item 2) |
| tool_call_arguments_invalid | 400 | tool_calls[].function.arguments is non-parseable JSON-string (Layer 1) |
| upstream_bad_request | 400 | Upstream rejected as 4xx; propagated verbatim instead of an opaque 502 |
body_too_large
Bodies cap at MAX_BODY_BYTES (default 32 MB). Most often hit with very
large base64 image payloads.
invalid_json
The body could not be parsed as JSON. Validate against /openapi.json.
missing_field
Required field absent. The param field of the error envelope shows
which one.
invalid_field
A field's value did not match expected type/range/enum. See param for
which field.
model_not_in_allowlist
The model slug isn't whitelisted by the stack. Use one of the slugs
from /v1/info or /v1/models.
max_tokens_below_minimum
Some upstream cloud paths reject low values (they require max_output_tokens >= 16),
and reasoning models need >= 200 to leave room for hidden reasoning tokens
before any visible content. The rust-api silently raises max_tokens
to the model's documented minimum and reports the adjustment via a response
header:
spass-applied: max_tokens_floored=200
Only when even the floor would still be invalid does the error surface.
Read the per-model constraints.min_max_tokens from /v1/info:
curl -s -H "Authorization: Bearer $BEARER" https://dgx.spass.fun/v1/info \
| jq '.models[] | {alias, min_max_tokens: .constraints.min_max_tokens}'
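A client that wants to log or react to the silent adjustment can parse the spass-applied header shown above. The sketch below handles the documented key=value form; treating multiple adjustments as comma-separated is an assumption, as the source shows only a single pair, and parse_applied_header is a hypothetical name.

```python
from typing import Dict


def parse_applied_header(value: str) -> Dict[str, str]:
    """Parse 'key=value' pairs; assumes several adjustments are comma-separated."""
    adjustments: Dict[str, str] = {}
    for part in value.split(","):
        key, sep, val = part.strip().partition("=")
        if key:
            adjustments[key] = val if sep else ""
    return adjustments
```

For example, a caller could compare int(adjustments["max_tokens_floored"]) with the max_tokens it sent and surface the difference in its own logs.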
image_url_not_supported
Cloud providers refuse to fetch arbitrary URLs server-side; local
inference doesn't either. The rust-api validates against
constraints.accepts_image_url before forwarding and rejects up-front
so you get a clear param pointer instead of an opaque 400 from the
upstream.
Encode your image as a base64 data URI:
B64=$(base64 -w 0 image.jpg)
curl -s -H "Authorization: Bearer $BEARER" \
-H "Content-Type: application/json" \
-d "{
\"model\": \"llama-4-scout\",
\"messages\": [{
\"role\": \"user\",
\"content\": [
{\"type\": \"text\", \"text\": \"Describe this\"},
{\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,$B64\"}}
]
}]
}" \
https://dgx.spass.fun/v1/chat/completions
image_decode_error
The base64-encoded data URI could not be parsed, the MIME type was missing,
or the decoded bytes were not a valid image. Verify format
data:image/<jpeg|png|webp|gif>;base64,<data>. Re-encode with
base64 -w 0 (no line wrapping).
image_base64_invalid
Pre-flight check: rust-api decodes every data:image/...;base64,<payload>
and rejects non-parseable base64 before forwarding upstream. Common
JS bug: passing a UTF-8 string through btoa corrupts non-ASCII bytes
— read as Uint8Array first. Output must be [A-Za-z0-9+/]+={0,2} only.
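The same encode-from-bytes rule applies in any language. As a minimal Python sketch, the helpers below build a data URI from raw bytes and run a client-side check against the documented format and alphabet. The function names are hypothetical, and the check does not verify that the decoded bytes are a real image, so image_decode_error can still occur server-side.

```python
import base64
import re

# Format and alphabet as documented: data:image/<jpeg|png|webp|gif>;base64,<data>
_DATA_URI = re.compile(r"data:image/(jpeg|png|webp|gif);base64,([A-Za-z0-9+/]+={0,2})")


def to_data_uri(image_bytes: bytes, mime_subtype: str = "jpeg") -> str:
    """Encode raw bytes (never a decoded text string) as a base64 data URI."""
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{mime_subtype};base64,{payload}"


def looks_like_valid_data_uri(uri: str) -> bool:
    """Cheap pre-flight check: correct prefix, alphabet, and padding."""
    match = _DATA_URI.fullmatch(uri)
    if not match:
        return False
    try:
        base64.b64decode(match.group(2), validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return True
```

Running the check before sending avoids a round trip for the most common mistakes, such as line-wrapped or truncated payloads.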
tool_call_arguments_invalid
OpenAI's tool-calling spec encodes function.arguments as a JSON-string
(e.g. "arguments":"{\"key\":\"value\"}"). rust-api parses each
argument-string with serde_json before forwarding upstream — non-parseable
JSON is caught here so the caller gets a clear diagnostic instead of an
opaque upstream Pydantic validation error. Cockpit-followup-6 (S.5)
defense-in-depth Layer 1.
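Callers that replay assistant messages containing tool calls can run an equivalent check before sending. The sketch below is a client-side approximation of the rule described above, written with Python's json module rather than serde_json; the function name is hypothetical, and the path format of the returned strings is an assumption modelled on the param example earlier on this page.

```python
import json
from typing import List


def invalid_tool_call_arguments(messages: List[dict]) -> List[str]:
    """Return paths of tool-call arguments that are not JSON-encoded strings."""
    problems: List[str] = []
    for m_idx, message in enumerate(messages):
        for t_idx, call in enumerate(message.get("tool_calls") or []):
            args = (call.get("function") or {}).get("arguments")
            path = f"messages[{m_idx}].tool_calls[{t_idx}].function.arguments"
            if not isinstance(args, str):
                # The spec expects a JSON string, not an already-parsed object.
                problems.append(path)
                continue
            try:
                json.loads(args)
            except json.JSONDecodeError:
                problems.append(path)
    return problems
```

A frequent cause is passing a dict directly instead of json.dumps(dict), which this check reports as a non-string argument.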
upstream_bad_request
Upstream returned a 4xx (often 400 BadRequest) — typically caused by malformed payloads pre-flight didn't catch (corrupt base64, schema-validation error, content moderation flag). rust-api propagates the upstream status-code 1:1 instead of opaque 502, so the caller can distinguish "client-side-fixable" from "upstream-outage".
Not found (404)
| Code | Cause | Fix |
|---|---|---|
| conversation_not_found | /c1: conversation doesn't exist or belongs to another user_id | Omit conversation_id to start fresh, or list with GET /c1/conversations |
| route_not_found | Path/method combination unknown | Check /openapi.json |
conversation_not_found
The supplied conversation_id was never persisted (or was deleted, or
belongs to a different user_id).
route_not_found
Path/method combination does not exist on this server.
Upstream (5xx)
| Code | Status | Cause |
|---|---|---|
| upstream_error | 502 | An upstream provider answered non-2xx; body inlined for debugging |
| upstream_timeout | 504 | Hit HTTP_TOTAL_TIMEOUT_SECS; most often gpt-image (100-180 s) |
| upstream_unavailable | 503 | TCP/TLS to gateway/local-inference failed; check /readyz and docker ps |
upstream_error
A non-recoverable upstream 5xx. The server includes the (sanitised)
upstream message inline in message. Common causes: model rejected
oversize prompt, content moderation flag, provider-side outage.
upstream_timeout
Hit HTTP_TOTAL_TIMEOUT_SECS (default 600 s). Most often a slow image-
generation model (gpt-image regularly 100-180 s). Increase your client
timeout; check constraints.typical_response_seconds per model in
/v1/info. For gpt-image, set client timeout ≥ 240 s.
upstream_unavailable
Could not establish a TCP/TLS connection to the routing gateway or local
inference backend. Check /readyz and docker ps / docker logs.
Internal (500)
| Code | Cause | Fix |
|---|---|---|
| internal_error | Server-side bug or panic | Retry; check server logs with x-request-id |
| storage_error | SQLite read/write failed | Server-side: chown -R 65532:65532 data/sqlite && docker restart dgx-rust-api |
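Only some of the codes above are worth retrying from the client. The sketch below is one possible policy, not an official recommendation: the RETRYABLE_CODES set is an assumption based on the Fix columns (rate limits and internal errors say "retry"; timeouts and connection failures are treated as transient), and the delay uses capped exponential backoff with full jitter unless a numeric Retry-After value is supplied.

```python
import random
from typing import Optional

# Assumption: which codes are retryable is a client-side judgment call.
RETRYABLE_CODES = {
    "rate_limit_exceeded",
    "upstream_timeout",
    "upstream_unavailable",
    "internal_error",
}


def retry_delay(code: str, attempt: int, retry_after: Optional[str] = None,
                base: float = 1.0, cap: float = 60.0) -> Optional[float]:
    """Seconds to wait before retry number `attempt` (0-based), or None to give up."""
    if code not in RETRYABLE_CODES:
        return None
    if retry_after and retry_after.isdigit():
        return float(retry_after)  # the server-provided hint wins
    # Full jitter: a uniform delay up to the capped exponential bound.
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

Request errors such as invalid_json or missing_field return None here, because resending the same payload cannot succeed.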
Recommended client pattern
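import logging
import os
import time

import httpx

log = logging.getLogger(__name__)
BEARER = os.environ["BEARER"]  # same token variable as in the curl examples


class GatewayError(Exception):
    """Raised for any error envelope the client does not handle itself."""

    def __init__(self, code, message, param=None):
        super().__init__(f"{code}: {message}")
        self.code, self.message, self.param = code, message, param


def call_gateway(payload: dict, max_retries: int = 3) -> dict:
    r = httpx.post(
        "https://dgx.spass.fun/v1/chat/completions",
        headers={"Authorization": f"Bearer {BEARER}"},
        json=payload,
        timeout=240,  # cover gpt-image worst case
    )
    if r.is_error:
        body = r.json().get("error", {})
        code = body.get("code", "unknown")
        if code == "rate_limit_exceeded" and max_retries > 0:
            time.sleep(2)  # bounded retry instead of unbounded recursion
            return call_gateway(payload, max_retries - 1)
        if code == "image_url_not_supported":
            # rewrite image_url to base64 and retry
            ...
        raise GatewayError(code, body.get("message"), body.get("param"))
    # honour silent adjustments (header name as documented under max_tokens_below_minimum)
    if applied := r.headers.get("spass-applied"):
        log.info("server floored: %s", applied)
    return r.json()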