# Error catalog

All error responses follow a stable envelope. The same shape is used by both `/v1/*` and `/c1/*` endpoints:
```json
{
  "error": {
    "type": "<category>",
    "code": "<stable-code>",
    "message": "<human-readable>",
    "param": "<json-pointer>"
  }
}
```
- `type` — one of seven top-level categories that mirror HTTP status semantics: `invalid_request_error`, `authentication_error`, `permission_denied`, `not_found`, `rate_limit_exceeded`, `upstream_error`, `internal_error`.
- `code` — stable machine-readable identifier. Once published, codes are never repurposed; adding new codes is non-breaking.
- `message` — human-readable English.
- `param` (optional) — JSON Pointer to the offending field, e.g. `messages[0].content[1].image_url.url`.
## Machine-readable catalog

Pull the full catalog as JSON from `GET /errors` (no auth):

```bash
curl -s https://dgx-spark-4236.spass.fun/errors | jq '.entries[0]'
```

```json
{
  "code": "missing_authorization",
  "type": "authentication_error",
  "http_status": 401,
  "title": "Authorization header is missing",
  "description": "All `/v1/*` and `/c1/*` endpoints require a Bearer token.",
  "remediation": "Add `Authorization: Bearer <RUST_API_BEARER>` to the request.",
  "typical_param": "headers.authorization"
}
```
Use this to generate stable error-handling code in your client without hand-typing constants.
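For example, a small generator can emit Python constants from the catalog. A sketch, assuming entries have the shape shown above; fetching the JSON is left to the caller:

```python
# Turn GET /errors entries into a constants module so client code never
# hand-types error codes.
def generate_constants(entries: list[dict]) -> str:
    lines = ["# Auto-generated from GET /errors -- do not edit by hand."]
    for e in entries:
        lines.append(
            f'{e["code"].upper()} = "{e["code"]}"  # {e["http_status"]} {e["type"]}'
        )
    return "\n".join(lines)
```

Because published codes are never repurposed, regenerating this file is always backward-compatible.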
## Authentication errors (401)

| Code | Cause | Fix |
|---|---|---|
| `missing_authorization` | No `Authorization` header | Add `Authorization: Bearer <token>` |
| `invalid_authorization` | Token mismatch | Verify `RUST_API_BEARER` matches the server |
## Rate limit (429)

| Code | Cause | Fix |
|---|---|---|
| `rate_limit_exceeded` | Per-token bucket empty | Back off and retry; tune `RATE_LIMIT_*` server-side |
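A common client-side response is full-jitter exponential backoff. A sketch; the base and cap values are illustrative assumptions, not values the server mandates:

```python
import random

def backoff_delay(attempt: int, base: float = 0.5, cap: float = 30.0) -> float:
    """Full-jitter delay in seconds for a 0-based retry attempt."""
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

Jitter spreads retries out so a burst of rate-limited clients doesn't all hammer the bucket again at the same instant.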
## Invalid request (4xx)

| Code | Status | Cause |
|---|---|---|
| `body_too_large` | 413 | Body exceeds `MAX_BODY_BYTES` (default 32 MB) |
| `invalid_json` | 400 | Body unparseable or schema mismatch |
| `missing_field` | 400 | Required field absent |
| `invalid_field` | 400 | Field value has a bad type, range, or enum |
| `model_not_in_allowlist` | 400 | `model` slug not on the allowlist |
| `max_tokens_below_minimum` | 400 | Even after auto-flooring, the value is still rejected upstream |
| `image_url_not_supported` | 400 | `image_url.url` is `http(s)://` — must be a base64 data URI |
| `image_decode_error` | 400 | Data URI malformed |
### `max_tokens_below_minimum` — what really happens

Some upstreams reject low values: OpenAI requires `max_output_tokens >= 16`, and reasoning models need `>= 200` to leave room for hidden reasoning tokens before any visible content. The rust-api silently floors `max_tokens` to the model's documented minimum and reports the adjustment via a response header:

```
x-rust-api-applied: max_tokens_floored=200
```

Only when even the floor would still be invalid does the error surface.
Read the per-model `constraints.min_max_tokens` from `/v1/info`:

```bash
curl -s -H "Authorization: Bearer $BEARER" https://dgx-spark-4236.spass.fun/v1/info \
  | jq '.models[] | {alias, min_max_tokens: .constraints.min_max_tokens}'
```
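With that data, a client can clamp `max_tokens` itself instead of relying on the silent floor. A sketch assuming the `/v1/info` shape implied by the jq query above; the model alias and minimum in the test are hypothetical:

```python
def clamp_max_tokens(payload: dict, info: dict) -> dict:
    """Raise payload["max_tokens"] to the model's min_max_tokens if needed."""
    minima = {
        m["alias"]: m["constraints"]["min_max_tokens"] for m in info["models"]
    }
    # Unknown models fall back to a floor of 1 (no clamping).
    floor = minima.get(payload.get("model"), 1)
    if payload.get("max_tokens", floor) < floor:
        payload = {**payload, "max_tokens": floor}
    return payload
```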
### `image_url_not_supported` — what really happens

Cloud providers (Anthropic / Google / OpenAI / xAI / OpenRouter) refuse to fetch arbitrary URLs server-side, and neither will the local Llama-4-Scout vLLM. The rust-api validates against `constraints.accepts_image_url` before forwarding and rejects up front, so you get a clear `param` pointer instead of an opaque 400 from the upstream.
Encode your image as a base64 data URI:

```bash
B64=$(base64 -w 0 image.jpg)
curl -s -H "Authorization: Bearer $BEARER" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"llama-4-scout\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"Describe this\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,$B64\"}}
      ]
    }]
  }" \
  https://dgx-spark-4236.spass.fun/v1/chat/completions
```
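The same encoding in Python — a small sketch (the helper name is ours, not a gateway API):

```python
import base64

def to_data_uri(data: bytes, mime: str = "image/jpeg") -> str:
    """Build a base64 data URI suitable for an image_url.url field."""
    return f"data:{mime};base64,{base64.b64encode(data).decode('ascii')}"
```

Drop the result straight into the `image_url` content part, e.g. `{"type": "image_url", "image_url": {"url": to_data_uri(raw_bytes)}}`.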
## Not found (404)

| Code | Cause | Fix |
|---|---|---|
| `conversation_not_found` | `/c1` — conversation doesn't exist or belongs to another `user_id` | Omit `conversation_id` to start fresh, or list with `GET /c1/conversations` |
| `route_not_found` | Path/method combination unknown | Check `/openapi.json` |
## Upstream (5xx)

| Code | Status | Cause |
|---|---|---|
| `upstream_error` | 502 | LiteLLM/vLLM/cloud answered non-2xx — body inlined for debugging |
| `upstream_timeout` | 504 | Hit `HTTP_TOTAL_TIMEOUT_SECS` — most often gpt-image (100-180 s) |
| `upstream_unavailable` | 503 | TCP/TLS to LiteLLM or vLLM failed — check `/readyz` and `docker ps` |
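A client-side triage rule, as an illustrative assumption rather than gateway policy: retry only the 502/503 failures that signal transient transport problems, and treat a 504 as a genuine long-running request not worth repeating blindly:

```python
# Codes taken from the table above; the retry policy is our assumption.
TRANSIENT_CODES = {"upstream_error", "upstream_unavailable"}

def should_retry_upstream(http_status: int, code: str) -> bool:
    """True for transient 502/503 failures; 504 timeouts are not retried."""
    return http_status in (502, 503) and code in TRANSIENT_CODES
```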
## Internal (500)

| Code | Cause | Fix |
|---|---|---|
| `internal_error` | Server-side bug or panic | Retry; check server logs with `x-request-id` |
| `storage_error` | SQLite read/write failed | Server-side: `chown -R 65532:65532 data/sqlite && docker restart dgx-rust-api` |
## Recommended client pattern

```python
import logging
import time

import httpx

log = logging.getLogger(__name__)


class GatewayError(Exception):
    """Non-retryable gateway error carrying the stable code and param."""

    def __init__(self, code: str, message: str | None, param: str | None):
        super().__init__(f"{code}: {message} (param={param})")
        self.code, self.param = code, param


def call_gateway(payload: dict) -> dict:
    # BEARER is assumed to be defined elsewhere in your client.
    r = httpx.post(
        "https://dgx-spark-4236.spass.fun/v1/chat/completions",
        headers={"Authorization": f"Bearer {BEARER}"},
        json=payload,
        timeout=240,  # cover the gpt-image worst case
    )
    if r.is_error:
        body = r.json().get("error", {})
        code = body.get("code", "unknown")
        if code == "rate_limit_exceeded":
            time.sleep(2)
            return call_gateway(payload)
        if code == "image_url_not_supported":
            # rewrite image_url to base64 and retry
            ...
        raise GatewayError(code, body.get("message"), body.get("param"))
    # honour silent adjustments
    if applied := r.headers.get("x-rust-api-applied"):
        log.info("server floored: %s", applied)
    return r.json()
```