DGX LLM Chat Gateway

Error catalog

All error responses follow a stable envelope. The same shape is used by both /v1/* and /c1/* endpoints.

{
  "error": {
    "type":    "<category>",
    "code":    "<stable-code>",
    "message": "<human-readable>",
    "param":   "<json-pointer>"
  }
}

Machine-readable catalog

Pull the full catalog as JSON from GET /errors (no auth):

curl -s https://dgx-spark-4236.spass.fun/errors | jq '.entries[0]'
{
  "code": "missing_authorization",
  "type": "authentication_error",
  "http_status": 401,
  "title": "Authorization header is missing",
  "description": "All `/v1/*` and `/c1/*` endpoints require a Bearer token.",
  "remediation": "Add `Authorization: Bearer <RUST_API_BEARER>` to the request.",
  "typical_param": "headers.authorization"
}

Use this to generate stable error-handling code in your client without hand-typing constants.
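
As a sketch of that idea, the snippet below turns a parsed `/errors` catalog into a lookup table keyed by the stable code. The `build_error_table` helper is hypothetical (not part of the gateway); the entry shape matches the example above, and fetching is left to your HTTP client.

```python
def build_error_table(catalog: dict) -> dict:
    """Map each stable error code to (http_status, remediation)."""
    return {
        e["code"]: (e["http_status"], e["remediation"])
        for e in catalog.get("entries", [])
    }


# Example using the entry shown above:
catalog = {
    "entries": [
        {
            "code": "missing_authorization",
            "type": "authentication_error",
            "http_status": 401,
            "title": "Authorization header is missing",
            "remediation": "Add `Authorization: Bearer <RUST_API_BEARER>` to the request.",
        }
    ]
}
table = build_error_table(catalog)
# table["missing_authorization"][0] == 401
```

Regenerate the table whenever the catalog changes instead of hand-typing constants.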

Authentication errors (401)

| Code | Cause | Fix |
|---|---|---|
| `missing_authorization` | No `Authorization` header | Add `Authorization: Bearer <token>` |
| `invalid_authorization` | Token mismatch | Verify `RUST_API_BEARER` matches the server |

Rate-limit (429)

| Code | Cause | Fix |
|---|---|---|
| `rate_limit_exceeded` | Per-token bucket empty | Back off and retry; tune `RATE_LIMIT_*` server-side |
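
"Back off and retry" is usually implemented as exponential backoff with jitter. A minimal sketch (the function name and defaults are illustrative, not part of the gateway; tune the cap to your `RATE_LIMIT_*` settings):

```python
import random


def backoff_delays(attempts: int = 5, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff with full jitter: attempt i sleeps a random
    duration in [0, min(cap, base * 2**i)] before retrying a 429."""
    return [random.uniform(0, min(cap, base * 2 ** i)) for i in range(attempts)]
```

Full jitter spreads retries out so many clients hitting the same empty bucket don't retry in lockstep.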

Invalid request (4xx)

| Code | Status | Cause |
|---|---|---|
| `body_too_large` | 413 | Body exceeds `MAX_BODY_BYTES` (default 32 MB) |
| `invalid_json` | 400 | Body unparseable / schema mismatch |
| `missing_field` | 400 | Required field absent |
| `invalid_field` | 400 | Field value has a bad type / range / enum |
| `model_not_in_allowlist` | 400 | `model` slug not whitelisted |
| `max_tokens_below_minimum` | 400 | Even after auto-floor, value still rejected upstream |
| `image_url_not_supported` | 400 | `image_url.url` is `http(s)://`; must be a base64 data URI |
| `image_decode_error` | 400 | Data URI malformed |

max_tokens_below_minimum — what really happens

Some upstreams reject low values: OpenAI requires `max_output_tokens >= 16`, and reasoning models need `>= 200` to leave room for hidden reasoning tokens before any visible content. The rust-api silently floors `max_tokens` to the model's documented minimum and reports the adjustment via a response header:

x-rust-api-applied: max_tokens_floored=200

Only when even the floor would still be invalid does the error surface. Read the per-model `constraints.min_max_tokens` from `/v1/info`:

curl -s -H "Authorization: Bearer $BEARER" https://dgx-spark-4236.spass.fun/v1/info \
  | jq '.models[] | {alias, min_max_tokens: .constraints.min_max_tokens}'
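
You can also apply the same floor client-side before sending, so the server never has to adjust. A hypothetical helper (the `clamp_max_tokens` name is ours; the field names follow the jq query above):

```python
def clamp_max_tokens(payload: dict, info: dict) -> dict:
    """Raise payload["max_tokens"] to the model's documented minimum,
    mirroring the server's auto-floor. `info` is the parsed /v1/info body."""
    minimums = {
        m["alias"]: m["constraints"]["min_max_tokens"] for m in info["models"]
    }
    floor = minimums.get(payload.get("model"))
    # A missing max_tokens is treated as 0 and raised to the floor.
    if floor is not None and payload.get("max_tokens", 0) < floor:
        payload = {**payload, "max_tokens": floor}
    return payload
```

Returning a new dict rather than mutating the argument keeps the original payload reusable across retries.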

image_url_not_supported — what really happens

Cloud providers (Anthropic / Google / OpenAI / xAI / OpenRouter) refuse to fetch arbitrary URLs server-side, and the local Llama-4-Scout vLLM doesn't fetch them either. The rust-api validates against `constraints.accepts_image_url` before forwarding and rejects up-front, so you get a clear `param` pointer instead of an opaque 400 from the upstream.

Encode your image as a base64 data URI:

B64=$(base64 -w 0 image.jpg)
curl -s -H "Authorization: Bearer $BEARER" \
     -H "Content-Type: application/json" \
     -d "{
       \"model\": \"llama-4-scout\",
       \"messages\": [{
         \"role\": \"user\",
         \"content\": [
           {\"type\": \"text\", \"text\": \"Describe this\"},
           {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/jpeg;base64,$B64\"}}
         ]
       }]
     }" \
     https://dgx-spark-4236.spass.fun/v1/chat/completions
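
The same encoding in Python, as a sketch: a hypothetical `image_part` helper that builds an OpenAI-style `image_url` content part from a local file (the MIME sniffing here only distinguishes PNG from JPEG; extend it for other formats you send).

```python
import base64
import pathlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def image_part(path: str) -> dict:
    """Build an image_url content part with a base64 data URI,
    the only image form the gateway accepts."""
    data = pathlib.Path(path).read_bytes()
    mime = "image/png" if data.startswith(PNG_MAGIC) else "image/jpeg"
    encoded = base64.b64encode(data).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime};base64,{encoded}"},
    }
```

Drop the returned dict into the `content` array alongside a `{"type": "text", ...}` part, as in the curl example above.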

Not found (404)

| Code | Cause | Fix |
|---|---|---|
| `conversation_not_found` | `/c1` only: conversation doesn't exist or belongs to another `user_id` | Omit `conversation_id` to start fresh, or list with `GET /c1/conversations` |
| `route_not_found` | Path/method combination unknown | Check `/openapi.json` |

Upstream (5xx)

| Code | Status | Cause |
|---|---|---|
| `upstream_error` | 502 | LiteLLM/vLLM/cloud answered non-2xx; body inlined for debugging |
| `upstream_timeout` | 504 | Hit `HTTP_TOTAL_TIMEOUT_SECS`; most often gpt-image (100-180 s) |
| `upstream_unavailable` | 503 | TCP/TLS to LiteLLM or vLLM failed; check `/readyz` and `docker ps` |

Internal (500)

| Code | Cause | Fix |
|---|---|---|
| `internal_error` | Server-side bug or panic | Retry; check server logs with `x-request-id` |
| `storage_error` | SQLite read/write failed | Server-side: `chown -R 65532:65532 data/sqlite && docker restart dgx-rust-api` |

Recommended client pattern

import logging
import time

import httpx

log = logging.getLogger(__name__)


class GatewayError(Exception):
    """Carries the stable code, message, and param pointer from the envelope."""

    def __init__(self, code, message, param):
        super().__init__(f"{code}: {message} (param={param})")
        self.code, self.param = code, param


def call_gateway(payload: dict, retries: int = 3) -> dict:
    r = httpx.post(
        "https://dgx-spark-4236.spass.fun/v1/chat/completions",
        headers={"Authorization": f"Bearer {BEARER}"},
        json=payload,
        timeout=240,  # cover the gpt-image worst case
    )
    if r.is_error:
        body = r.json().get("error", {})
        code = body.get("code", "unknown")
        if code == "rate_limit_exceeded" and retries > 0:
            time.sleep(2)
            return call_gateway(payload, retries - 1)
        if code == "image_url_not_supported":
            # rewrite image_url to base64 and retry
            ...
        raise GatewayError(code, body.get("message"), body.get("param"))
    # honour silent adjustments
    if applied := r.headers.get("x-rust-api-applied"):
        log.info("server floored: %s", applied)
    return r.json()