DGX LLM Chat Gateway

Authentication

All authenticated endpoints accept a single Bearer token, configured server-side as RUST_API_BEARER in .env. The token is opaque (by convention 32 random bytes, hex-encoded as 64 characters) and is compared in constant time to defeat timing attacks.

Authorization: Bearer <token>

Send it on every request to:

These endpoints are public (no Bearer required):

Errors

Code                    Status  When
missing_authorization   401     No Authorization header
invalid_authorization   401     Bearer doesn't match server config
rate_limit_exceeded     429     Too many requests for this token (see below)

# Without auth — gets 401:
curl -i https://dgx-spark-4236.spass.fun/v1/info

# With auth — 200 OK:
curl -s -H "Authorization: Bearer $BEARER" https://dgx-spark-4236.spass.fun/v1/info \
  | jq '.models[0].alias'
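
The constant-time comparison mentioned at the top of this section can be illustrated with a short sketch. The gateway itself is a Rust service, so this Python version is not its actual code; it only shows the idea using the standard library's hmac.compare_digest. The function name check_bearer is hypothetical, and mapping a non-Bearer scheme to invalid_authorization is an assumption.

```python
import hmac


def check_bearer(authorization_header, expected_token):
    """Return an error code from the Errors table, or None if the header is valid."""
    if authorization_header is None:
        return "missing_authorization"
    scheme, _, presented = authorization_header.partition(" ")
    # Assumption: any scheme other than "Bearer" is treated as a mismatch.
    if scheme != "Bearer":
        return "invalid_authorization"
    # compare_digest's running time does not depend on where the first
    # differing byte sits, unlike a plain == on strings.
    if hmac.compare_digest(presented.encode(), expected_token.encode()):
        return None
    return "invalid_authorization"
```

A plain string comparison can return as soon as it finds a mismatch, which in principle lets an attacker recover the token one character at a time by measuring response times.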

Rate limits

Per-token rate limits are enforced via tower_governor:

When you exceed the limit you get HTTP 429 with the standard error envelope:

{
  "error": {
    "type": "rate_limit_exceeded",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded"
  }
}
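
A client can react to this envelope by backing off and retrying. The following is a minimal sketch, not part of the gateway: send is a hypothetical callable returning (status, headers, body_text), and the use of a Retry-After header given in seconds is an assumption, since this document does not say whether the gateway sends one. Without that header the sketch falls back to exponential backoff.

```python
import json
import time


def call_with_retry(send, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a non-429 response or attempts run out.

    send is a hypothetical callable returning (status, headers, body_text).
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        # Only retry when the envelope carries the rate-limit code shown above.
        error = json.loads(body).get("error", {})
        if error.get("code") != "rate_limit_exceeded":
            return status, body
        if attempt == max_attempts - 1:
            break
        # Assumption: Retry-After, if present, is a number of seconds.
        retry_after = headers.get("Retry-After")
        delay = float(retry_after) if retry_after else base_delay * 2 ** attempt
        sleep(delay)
    return status, body
```

Passing sleep as a parameter keeps the function easy to test without real delays.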

Conversation isolation (/c1 only)

The /c1/chat endpoint accepts an optional user_id field. When set, conversations are scoped to that user: a request made with one user_id cannot read or modify a conversation_id created under another. When omitted, conversations are shared among all clients holding the bearer token.

POST /c1/chat
{
  "model": "llama-4-scout",
  "user_id": "alice",
  "conversation_id": "<id-owned-by-alice>",
  "message": "...",
  "tools": [],
  "tool_choice": "auto",
  "response_format": {"type": "text"},
  "stream": false
}

Requesting a conversation_id that exists but belongs to a different user_id returns HTTP 404 with code: conversation_not_found (rather than 403, to avoid leaking the existence of the resource).
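
The 404-instead-of-403 behaviour can be sketched with a small in-memory model. This is an illustration only, not the gateway's storage layer: ConversationStore and its methods are hypothetical names, and treating requests without a user_id as their own separate scope is an assumption.

```python
class ConversationStore:
    """In-memory stand-in for the gateway's conversation storage (hypothetical)."""

    def __init__(self):
        self._owners = {}  # conversation_id -> user_id (None means unscoped)

    def create(self, conversation_id, user_id=None):
        self._owners[conversation_id] = user_id

    def lookup(self, conversation_id, user_id=None):
        """Return (http_status, code); code is None on success."""
        missing = object()
        owner = self._owners.get(conversation_id, missing)
        # An unknown ID and a wrong owner produce the same response, so a
        # caller cannot tell "does not exist" from "belongs to someone else".
        if owner is missing or owner != user_id:
            return 404, "conversation_not_found"
        return 200, None
```

Because both failure cases collapse into one response, probing with guessed conversation_ids reveals nothing about which IDs are in use by other users.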

Token rotation

Rotating the bearer is server-side only:

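# Generate a new token: 32 random bytes, hex-encoded
NEW=$(openssl rand -hex 32)
sed -i "s/^RUST_API_BEARER=.*/RUST_API_BEARER=$NEW/" /home/dietmar/dgx-llm/.env
# Restart so the gateway reads the new value. If Compose injects .env as container
# environment, `restart` will not re-apply it; use `docker compose up -d rust-api`.
docker compose restart rust-api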

There's no on-the-fly token refresh — clients must be updated externally with the new value. For production deployments where multiple clients share a key, consider scripting this into your secrets pipeline.
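
For a secrets pipeline, the token generation and .env rewrite can also be expressed as a pure function that is easy to test before it touches a real file. This is a sketch under assumptions: rotate_bearer is a hypothetical helper, and it expects exactly one RUST_API_BEARER line, raising an error otherwise (the sed command above would instead silently do nothing if the line were absent).

```python
import re
import secrets


def rotate_bearer(env_text, new_token=None):
    """Return (updated_env_text, token) with the RUST_API_BEARER line replaced."""
    token = new_token or secrets.token_hex(32)  # 32 random bytes, 64 hex chars
    updated, count = re.subn(
        r"^RUST_API_BEARER=.*$",
        lambda _match: "RUST_API_BEARER=" + token,  # callable avoids escape handling
        env_text,
        flags=re.MULTILINE,
    )
    if count != 1:
        raise ValueError(f"expected exactly one RUST_API_BEARER line, found {count}")
    return updated, token
```

The caller is still responsible for writing the result back, restarting the service, and distributing the returned token to clients.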

Body size limit

Bodies larger than MAX_BODY_BYTES (default 32 MB) get HTTP 413 with code: body_too_large. This is most often triggered by very large base64-encoded image inputs; to fix it, resize the image or split the conversation.
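
A client can estimate the body size before sending to avoid a round trip that ends in 413. The sketch below is not part of the gateway; the helper names are hypothetical, and reading "32 MB" as 32 MiB (32 * 1024 * 1024 bytes) is an assumption. The size check is approximate, because an HTTP client may serialize JSON slightly differently from json.dumps.

```python
import json

MAX_BODY_BYTES = 32 * 1024 * 1024  # assumption: "32 MB" means 32 MiB


def base64_size(raw_bytes):
    """Length of the padded base64 encoding of raw_bytes bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)


def body_fits(payload, limit=MAX_BODY_BYTES):
    """True if the JSON-encoded payload is within the limit."""
    return len(json.dumps(payload).encode("utf-8")) <= limit
```

Base64 inflates data by a factor of 4/3, so under this assumption a 24 MiB image already encodes to 32 MiB and leaves no room for the rest of the request.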