Authentication
All authenticated endpoints accept a single Bearer token, configured server-side
as RUST_API_BEARER in .env. The token is opaque (by convention 32 random bytes,
hex-encoded to 64 characters) and is compared in constant time to defeat timing attacks.
Authorization: Bearer <token>
Send it on every request to:
- /v1/chat/completions
- /v1/models
- /v1/info
- /c1/chat
- /c1/conversations*
These endpoints are public (no Bearer required):
- /
- /healthz
- /readyz
- /openapi, /openapi.json, /redoc
- /docs/*
- /errors
- /playground
Errors
| Code | Status | When |
|---|---|---|
| missing_authorization | 401 | No Authorization header |
| invalid_authorization | 401 | Bearer token doesn't match server config |
| rate_limit_exceeded | 429 | Too many requests for this token (see below) |
# Without auth — gets 401:
curl -i https://dgx-spark-4236.spass.fun/v1/info
# With auth — 200 OK:
curl -s -H "Authorization: Bearer $BEARER" https://dgx-spark-4236.spass.fun/v1/info \
| jq '.models[0].alias'
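The two 401 cases in the table can be illustrated with a minimal Python sketch. This is not the server's actual code (the service itself appears to be written in Rust); the function name check_authorization is hypothetical, and treating a malformed or non-Bearer header as invalid_authorization is an assumption. The constant-time comparison uses hmac.compare_digest from the Python standard library.

```python
import hmac


def check_authorization(header_value, expected_token):
    """Return (status, error_code); error_code is None on success.

    header_value is the raw Authorization header, or None if absent.
    """
    if header_value is None:
        return 401, "missing_authorization"

    # Split "Bearer <token>" into scheme and credential.
    scheme, _, presented = header_value.partition(" ")
    if scheme != "Bearer" or not presented:
        # Assumption: malformed headers are reported as invalid_authorization.
        return 401, "invalid_authorization"

    # compare_digest does not short-circuit at the first differing byte,
    # so response time does not reveal how much of the token was correct.
    if not hmac.compare_digest(presented.encode(), expected_token.encode()):
        return 401, "invalid_authorization"

    return 200, None
```

A plain == comparison would return the same results but may exit at the first mismatching character, which is the timing side channel the constant-time comparison is meant to close.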
Rate limits
Per-token rate limits are enforced via tower_governor:
- Default: 1 request/second, burst of 30.
- Bucket key: SHA-256 hash of the bearer token — not the IP. Distinct tokens get distinct buckets even from the same IP; the same token shares one bucket across all IPs.
- Tunable server-side: RATE_LIMIT_PER_SECOND and RATE_LIMIT_BURST env vars.
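To make the bucket semantics above concrete, here is a simplified token-bucket sketch in Python. It is an illustration only: tower_governor builds on the governor crate, which is documented as using GCRA rather than a literal token bucket, so exact timing may differ. The names TokenBucket, bucket_key, and allow_request are hypothetical, and the reading of "burst of 30" as "up to 30 back-to-back requests, then one per second" is an assumption.

```python
import hashlib


class TokenBucket:
    """Simplified bucket: refills per_second tokens/s up to burst."""

    def __init__(self, per_second=1.0, burst=30):
        self.per_second = per_second
        self.burst = burst
        self.tokens = float(burst)  # start full
        self.last = None            # timestamp of the previous request

    def allow(self, now):
        # Refill according to the time elapsed since the last request.
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.burst, self.tokens + elapsed * self.per_second)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # the server would answer HTTP 429 here


buckets = {}


def bucket_key(token):
    # Key on a SHA-256 hash of the bearer token, never on the client IP.
    return hashlib.sha256(token.encode()).hexdigest()


def allow_request(token, now):
    bucket = buckets.setdefault(bucket_key(token), TokenBucket())
    return bucket.allow(now)
```

Because the key is derived from the token alone, two clients sharing one token drain the same bucket regardless of where they connect from, while a second token starts with its own full bucket.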
When you exceed the limit you get HTTP 429 with the standard error envelope:
{
"error": {
"type": "rate_limit_exceeded",
"code": "rate_limit_exceeded",
"message": "Rate limit exceeded"
}
}
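On the client side, the envelope above can be parsed to decide whether a request is worth retrying. The following Python sketch assumes only the envelope shape shown here; the helper names and the exponential backoff schedule are illustrative choices, not behavior prescribed by the API.

```python
import json


def error_code(status, body_text):
    """Return error.code from the envelope, or None if there is none."""
    if status < 400:
        return None
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return None
    return error.get("code")


def should_retry(status, body_text):
    # Only rate-limit rejections are retried; auth errors will not fix themselves.
    return status == 429 and error_code(status, body_text) == "rate_limit_exceeded"


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Hypothetical schedule: base, 2*base, 4*base, ... capped at cap seconds."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]
```

With the default of 1 request/second, a base delay of one second is roughly the time the server needs to admit one more request for the same token.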
Conversation isolation (/c1 only)
The /c1/chat endpoint accepts an optional user_id field. When set,
conversations are scoped to that user — different user_ids cannot see or
mutate each other's conversation_ids. When omitted, conversations are
global to anyone holding the bearer token.
POST /c1/chat
{
"model": "llama-4-scout",
"user_id": "alice",
"conversation_id": "<id-owned-by-alice>",
"message": "...",
"tools": [],
"tool_choice": "auto",
"response_format": {"type": "text"},
"stream": false
}
Requesting a conversation_id that exists but belongs to a different
user_id returns HTTP 404 with code: conversation_not_found (rather than
403, to avoid leaking the existence of the resource).
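The 404-instead-of-403 rule can be sketched with a small in-memory store in Python. ConversationStore is a hypothetical stand-in for whatever persistence the server uses. The sketch also assumes that a request without user_id cannot reach a user-scoped conversation (and vice versa), which the text above does not state explicitly.

```python
class ConversationStore:
    """Toy ownership map: conversation_id -> user_id (None means global)."""

    def __init__(self):
        self._owners = {}

    def create(self, conversation_id, user_id=None):
        self._owners[conversation_id] = user_id

    def lookup(self, conversation_id, user_id=None):
        """Return (status, error_code); error_code is None on success."""
        if conversation_id not in self._owners:
            return 404, "conversation_not_found"
        if self._owners[conversation_id] != user_id:
            # Same answer as "does not exist", so a caller cannot probe
            # for conversation IDs that belong to other users.
            return 404, "conversation_not_found"
        return 200, None
```

Returning an identical status and code for both branches is the point of the design: a 403 would confirm that the ID exists.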
Token rotation
Rotating the bearer is server-side only:
NEW=$(openssl rand -hex 32)
sed -i "s/^RUST_API_BEARER=.*/RUST_API_BEARER=$NEW/" /home/dietmar/dgx-llm/.env
docker compose restart rust-api
There is no on-the-fly token refresh: because the server accepts a single token, clients must be updated out of band with the new value once the service restarts. For production deployments where multiple clients share a key, consider scripting this into your secrets pipeline.
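For a secrets pipeline, the openssl and sed steps can be mirrored in Python. This is a sketch under assumptions: new_token and rotate_env are hypothetical helper names, the function works on the file contents as a string so the caller decides where .env lives, and the rust-api container still has to be restarted afterwards.

```python
import re
import secrets


def new_token():
    # 32 random bytes rendered as 64 hex characters,
    # the same shape as `openssl rand -hex 32`.
    return secrets.token_hex(32)


def rotate_env(env_text, token):
    """Return env_text with the RUST_API_BEARER line replaced."""
    pattern = re.compile(r"^RUST_API_BEARER=.*$", re.MULTILINE)
    # A function replacement avoids any backslash interpretation in the token.
    updated, count = pattern.subn(lambda m: "RUST_API_BEARER=" + token, env_text)
    if count == 0:
        raise ValueError("RUST_API_BEARER not found in env text")
    return updated
```

Unlike the sed one-liner, this version fails loudly when the variable is missing, which helps catch a pipeline pointed at the wrong file.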
Body size limit
Bodies larger than MAX_BODY_BYTES (default 32 MB) get HTTP 413 with
code: body_too_large. This is most often triggered by very large base64-encoded
image inputs; resize the image or split the conversation.
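Since base64 inflates data by about a third, a client can estimate the encoded size before sending. The Python sketch below assumes that "32 MB" means 32 MiB (32 * 1024 * 1024 bytes), which the text does not specify; base64_len and fits are hypothetical helpers, and other_bytes stands for the rest of the JSON body.

```python
MAX_BODY_BYTES = 32 * 1024 * 1024  # assumption: the default limit is 32 MiB


def base64_len(raw_bytes):
    """Length of padded base64 output for raw_bytes input bytes."""
    # Every 3 input bytes become 4 output characters, rounded up.
    return 4 * ((raw_bytes + 2) // 3)


def fits(raw_image_bytes, other_bytes=0, limit=MAX_BODY_BYTES):
    """True if the encoded image plus the rest of the body stays within limit."""
    return base64_len(raw_image_bytes) + other_bytes <= limit
```

Under this assumption an image of about 24 MiB already fills the whole body once encoded, so the practical ceiling for raw image data is well below the nominal limit.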