Authentication
All authenticated endpoints accept a single Bearer token, configured server-side
as RUST_API_BEARER in .env. The token is opaque (by convention, 32 random bytes
encoded as hex) and is compared in constant time to defeat timing attacks.
Authorization: Bearer <token>
Send it on every request to:
- /v1/chat/completions
- /v1/models
- /v1/info
- /v1/embeddings
- /v1/images*
- /v1/audio/transcriptions, /v1/audio/translations
- /v1/tools/execute
- /v1/memory*, /v1/system-prompts*
- /v1/tenant/config* (Cut 2.13; needs the tenant_config:read|write scope)
- /c1/chat
- /c1/conversations*
- /a1/agents*, /a1/rag/indices*
These endpoints are public (no Bearer required):
- /
- /healthz
- /readyz
- /openapi, /openapi.json, /redoc
- /docs/*
- /errors
- /playground
Errors
| Code | Status | When |
|---|---|---|
| missing_authorization | 401 | No Authorization header |
| invalid_authorization | 401 | Bearer doesn't match server config |
| rate_limit_exceeded | 429 | Too many requests for this token (see below) |
# Without auth — gets 401:
curl -i https://dgx.spass.fun/v1/info
# With auth — 200 OK:
curl -s -H "Authorization: Bearer $BEARER" https://dgx.spass.fun/v1/info \
| jq '.models[0].alias'
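The check behind the two 401 codes can be sketched in a few lines of Python. This is an illustration only, not the server's code (the server is not written in Python). The function name `check_bearer` is hypothetical, and treating a non-Bearer scheme as invalid_authorization is an assumption. `hmac.compare_digest` is the standard-library constant-time comparison.

```python
import hmac


def check_bearer(authorization, expected_token):
    """Return None when the header is accepted, else (status, code).

    Hypothetical helper: `authorization` is the raw Authorization header
    value, or None when the client sent no such header.
    """
    if authorization is None:
        return (401, "missing_authorization")
    scheme, _, presented = authorization.partition(" ")
    if scheme != "Bearer":
        # Assumption: any other scheme is reported as an invalid token.
        return (401, "invalid_authorization")
    # compare_digest's running time does not depend on where the first
    # mismatching byte sits, which is what defeats timing attacks.
    if not hmac.compare_digest(presented.encode(), expected_token.encode()):
        return (401, "invalid_authorization")
    return None
```

A plain `==` comparison would return as soon as the first byte differs, which is the timing signal the constant-time comparison removes.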
Rate limits
Per-token rate limits are enforced via tower_governor:
- Default: 1 request/second, burst of 30.
- Bucket key: SHA-256 hash of the bearer token — not the IP. Distinct tokens get distinct buckets even from the same IP; the same token shares one bucket across all IPs.
- Tunable server-side: RATE_LIMIT_PER_SECOND and RATE_LIMIT_BURST env vars.
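To make the bucket semantics concrete, here is a minimal sketch of per-token limiting with the defaults listed above. It uses a plain token bucket, which is a swapped-in technique: tower_governor builds on the governor crate, which implements GCRA, so exact behaviour at the boundaries may differ. The class and parameter names are hypothetical.

```python
import hashlib
import time


class TokenBucketLimiter:
    """Illustrative per-bearer limiter; not the server's implementation."""

    def __init__(self, per_second=1.0, burst=30, clock=time.monotonic):
        self.per_second = per_second  # refill rate, tokens per second
        self.burst = burst            # bucket capacity
        self.clock = clock            # injectable so tests can control time
        self.buckets = {}             # bucket key -> (tokens_left, last_seen)

    @staticmethod
    def bucket_key(bearer):
        # The key is derived from the token alone; the client IP plays no part.
        return hashlib.sha256(bearer.encode()).hexdigest()

    def allow(self, bearer):
        key = self.bucket_key(bearer)
        now = self.clock()
        tokens, last = self.buckets.get(key, (float(self.burst), now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(float(self.burst), tokens + (now - last) * self.per_second)
        if tokens < 1.0:
            self.buckets[key] = (tokens, now)
            return False  # the server would answer 429 here
        self.buckets[key] = (tokens - 1.0, now)
        return True
```

With the defaults, a client can send 30 requests back to back and is then held to roughly one request per second, while a second token keeps its own full bucket.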
When you exceed the limit you get HTTP 429 with the standard error envelope:
{
  "error": {
    "type": "rate_limit_exceeded",
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded"
  }
}
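On the client side, one way to react to this envelope is to detect it and retry with exponential backoff. The sketch below is an assumption about reasonable client behaviour, not something the API prescribes. `send` stands for any hypothetical callable that performs the request and returns `(status, body_text)`, and the delays are arbitrary.

```python
import json
import time


def is_rate_limited(status, body_text):
    """True when the response is the 429 rate-limit envelope shown above."""
    if status != 429:
        return False
    try:
        return json.loads(body_text)["error"]["code"] == "rate_limit_exceeded"
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not shaped like the error envelope.
        return False


def send_with_backoff(send, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until the reply is not rate limited or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if not is_rate_limited(status, body):
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1 s, 2 s, 4 s, ...
    return status, body
```

Because the bucket is keyed by token rather than IP, every client sharing the bearer competes for the same budget, so backing off helps the other clients as well.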
End-User Identity (Header-only, ADR 0016 / Cut 2.23c)
SPASS-User-Id is the sole authority for end-user identity across all
user-scoped endpoints (/c1/*, /v1/chat/completions, /v1/memory/*,
/v1/system-prompts/*, /a1/agents/*/sessions/*).
Mandatory — endpoints reject requests without it with HTTP 400
missing_field, param: headers.spass-user-id. Exception: tokens with
role tenant_admin may omit the header for read-only endpoints to get a
tenant-wide view (no user-id filter).
Body and query user_id are forbidden — sending either returns HTTP
400 invalid_field with param: body.user_id or query.user_id.
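The rules above can be summarised as a small decision function. This is a sketch under assumptions: the function name, the argument shapes, and the order in which the checks run are all hypothetical. Only the status codes, error codes, and param values come from the text.

```python
def check_user_identity(headers, body=None, query=None, role=None, read_only=False):
    """Return None when the request is accepted, else (status, code, param)."""
    body = body or {}
    query = query or {}
    # A user_id in the body or query string is rejected outright.
    if "user_id" in body:
        return (400, "invalid_field", "body.user_id")
    if "user_id" in query:
        return (400, "invalid_field", "query.user_id")
    # HTTP header names are case-insensitive, so compare in lower case.
    names = {name.lower() for name in headers}
    if "spass-user-id" not in names:
        # Exception: tenant admins may omit the header on read-only endpoints.
        if role == "tenant_admin" and read_only:
            return None
        return (400, "missing_field", "headers.spass-user-id")
    return None
```

For example, `check_user_identity({}, role="tenant_admin", read_only=True)` is accepted, while the same call with `read_only=False` yields the missing_field error.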
POST /c1/chat
Authorization: Bearer <token>
SPASS-User-Id: alice
Content-Type: application/json

{
  "model": "llama-4-scout",
  "conversation_id": "<id-owned-by-alice>",
  "message": "...",
  "stream": false
}
Requesting a conversation_id that belongs to a different end-user
returns HTTP 404 conversation_not_found, the same response as for an ID
that does not exist, so callers cannot probe which IDs are in use.
Agent sessions are user-scoped — /a1/agents/{name}/sessions/* filter
strictly on SPASS-User-Id. A user only sees their own sessions, even
when the bearer is shared across multiple end-users.
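The scoping described above amounts to treating "not yours" exactly like "not there". The sketch below illustrates that with an in-memory dictionary. The storage layout, the `owner` field, and both function names are hypothetical, chosen only to show the pattern.

```python
def find_conversation(store, conversation_id, user_id):
    """Return the conversation, or (404, "conversation_not_found")."""
    conversation = store.get(conversation_id)
    # A missing record and a record owned by another user yield the same
    # answer, so the response does not reveal whether the ID exists.
    if conversation is None or conversation["owner"] != user_id:
        return (404, "conversation_not_found")
    return conversation


def list_sessions(sessions, user_id):
    """Keep only the sessions that belong to the given SPASS-User-Id."""
    return [session for session in sessions if session["owner"] == user_id]
```

Returning 403 for a foreign conversation would confirm that the ID is valid, which is the leak the 404 avoids.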
For the migration history (Cut 2.19 → ADR 0014 → ADR 0016 strict-mode
sweep), see docs/adr/0016-user-identity-authority-header-only.md.
Token rotation
Rotating the bearer is server-side only:
NEW=$(openssl rand -hex 32)
sed -i "s/^RUST_API_BEARER=.*/RUST_API_BEARER=$NEW/" /home/dietmar/dgx-llm/.env
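# If Compose passes .env into the container environment, "restart" keeps the
# old value; use "docker compose up -d --force-recreate rust-api" instead.
docker compose restart rust-api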
There's no on-the-fly token refresh — clients must be updated externally with the new value. For production deployments where multiple clients share a key, consider scripting this into your secrets pipeline.
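For a secrets pipeline that is not shell-based, the first two rotation steps could look like the following Python sketch. The function names are hypothetical. `secrets.token_hex(32)` produces 32 random bytes as 64 hex characters, the same format as `openssl rand -hex 32`. Unlike the sed one-liner, this version raises an error when the RUST_API_BEARER line is missing instead of silently changing nothing.

```python
import re
import secrets


def new_bearer():
    """Generate a token in the conventional format: 32 random bytes as hex."""
    return secrets.token_hex(32)


def replace_bearer(env_text, token):
    """Return the text of a .env file with the RUST_API_BEARER line rewritten."""
    updated, count = re.subn(
        r"(?m)^RUST_API_BEARER=.*$",
        lambda _match: "RUST_API_BEARER=" + token,
        env_text,
    )
    if count != 1:
        raise ValueError(f"expected exactly one RUST_API_BEARER line, found {count}")
    return updated
```

Writing the result back to disk, restarting the service, and distributing the new value to clients remain separate steps.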
Body size limit
Bodies larger than MAX_BODY_BYTES (default 32 MB) get HTTP 413 with
code: body_too_large. This is most often triggered by very large
base64-encoded image inputs; resize the image or split the conversation.
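A client can estimate the request size before sending and so avoid the 413 round trip. The sketch below assumes that "32 MB" means 32 MiB and that the limit applies to the serialized JSON body. Both are assumptions, and the `image` field name in the usage note is purely illustrative.

```python
import base64
import json

# Assumption: the default of 32 MB is interpreted as 32 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 32 * 1024 * 1024


def encoded_body_size(payload):
    """Size in bytes of the payload once serialized as UTF-8 JSON."""
    return len(json.dumps(payload).encode("utf-8"))


def fits_body_limit(payload, limit=MAX_BODY_BYTES):
    """True when the serialized payload is within the body size limit."""
    return encoded_body_size(payload) <= limit
```

Base64 turns every 3 input bytes into 4 output characters, so `base64.b64encode(raw)` for a 24 MiB image already produces 32 MiB of text before any JSON overhead is added.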