Response-Header Reference
Every /v1/chat/completions response carries a set of SPASS-* headers that
expose stack-internal state in machine-readable form. They follow the
convention from ADR 0006: proper-cased on the wire (Spass-Augment-Applied),
case-insensitive per RFC 9110, no X- prefix per RFC 6648, and structured
field values per RFC 8941.
The headers fall into two categories: request-side (what the caller can
send) and response-side (what the server emits).
Request-side headers
| Header | Type | Purpose |
| --- | --- | --- |
| SPASS-Augment | RFC 8941 dictionary | Caller-controlled server-side augmentation. Format: system-prompt=off, server-tools=off, memory=off. Each key is either default (injection active) or off. Without the header, all three are active. |
| SPASS-Tenant-Id | string | Multi-tenant override (only valid for unbound tokens). |
| SPASS-Scope-Id | string | Memory and system-prompt scope binding. |
| SPASS-User-Id | string | Memory and system-prompt user binding. |
| SPASS-Stt-Mode | enum: chunked \| continuous | STT file-upload mode (ADR 0005 v2). Only on /v1/audio/transcriptions. |
| SPASS-Request-Id | UUID v4 | Caller-supplied correlation ID. The server echoes it back in the response. Dual-accept with X-Request-Id. |
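A minimal sketch of how a caller might build the SPASS-Augment value from the table above. The helper `buildAugmentHeader` and its option names are hypothetical; only the three dictionary keys and the `off` token come from this reference. The sketch emits only the keys that are switched off, which assumes the server treats an absent key like an absent header (default, injection active).

```typescript
// Hypothetical option names; only the dictionary keys and the `off` token
// are taken from the header reference.
type AugmentOptions = {
  systemPrompt?: boolean;
  serverTools?: boolean;
  memory?: boolean;
};

// Maps each option to its dictionary key, in the order used by the reference.
const AUGMENT_KEYS: Array<[keyof AugmentOptions, string]> = [
  ["systemPrompt", "system-prompt"],
  ["serverTools", "server-tools"],
  ["memory", "memory"],
];

// Returns the header value, or null when nothing is switched off and the
// header can be omitted (assumption: an absent key means "default").
function buildAugmentHeader(options: AugmentOptions): string | null {
  const members = AUGMENT_KEYS
    .filter(([option]) => options[option] === false)
    .map(([, key]) => `${key}=off`);
  return members.length > 0 ? members.join(", ") : null;
}

// Prints "server-tools=off, memory=off"
console.log(buildAugmentHeader({ memory: false, serverTools: false }));
```

Returning null instead of an empty string lets the caller skip the header entirely when every augmentation stays at its default.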
Response-side headers
Request-correlation
| Header | Format | Notes |
| --- | --- | --- |
| SPASS-Request-Id | UUID v4 | Internal SPASS form. Always set. Cross-layer tracing: it is propagated as parent_request_id in the audit events of /a1 sub-calls and appears as a top-level field in every audit.jsonl record. Callers can echo the header (Bearer-trace correlation). |
| X-Request-Id | UUID v4 | De-facto-standard compatibility alias, same value as SPASS-Request-Id. |
Caller tip (Cut 2.23c+): If you send an X-Request-Id header to the server, the server adopts that value for audit.jsonl, so your own trace identifier can be found in our logs for post-mortem correlation. On 5xx errors, the request_id is included in the dgx_code field of the error response (see errors.md).
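The correlation flow can be sketched on the caller side as follows. The helper names `correlationHeaders` and `echoesRequestId` are hypothetical, and the sketch assumes the server returns the caller's value unchanged in SPASS-Request-Id or X-Request-Id, as the tables above suggest.

```typescript
// Minimal view of a header container; the Fetch API's Headers satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// Matches the textual form of a version-4 UUID.
const UUID_V4 =
  /^[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

// Builds the request header for a caller-chosen trace identifier.
function correlationHeaders(requestId: string): Record<string, string> {
  if (!UUID_V4.test(requestId)) {
    throw new Error(`not a UUID v4: ${requestId}`);
  }
  return { "X-Request-Id": requestId };
}

// True when the response carries the same id in either header form.
function echoesRequestId(sent: string, response: HeaderReader): boolean {
  const echoed =
    response.get("spass-request-id") ?? response.get("x-request-id");
  return echoed !== null && echoed.toLowerCase() === sent.toLowerCase();
}
```

The identifier itself can come from any UUID v4 generator, for example `crypto.randomUUID()` in runtimes that provide it.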
Augmentation visibility (ADR 0006)
| Header | Format | Notes |
| --- | --- | --- |
| SPASS-Augment-Applied | RFC 8941 dictionary | system-prompt=Nitems, server-tools=<comma-list-or-off>, memory=Nitems, stream-usage=injected (when F.1 default-injected include_usage). Lets the caller audit what the server actually did. |
| SPASS-Applied | comma-list | Silent adjustments applied to the request (e.g. max_tokens_floored=200, response_format_stripped=empty). |
| SPASS-Tools-Executed | comma-list name:iter,… | Server-side tool-loop trace: which tools fired in which iteration. |
| SPASS-Stt-Model | string | Cut 2.50: the STT model effectively used on /v1/audio/transcriptions (resolved slug, never auto). dgx-internal/debug only, because the downstream caller proxy discards upstream response headers. For convert=1, the JSON body field model is authoritative. |
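To make the name:iter format of SPASS-Tools-Executed concrete, here is a small parsing sketch. The function `parseToolsExecuted` is hypothetical, and skipping malformed entries is an assumption of this sketch, since the reference does not say how a caller should treat them.

```typescript
// One entry of the tool-loop trace: which tool fired in which iteration.
type ToolExecution = { name: string; iteration: number };

// Parses a "name:iter,name:iter" list. Entries without a name or without a
// numeric iteration are skipped (assumption, not specified by the reference).
function parseToolsExecuted(value: string | null): ToolExecution[] {
  if (value === null) return [];
  const result: ToolExecution[] = [];
  for (const entry of value.split(",")) {
    const trimmed = entry.trim();
    // Split on the last colon so the name part is kept whole.
    const sep = trimmed.lastIndexOf(":");
    if (sep <= 0) continue;
    const iterText = trimmed.slice(sep + 1);
    if (!/^\d+$/.test(iterText)) continue;
    result.push({ name: trimmed.slice(0, sep), iteration: Number(iterText) });
  }
  return result;
}
```

A caller could group the result by iteration to see which tools ran together in one pass of the tool loop.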
Resolution + routing (ADR 0007 + 0008)
| Header | Values | Notes |
| --- | --- | --- |
| SPASS-Resolved-Model | canonical alias | Non-streaming: reliable. Streaming: best-effort. |
| SPASS-Resolved-Backend | local-fp4 \| local-fp8 \| local-bf16 \| cloud-1 \| cloud-2 \| cloud-3 | Generic backend slot, ADR 0006 v2 compliant: no implementation names. |
| SPASS-Resolved-Reason | primary-up \| primary-down-fallback \| quant-pin-explicit \| cloud-pin-explicit \| hardware-pin-explicit \| legacy-alias \| unknown | Explains why the resolved backend is what it is. Cockpit-followup-7 wishlist. |
| SPASS-Fallback-Used | true \| false \| unknown | Derived from actual != primary. unknown for streaming or for an unknown model. |
| SPASS-Cache-Hit | true \| false | Stack-cache (Redis) hit. |
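Because SPASS-Fallback-Used has three states, a caller may want to keep unknown distinct from false. The sketch below does that; the `Routing` type and `readRouting` function are hypothetical, and treating a missing header like unknown is an assumption.

```typescript
// Minimal view of a header container; the Fetch API's Headers satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

type Routing = {
  model: string | null;
  backend: string | null;
  reason: string;
  fallbackUsed: boolean | null; // null stands for "unknown"
  cacheHit: boolean;
};

// Collects the resolution and routing headers into one typed value.
function readRouting(headers: HeaderReader): Routing {
  const fallback = headers.get("spass-fallback-used");
  return {
    model: headers.get("spass-resolved-model"),
    backend: headers.get("spass-resolved-backend"),
    // Assumption: a missing reason header is reported as "unknown".
    reason: headers.get("spass-resolved-reason") ?? "unknown",
    fallbackUsed:
      fallback === "true" ? true : fallback === "false" ? false : null,
    cacheHit: headers.get("spass-cache-hit") === "true",
  };
}
```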
Cost-Pipeline V2 (ADR 0010)
| Header | Format | Notes |
| --- | --- | --- |
| SPASS-Cost-Eur | 0.01 (2 decimals, ceil_to_cent) | Final EUR amount including all markups. Authoritative for tenant billing. |
| SPASS-Cost-Usd | 0.01 (2 decimals, round_to_cent) | USD = EUR ÷ ECB rate (without markups). Display-only. |
| SPASS-Cost-Available | true \| false | Whether a cost value could be derived. false only for an unknown source. |
| SPASS-Cost-Source | zero \| free \| upstream \| unknown | Where the value comes from. |
| SPASS-Cost-Exchange-Rate | 0.9300 (4 decimals) | ECB rate (without markup) for caller-side verification. |
| SPASS-Cost-Exchange-Rate-Source | ecb-YYYY-MM-DD \| fallback-30d-max-... \| fallback-hardcoded | Indicates which level of the lookup hierarchy supplied the rate. |
| SPASS-Cost-Sub-Calls | integer (≥0) | /a1 only. Number of internal /v1 sub-calls that contributed to the aggregated cost. ≥1 for normal completions; can be larger for tool loops. |
On /a1/agents/<name>/chat and /a1/agents/<name>/sessions/<sid>/messages, all of the above are summed across the rig agent's internal /v1 sub-calls (Cut 2.7). The dominant Source wins (Upstream > Free > Zero > Unknown); Available is false if ANY sub-call was Unknown.
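The aggregation rule above can be sketched as follows. This is an illustration only, not the server's implementation: the types and the `aggregateCosts` function are hypothetical, summation is applied to the EUR and USD amounts only (how the exchange-rate headers are aggregated is not stated here), and reporting Available as false for zero sub-calls is an assumption.

```typescript
type CostSource = "upstream" | "free" | "zero" | "unknown";

type SubCallCost = { eur: number; usd: number; source: CostSource };

type AggregatedCost = {
  eur: number;
  usd: number;
  source: CostSource;
  available: boolean;
  subCalls: number;
};

// Precedence from the reference: Upstream > Free > Zero > Unknown.
const SOURCE_RANK: Record<CostSource, number> = {
  unknown: 0,
  zero: 1,
  free: 2,
  upstream: 3,
};

function aggregateCosts(calls: SubCallCost[]): AggregatedCost {
  // Sum in integer cents to avoid floating-point drift on 2-decimal amounts.
  let eurCents = 0;
  let usdCents = 0;
  let source: CostSource = "unknown";
  for (const call of calls) {
    eurCents += Math.round(call.eur * 100);
    usdCents += Math.round(call.usd * 100);
    if (SOURCE_RANK[call.source] > SOURCE_RANK[source]) source = call.source;
  }
  return {
    eur: eurCents / 100,
    usd: usdCents / 100,
    source,
    // A single unknown sub-call makes the aggregate unavailable.
    available: calls.length > 0 && calls.every((c) => c.source !== "unknown"),
    subCalls: calls.length,
  };
}
```

Note that the dominant source and the availability flag are independent: a mix of upstream and unknown sub-calls reports source upstream but available false.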
Recursion + audit-correlation (Cut 2.3)
| Header | Format | Notes |
| --- | --- | --- |
| SPASS-Caller-Depth | integer (in-only) | Optional incoming header. The /a1 handler refuses requests at ≥ 3 (recursion_depth_exceeded). The handler propagates incoming + 1 on its rig sub-call. |
| SPASS-Parent-Request-Id | UUID v4 (in-only) | Set by /a1 on the rig sub-call so /v1's audit-event carries parent_request_id = outer-rid. Lets jq reconstruct call-trees from audit.jsonl. |
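A minimal sketch of the depth guard and header propagation described in the table. The function names are hypothetical, and two details are assumptions of this sketch: an absent SPASS-Caller-Depth counts as depth 0, and a malformed value is rejected.

```typescript
// Requests arriving at this depth or deeper are refused.
const MAX_CALLER_DEPTH = 3;

// Returns the depth to send on the sub-call, or throws if the limit is hit.
function nextCallerDepth(incoming: string | null): number {
  // Assumption: a request without the header starts at depth 0.
  const text = incoming === null ? "0" : incoming.trim();
  if (!/^\d+$/.test(text)) {
    throw new Error("invalid SPASS-Caller-Depth");
  }
  const depth = Number(text);
  if (depth >= MAX_CALLER_DEPTH) {
    throw new Error("recursion_depth_exceeded");
  }
  return depth + 1;
}

// Headers the outer handler would attach to its internal sub-call.
function subCallHeaders(
  incomingDepth: string | null,
  outerRequestId: string,
): Record<string, string> {
  return {
    "SPASS-Caller-Depth": String(nextCallerDepth(incomingDepth)),
    "SPASS-Parent-Request-Id": outerRequestId,
  };
}
```

With these rules, a chain of nested calls carries depths 1, 2 and 3 on its sub-calls, and the request that arrives with depth 3 is the one that gets refused.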
/c1/chat only
| Header | Format | Notes |
| --- | --- | --- |
| SPASS-Conversation-Id | UUID v4 | Persistent conversation ID. The caller submits it on follow-up turns to load history. |
| spass-system-prompt-ignored | already_present (Cut 2.33, CR-0003) | Set when the request carries a system-prompt field (system_prompt / system_prompt_ref / additional_system_prompt) but the conversation already has a role=system row from an earlier turn. The server ignores the new value; this header is the explicit signal that lets the caller debug why a late system-prompt change did not take effect. Use POST /c1/conversations/{id}/system-prompt to append a new system row deliberately. |
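On the caller side, the two /c1/chat headers can be read together after each turn. The sketch below is illustrative; `ChatTurnInfo` and `readChatTurn` are hypothetical names.

```typescript
// Minimal view of a header container; the Fetch API's Headers satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

type ChatTurnInfo = {
  conversationId: string | null;
  systemPromptIgnored: boolean;
};

// Extracts the conversation id to reuse on the next turn and flags the case
// where a system-prompt field in the request was ignored by the server.
function readChatTurn(headers: HeaderReader): ChatTurnInfo {
  return {
    conversationId: headers.get("spass-conversation-id"),
    systemPromptIgnored:
      headers.get("spass-system-prompt-ignored") === "already_present",
  };
}
```

When `systemPromptIgnored` is true, the caller can decide whether to append a system row through the dedicated endpoint named in the table instead of resending the field.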
Quota signaling (Cut 2.32, CR-0005)
| Header | Format | Notes |
| --- | --- | --- |
| Retry-After | integer (seconds) | Set on HTTP 429 responses with dgx_code: "openrouter_daily_quota_exhausted". Value = seconds until next 00:00 UTC. Standard RFC 9110 retry hint. |
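The Retry-After value can be illustrated from both sides. The first function sketches how "seconds until next 00:00 UTC" could be computed (rounding up to whole seconds is an assumption); the second shows how a caller might turn the integer form into a delay. Both function names are hypothetical.

```typescript
// Seconds from `now` until the next 00:00 UTC, rounded up to whole seconds.
function secondsUntilNextUtcMidnight(now: Date): number {
  // Date.UTC normalizes day overflow, so "day + 1" also works at month end.
  const nextMidnight = Date.UTC(
    now.getUTCFullYear(),
    now.getUTCMonth(),
    now.getUTCDate() + 1,
  );
  return Math.ceil((nextMidnight - now.getTime()) / 1000);
}

// Caller side: converts the integer-seconds form of Retry-After into
// milliseconds. Returns null for a missing or non-integer value.
function retryDelayMs(retryAfter: string | null): number | null {
  if (retryAfter === null) return null;
  const text = retryAfter.trim();
  if (!/^\d+$/.test(text)) return null;
  return Number(text) * 1000;
}
```

RFC 9110 also allows an HTTP-date in Retry-After; this sketch deliberately handles only the integer form that the table documents.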
Caller-pattern: cost-tracking
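    function requestCostEur(headers: Headers): number | null {
      // No derivable cost: report null rather than a misleading 0.
      if (headers.get("spass-cost-available") !== "true") return null;
      const eur = Number.parseFloat(headers.get("spass-cost-eur") ?? "");
      // A missing or malformed value parses to NaN; map it to null as well.
      return Number.isNaN(eur) ? null : eur;
    }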
Local-zero responses explicitly set Spass-Cost-Eur: 0.00 and
Spass-Cost-Source: zero, so the caller does not need to special-case
missing headers.
Caller-pattern: routing-diagnose
    const backend = headers.get("spass-resolved-backend");
    const reason = headers.get("spass-resolved-reason");
    if (reason === "primary-down-fallback") {
      // T1 unprefixed alias fell to cloud: log it and maybe retry differently
      console.warn(`primary backend down, served by ${backend}`);
    }
Spass-Resolved-Reason removes the need to compare Spass-Resolved-Backend
to the catalog manually, because the server has already done that.
Cross-references
- ADR 0006: naming convention
- ADR 0007: cost + backend-resolution V1
- ADR 0008: slug tier-schema (T1/T2/T3/T4)
- ADR 0010: cost-pipeline V2 (EUR + tenant-markup)
- /v1/info: model catalog with primary backend + capabilities
- /api/version: current build SHA + timestamp