Every LLM call in the platform goes through a single central inference function. No stage, extractor, or chat handler calls a model directly. That one choke point is what makes everything below possible.
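To make the choke-point idea concrete, here is a minimal sketch of what a single entry point could look like. The names `InferenceGateway`, `InferenceRequest`, and `fake_provider` are hypothetical and chosen for illustration; the platform's actual function and its signature are not shown in this document.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical request type; the real platform's fields are an assumption.
@dataclass(frozen=True)
class InferenceRequest:
    stage: str
    prompt: str


# A provider is any callable that turns a prompt into a completion.
Provider = Callable[[str], str]


class InferenceGateway:
    """Single entry point: callers never hold a provider client themselves."""

    def __init__(self, provider: Provider) -> None:
        self._provider = provider
        # In-memory stand-in for the inference log.
        self.calls: List[Dict[str, str]] = []

    def infer(self, request: InferenceRequest) -> str:
        response = self._provider(request.prompt)
        # Every call passes through here, so logging lives in exactly one place.
        self.calls.append({"stage": request.stage, "prompt": request.prompt})
        return response


def fake_provider(prompt: str) -> str:
    # Placeholder for a real model call.
    return f"echo: {prompt}"


gateway = InferenceGateway(fake_provider)
result = gateway.infer(InferenceRequest(stage="extract", prompt="hello"))
```

Because stages only ever see `gateway.infer`, cross-cutting concerns such as logging, tracing, and budgets can be added in one place without touching any caller.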
The inference log
Every call through the central function is recorded:

| Field | Why it matters |
|---|---|
| Model | Which model produced this output: fast-tier or frontier. |
| Prompt version | Prompts are versioned; a logged call pins the exact template, not “whatever the prompt was at the time.” |
| Prompt hash | Detects drift even within a version. |
| Token counts | Per-call cost accounting, aggregable per stage, per run, per org. |
| Latency | Wall-clock per call. |
| Input event IDs | The provenance link: the precise set of events the model reasoned over. |
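The fields in the table can be sketched as a record type, together with a hash check that would catch a template edited without a version bump. This is an illustration under assumptions: the field names, the split into input and output token counts, and the use of SHA-256 are all hypothetical, since the document does not specify the log schema or the hash algorithm.

```python
import hashlib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class InferenceLogRecord:
    # Field names are assumptions mirroring the table's columns.
    model: str
    prompt_version: str
    prompt_hash: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    input_event_ids: Tuple[str, ...]


def hash_prompt(template_text: str) -> str:
    # SHA-256 is an assumed choice; any stable content hash would serve.
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()


def has_drifted(record: InferenceLogRecord, current_template_text: str) -> bool:
    # True when the template text no longer matches what was logged,
    # even if the version label is unchanged.
    return record.prompt_hash != hash_prompt(current_template_text)


template_v3 = "Extract the rules from the following events:"
record = InferenceLogRecord(
    model="fast-tier",
    prompt_version="v3",
    prompt_hash=hash_prompt(template_v3),
    input_tokens=120,
    output_tokens=40,
    latency_ms=850.0,
    input_event_ids=("evt-1", "evt-2"),
)
unchanged = has_drifted(record, template_v3)
edited = has_drifted(record, template_v3 + " Be concise.")
```

Here `unchanged` is `False` and `edited` is `True`: the version label alone would not have revealed the second case.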
Tracing
On top of the log, every call emits a trace to a self-hosted tracing instance (Langfuse). Observability data, including full prompts and responses, never leaves Seyn infrastructure. Each extraction run produces an end-to-end trace with one span per LLM call: organisation and run context, stage name, prompt version, model, full prompts and responses, token counts, and latency.

Tracing is best-effort by contract: if the tracing backend is unreachable, spans are dropped and the run completes normally. Observability must never become an availability dependency for the thing it observes.
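The best-effort contract can be sketched as a wrapper that swallows tracing failures instead of propagating them. This sketch deliberately avoids the Langfuse SDK; `emit_best_effort`, `unreachable_backend`, and `run_stage` are hypothetical names, and `send` stands in for whatever client call actually ships a span.

```python
import logging
from typing import Any, Callable, Dict

logger = logging.getLogger("tracing")

Span = Dict[str, Any]


def emit_best_effort(send: Callable[[Span], None], span: Span) -> bool:
    """Try to ship a span; on any failure, drop it and report False."""
    try:
        send(span)
        return True
    except Exception:
        # The span is lost, but the caller is never interrupted.
        logger.warning("tracing backend unavailable; span dropped")
        return False


def unreachable_backend(span: Span) -> None:
    # Simulates a tracing backend that cannot be reached.
    raise ConnectionError("tracing backend unreachable")


def run_stage() -> str:
    # The stage's result does not depend on whether the span was delivered.
    emit_best_effort(unreachable_backend, {"stage": "extract", "model": "fast-tier"})
    return "run completed"


outcome = run_stage()
```

A real sender would presumably also need a short timeout, so that a slow backend cannot stall the run any more than an unreachable one can.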
Versioned prompts
Prompt templates are versioned and reviewed like any other change. A rule extracted six months ago can still be traced to the exact prompt text that produced it. There is no “prompt was edited in a dashboard somewhere” failure mode.

Cost controls
- Model routing per stage. High-volume stages run on a fast, cheap model tier; only synthesis gets the frontier model.
- Prompt caching. Stable prompt prefixes are cached at the provider, cutting repeat-batch input cost to roughly a tenth.
- Per-stage token budgets. Each stage operates under an explicit budget; oversized corpora are chunked rather than blowing through it.
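Two of the controls above, per-stage routing and budget-driven chunking, can be sketched together. The stage names, tier labels, budget numbers, and the whitespace-based token estimate are all assumptions made for illustration; a real system would use the provider's tokenizer and its own configuration.

```python
from typing import Dict, List

# Hypothetical routing table and budgets, not the platform's real values.
MODEL_BY_STAGE: Dict[str, str] = {
    "extract": "fast-tier",
    "classify": "fast-tier",
    "synthesis": "frontier",
}
TOKEN_BUDGET_BY_STAGE: Dict[str, int] = {"extract": 8, "classify": 8, "synthesis": 16}


def route_model(stage: str) -> str:
    return MODEL_BY_STAGE[stage]


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: count whitespace-separated words.
    return len(text.split())


def chunk_to_budget(documents: List[str], budget: int) -> List[List[str]]:
    """Greedily pack documents into chunks that each fit within the budget."""
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for doc in documents:
        cost = estimate_tokens(doc)
        if cost > budget:
            raise ValueError("single document exceeds the stage budget")
        if current and used + cost > budget:
            # Adding this document would overshoot, so close the chunk first.
            chunks.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        chunks.append(current)
    return chunks


corpus = ["a b c", "d e f", "g h i j"]
chunks = chunk_to_budget(corpus, TOKEN_BUDGET_BY_STAGE["extract"])
```

With a budget of 8 estimated tokens, the first two documents share a chunk and the third starts a new one, so no single call exceeds the stage budget.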
Related
- Provenance: How the inference log anchors the audit chain.
- Knowledge: The extraction stages whose calls all flow through here.