LLM-extracted knowledge has a credibility problem: “the model said so” doesn’t survive an audit, a compliance review, or a skeptical executive. Seyn’s answer is structural. Every piece of knowledge carries a complete, database-enforced chain of custody back to the raw records it was derived from. Not as a logging afterthought, but as the core schema design. Similarity is a guess. Provenance is a receipt.

The four hops

| Hop | What it tells you |
| --- | --- |
| Rule → Inference log | The exact LLM call: model, versioned prompt, token counts, timestamp. Not “an LLM”; that call. |
| Inference log → Events | The precise set of structured events the model saw as input. |
| Events → Raw records | The original payloads: the actual Teams/Slack message, email, or document, verbatim. |
| Raw record → Connector | Which system it came from and when it was ingested. |

Provenance follows the source. The chain above is the extraction path; a fact extracted from message content cites the exact message and text span it came from, an interviewed claim cites the transcript turn, and a human edit cites the editor and the instruction they gave. Different sources, same standard: no claim without a citation.

So for any rule you can answer, with receipts:
  • “Why does Seyn think we need CEO approval at $30M?” Here are the 23 deals it observed.
  • “Show me every message the model looked at when forming this claim.” Here they are, verbatim.
  • “Which version of which prompt produced this?” This one, on this date, with these token counts.
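To make the four hops concrete, here is a minimal sketch of how such a chain could be modelled and walked. Every type, field name, and the `traceRule` helper are illustrative assumptions rather than Seyn's actual schema, and the sketch assumes each event derives from exactly one raw record.

```typescript
// Hypothetical shapes for the four hops. Field names are illustrative
// assumptions, not Seyn's actual schema.
interface Connector { id: string; system: string }
interface RawRecord { id: string; connectorId: string; ingestedAt: string; payload: string }
interface SourceEvent { id: string; rawRecordId: string; summary: string }
interface InferenceLog {
  id: string;
  model: string;
  promptVersion: string;
  inputTokens: number;
  outputTokens: number;
  calledAt: string;
  inputEventIds: string[];
}
interface Rule { id: string; description: string; inferenceLogId: string }

interface ChainStore {
  inferenceLogs: Map<string, InferenceLog>;
  events: Map<string, SourceEvent>;
  rawRecords: Map<string, RawRecord>;
  connectors: Map<string, Connector>;
}

// Look up a row by ID. A missing row means the chain is broken, so fail loudly.
function mustGet<T>(table: Map<string, T>, id: string, kind: string): T {
  const row = table.get(id);
  if (row === undefined) throw new Error(`broken chain: missing ${kind} ${id}`);
  return row;
}

// Walk the four hops from a rule back to the connectors its evidence came from.
// The returned arrays are index-aligned: events[i] came from rawRecords[i],
// which was ingested through connectors[i].
function traceRule(rule: Rule, store: ChainStore) {
  const inferenceLog = mustGet(store.inferenceLogs, rule.inferenceLogId, "inference log");
  const events = inferenceLog.inputEventIds.map((id) => mustGet(store.events, id, "event"));
  const rawRecords = events.map((e) => mustGet(store.rawRecords, e.rawRecordId, "raw record"));
  const connectors = rawRecords.map((r) => mustGet(store.connectors, r.connectorId, "connector"));
  return { rule, inferenceLog, events, rawRecords, connectors };
}
```

Each hop is a lookup by ID instead of a text search, which is the property that lets a question like "which messages did the model see?" be answered exactly.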

Enforced, not promised

Three design decisions make the chain trustworthy:
  1. First-class links. Every connection in the chain is a queryable, constrainable relation, not a free-text reference.
  2. Restricted deletion. You cannot delete a raw record that an event depends on, or an event that an inference depends on. Deleting evidence out from under a claim is a database error, not a policy.
  3. One inference choke point. Every LLM call in the platform passes through a single central function that records input event IDs as part of the call. A model call that bypasses provenance is not possible by construction. See Observability.
This is deliberately inconvenient. Restricted deletion means cleanup requires walking the chain in dependency order, a price we pay so that the chain can never silently break.
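The second and third decisions can be illustrated with a small in-memory stand-in for the database. This is a sketch under stated assumptions, not Seyn's implementation: the `ProvenanceStore` class and all of its method names are hypothetical. In a SQL database, the deletion behaviour shown here is typically what foreign keys declared with `ON DELETE RESTRICT` provide; whether Seyn uses exactly that mechanism is not stated in this page.

```typescript
// Hypothetical in-memory stand-in for the database constraints described above.
// Class and method names are illustrative assumptions, not Seyn's API.
interface EventRow { id: string; rawRecordId: string }
interface InferenceRow { id: string; inputEventIds: string[] }

class ProvenanceStore {
  private rawRecordIds = new Set<string>();
  private events: EventRow[] = [];
  private inferences: InferenceRow[] = [];

  addRawRecord(id: string): void {
    this.rawRecordIds.add(id);
  }

  addEvent(id: string, rawRecordId: string): void {
    if (!this.rawRecordIds.has(rawRecordId)) {
      throw new Error(`event ${id} references unknown raw record ${rawRecordId}`);
    }
    this.events.push({ id, rawRecordId });
  }

  // Choke point: in this sketch, model calls are meant to go only through
  // infer(), which refuses to run without valid input event IDs and records
  // them once the call returns.
  infer(id: string, inputEventIds: string[], callModel: (eventIds: string[]) => string): string {
    for (const eventId of inputEventIds) {
      if (!this.events.some((e) => e.id === eventId)) {
        throw new Error(`inference ${id} references unknown event ${eventId}`);
      }
    }
    const output = callModel(inputEventIds);
    this.inferences.push({ id, inputEventIds: [...inputEventIds] });
    return output;
  }

  deleteInference(id: string): void {
    this.inferences = this.inferences.filter((inf) => inf.id !== id);
  }

  // Restricted deletion: refuse while any inference still cites the event.
  deleteEvent(id: string): void {
    const dependent = this.inferences.find((inf) => inf.inputEventIds.includes(id));
    if (dependent) throw new Error(`event ${id} is evidence for inference ${dependent.id}`);
    this.events = this.events.filter((e) => e.id !== id);
  }

  // Restricted deletion: refuse while any event still derives from the record.
  deleteRawRecord(id: string): void {
    const dependent = this.events.find((e) => e.rawRecordId === id);
    if (dependent) throw new Error(`raw record ${id} backs event ${dependent.id}`);
    this.rawRecordIds.delete(id);
  }

  hasRawRecord(id: string): boolean {
    return this.rawRecordIds.has(id);
  }
}
```

With this store, cleanup only succeeds in dependency order: `deleteInference`, then `deleteEvent`, then `deleteRawRecord`. Calling them in any other order throws, which mirrors the trade-off described above.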

Where you see it

| Surface | What you get |
| --- | --- |
| Dashboard | Rule detail pages drill down through evidence → inference log → events. |
| API / SDK | client.rules.provenance(ruleId) returns the entire chain in one call. |
| MCP | The get_rule_provenance tool gives agents the same drill-down, so your AI can show receipts too. |
| Chat | Citations on every answer resolve to rules, and from rules to the full chain. |

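// Assumes `client` is an initialized SDK client and `rule` is a rule you have already fetched.
const trail = await client.rules.provenance(rule.id);

console.log(trail.rule.description);
// inferenceLog is optional in this response, so fall back to a label instead of printing "undefined".
console.log(`Produced by ${trail.inferenceLog?.model ?? "no inference log"}`);
console.log(`from ${trail.sourceEvents.length} events`);
console.log(`across ${trail.rawRecords.length} raw records`);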
Provenance responses are intentionally large; a well-evidenced rule may carry dozens of events and records. If you only need to gauge evidence strength, count the arrays instead of rendering them.
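One way to count instead of render is a small summarizing helper. This is a sketch, not part of the SDK: `summarizeEvidence`, `ProvenanceTrail`, and `EvidenceSummary` are hypothetical names, and the interface declares only the response fields used in the example above (`inferenceLog`, `sourceEvents`, `rawRecords`).

```typescript
// Minimal structural type covering only the fields this helper reads.
// It is an assumption based on the fields used in the example above,
// not the SDK's published response type.
interface ProvenanceTrail {
  inferenceLog?: { model: string };
  sourceEvents: unknown[];
  rawRecords: unknown[];
}

interface EvidenceSummary {
  model: string | null;
  eventCount: number;
  rawRecordCount: number;
}

// Reduce a full provenance response to a few numbers that are cheap to
// log, compare, or show in a list view.
function summarizeEvidence(trail: ProvenanceTrail): EvidenceSummary {
  return {
    model: trail.inferenceLog?.model ?? null,
    eventCount: trail.sourceEvents.length,
    rawRecordCount: trail.rawRecords.length,
  };
}
```

Passing the result of `client.rules.provenance(rule.id)` to this helper would give you the counts without walking or rendering the payloads themselves.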

Common mistakes

| Symptom | Cause | Fix |
| --- | --- | --- |
| Provenance looks “thin” (1–2 events) for a high-confidence rule | Confidence reflects model certainty, not evidence volume | Weigh confidence, frequency, and evidence count together |
| Expecting provenance for pattern metrics | Pattern metrics are SQL aggregates over events, not LLM outputs; they have no inference log by design | Query the events endpoints directly for the underlying data |
| Trying to delete ingested data that has dependents | Restricted deletion protects the chain | Deletion walks the chain top-down; source removals are soft-deletes |
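The first fix in the table, weighing confidence, frequency, and evidence count together, could look something like the sketch below. Everything here is an assumption made for illustration: the `RuleSignals` field names, the 0 to 1 confidence scale, the reading of frequency as "how often the pattern was observed", and the default thresholds. Tune or replace them to match your own data.

```typescript
// Hypothetical signal bundle. Field names and scales are assumptions:
// confidence is taken to be in the range 0..1, frequency is how often the
// pattern was observed, evidenceCount is the number of source events.
interface RuleSignals {
  confidence: number;
  frequency: number;
  evidenceCount: number;
}

type Triage = "well-supported" | "needs-review" | "weak";

// Combine the three signals instead of trusting confidence alone.
// The default thresholds are arbitrary placeholders.
function triageRule(
  signals: RuleSignals,
  minConfidence = 0.8,
  minEvidence = 5,
  minFrequency = 3,
): Triage {
  const confident = signals.confidence >= minConfidence;
  const evidenced =
    signals.evidenceCount >= minEvidence && signals.frequency >= minFrequency;
  if (confident && evidenced) return "well-supported";
  if (confident || evidenced) return "needs-review";
  return "weak";
}
```

Under these placeholder thresholds, a rule with high confidence but only one or two events lands in "needs-review", which is the "thin provenance" case from the first row of the table.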

Observability

The inference log and tracing that feed the chain.

Knowledge

Rules, libraries, and the review workflow that sits on top.