Provenance - Seyn

LLM-extracted knowledge has a credibility problem: “the model said so” doesn’t survive an audit, a compliance review, or a skeptical executive. Seyn’s answer is structural. Every piece of knowledge carries a complete, database-enforced chain of custody back to the raw records it was derived from. Not as a logging afterthought, but as the core schema design. Similarity is a guess. Provenance is a receipt.

The four hops

Hop	What it tells you
Rule → Inference log	The exact LLM call: model, versioned prompt, token counts, timestamp. Not “an LLM”; that call.
Inference log → Events	The precise set of structured events the model saw as input.
Events → Raw records	The original payloads: the actual Teams/Slack message, email, or document, verbatim.
Raw record → Connector	Which system it came from and when it was ingested.

Provenance follows the source. The chain above is the extraction path. A fact extracted from message content cites the exact message and text span it came from, an interviewed claim cites the transcript turn, and a human edit cites the editor and the instruction they gave. Different sources, same standard: no claim without a citation. So for any rule you can answer, with receipts:

“Why does Seyn think we need CEO approval at $30M?” Here are the 23 deals it observed.
“Show me every message the model looked at when forming this claim.” Here they are, verbatim.
“Which version of which prompt produced this?” This one, on this date, with these token counts.
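To make the four hops concrete, here is a minimal TypeScript sketch that models the chain as linked records and walks it from a rule down to its connectors. Every type and field name in it (`ChainStore`, `inferenceLogId`, `inputEventIds`, `rawRecordId`, `connectorId`) is an assumption made for illustration, not Seyn's actual schema, and it covers only the extraction path.

```typescript
// Hypothetical shapes for the four hops; not Seyn's real schema.
interface ChainConnector { id: string; system: string }
interface ChainRawRecord { id: string; connectorId: string; payload: string }
interface ChainEvent { id: string; rawRecordId: string }
interface ChainInferenceLog { id: string; model: string; promptVersion: string; inputEventIds: string[] }
interface ChainRule { id: string; description: string; inferenceLogId: string }

interface ChainStore {
  inferenceLogs: Record<string, ChainInferenceLog>;
  events: Record<string, ChainEvent>;
  rawRecords: Record<string, ChainRawRecord>;
  connectors: Record<string, ChainConnector>;
}

interface ChainTrail {
  rule: ChainRule;
  inferenceLog: ChainInferenceLog;
  events: ChainEvent[];
  rawRecords: ChainRawRecord[];
  connectors: ChainConnector[];
}

// Look up an ID and fail loudly if the link is dangling.
function mustGet<T>(table: Record<string, T>, id: string, what: string): T {
  const row = table[id];
  if (row === undefined) {
    throw new Error(`broken chain: missing ${what} ${id}`);
  }
  return row;
}

// Drop duplicate IDs while keeping first-seen order.
function uniqueIds(ids: string[]): string[] {
  return ids.filter((id, index) => ids.indexOf(id) === index);
}

// Walk rule -> inference log -> events -> raw records -> connectors.
function walkChain(store: ChainStore, rule: ChainRule): ChainTrail {
  const inferenceLog = mustGet(store.inferenceLogs, rule.inferenceLogId, "inference log");
  const events = inferenceLog.inputEventIds.map((id) => mustGet(store.events, id, "event"));
  const rawRecords = uniqueIds(events.map((e) => e.rawRecordId))
    .map((id) => mustGet(store.rawRecords, id, "raw record"));
  const connectors = uniqueIds(rawRecords.map((r) => r.connectorId))
    .map((id) => mustGet(store.connectors, id, "connector"));
  return { rule, inferenceLog, events, rawRecords, connectors };
}
```

Each of the three questions above maps to one field of the result: the observed evidence is `events`, the verbatim messages are the `payload` values in `rawRecords`, and the prompt version lives on `inferenceLog`.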

Enforced, not promised

Three design decisions make the chain trustworthy:

First-class links. Every connection in the chain is a queryable, constrainable relation, not a free-text reference.
Restricted deletion. You cannot delete a raw record that an event depends on, or an event that an inference depends on. Deleting evidence out from under a claim is a database error, not a policy.
One inference choke point. Every LLM call in the platform passes through a single central function that records input event IDs as part of the call. A model call that bypasses provenance is not possible by construction. See Observability.
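The choke-point idea can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not Seyn's implementation: `createInferenceGateway`, `InferenceRequest`, and `InferenceLogEntry` are hypothetical names, the log is kept in memory, and the model call is a synchronous stub to keep the example short (a real one would be asynchronous and would also record the prompt version and token counts).

```typescript
// Hypothetical request and log shapes for this sketch.
interface InferenceRequest { model: string; prompt: string; inputEventIds: string[] }
interface InferenceLogEntry { id: number; model: string; inputEventIds: string[]; calledAt: string }
type ModelCall = (model: string, prompt: string) => string;

// The gateway is the only code that holds `callModel`. Callers receive
// `infer`, so every model call they make leaves a log entry behind.
function createInferenceGateway(callModel: ModelCall) {
  const log: InferenceLogEntry[] = [];

  function infer(req: InferenceRequest): { output: string; logId: number } {
    // Record provenance before calling, so a failed call is still traceable.
    const entry: InferenceLogEntry = {
      id: log.length + 1,
      model: req.model,
      // Copy the IDs so later mutation by the caller cannot rewrite the log.
      inputEventIds: req.inputEventIds.slice(),
      calledAt: new Date().toISOString(),
    };
    log.push(entry);
    return { output: callModel(req.model, req.prompt), logId: entry.id };
  }

  // Expose the log read-only alongside the gateway.
  function entries(): readonly InferenceLogEntry[] {
    return log;
  }

  return { infer, entries };
}
```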

This is deliberately inconvenient. Restricted deletion means cleanup requires walking the chain in dependency order, a price we pay so that the chain can never silently break.
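As a rough illustration of what restricted deletion implies for cleanup, the TypeScript sketch below mimics the behavior in memory: deletes fail while dependents exist, and a purge has to remove the inference first, then its events, then the raw records. In a relational database this kind of rule is typically expressed with restricting foreign keys; the `EvidenceDb` shape and the function names here are hypothetical, rules are left out for brevity, and the sketch hard-deletes rows purely to show the ordering.

```typescript
// In-memory stand-in for three linked tables; names are hypothetical.
interface EvidenceDb {
  rawRecords: { id: string }[];
  events: { id: string; rawRecordId: string }[];
  inferences: { id: string; inputEventIds: string[] }[];
}

// Refuse to delete a raw record while any event still points at it.
function deleteRawRecord(db: EvidenceDb, id: string): void {
  if (db.events.some((e) => e.rawRecordId === id)) {
    throw new Error(`raw record ${id} still has dependent events`);
  }
  db.rawRecords = db.rawRecords.filter((r) => r.id !== id);
}

// Refuse to delete an event while any inference still lists it as input.
function deleteEvent(db: EvidenceDb, id: string): void {
  if (db.inferences.some((i) => i.inputEventIds.indexOf(id) !== -1)) {
    throw new Error(`event ${id} still has dependent inferences`);
  }
  db.events = db.events.filter((e) => e.id !== id);
}

// Cleanup in dependency order: the inference, then its events, then raw
// records. Anything still shared with other evidence is left in place.
function purgeInference(db: EvidenceDb, inferenceId: string): void {
  const inference = db.inferences.filter((i) => i.id === inferenceId)[0];
  if (inference === undefined) return;
  db.inferences = db.inferences.filter((i) => i.id !== inferenceId);

  for (const eventId of inference.inputEventIds) {
    const event = db.events.filter((e) => e.id === eventId)[0];
    if (event === undefined) continue;
    const usedElsewhere = db.inferences.some((i) => i.inputEventIds.indexOf(eventId) !== -1);
    if (usedElsewhere) continue;
    deleteEvent(db, eventId);
    const rawStillUsed = db.events.some((e) => e.rawRecordId === event.rawRecordId);
    if (!rawStillUsed) deleteRawRecord(db, event.rawRecordId);
  }
}
```

Calling `deleteRawRecord` before its events are gone throws, which is the in-memory analogue of the database error described above.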

Where you see it

Surface	What you get
Dashboard	Rule detail pages drill down through evidence → inference log → events.
API / SDK	`client.rules.provenance(ruleId)` returns the entire chain in one call.
MCP	The `get_rule_provenance` tool gives agents the same drill-down, so your AI can show receipts too.
Chat	Citations on every answer resolve to rules, and from rules to the full chain.

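```typescript
// Assumes `client` is an initialized SDK client and `rule` was fetched earlier.
const trail = await client.rules.provenance(rule.id);

console.log(trail.rule.description);
// The inference log may be absent, so fall back to a placeholder.
console.log(`Produced by ${trail.inferenceLog?.model ?? "unknown"}`);
console.log(`from ${trail.sourceEvents.length} events`);
console.log(`across ${trail.rawRecords.length} raw records`);
```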

Provenance responses are intentionally large; a well-evidenced rule may carry dozens of events and records. If you only need to gauge evidence strength, count the arrays instead of rendering them.
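A small helper can reduce a provenance response to counts without rendering it. The sketch below assumes only the fields used in the snippet above (`inferenceLog`, `sourceEvents`, `rawRecords`); the `TrailLike` and `EvidenceSummary` types and the `summarizeEvidence` function are hypothetical names, not part of the SDK.

```typescript
// Minimal slice of the provenance response; element types are left opaque.
interface TrailLike {
  inferenceLog?: { model: string } | null;
  sourceEvents: unknown[];
  rawRecords: unknown[];
}

interface EvidenceSummary {
  events: number;
  rawRecords: number;
  hasInferenceLog: boolean;
}

// Count the evidence arrays instead of iterating over their contents.
function summarizeEvidence(trail: TrailLike): EvidenceSummary {
  return {
    events: trail.sourceEvents.length,
    rawRecords: trail.rawRecords.length,
    hasInferenceLog: trail.inferenceLog != null,
  };
}
```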

Common mistakes

Symptom	Cause	Fix
Provenance looks “thin” (1–2 events) for a high-confidence rule	Confidence reflects model certainty, not evidence volume	Weigh `confidence`, `frequency`, and evidence count together
Expecting provenance for pattern metrics	Pattern metrics are SQL aggregates over events, not LLM outputs; they have no inference log by design	Query the events endpoints directly for the underlying data
Trying to delete ingested data that has dependents	Restricted deletion protects the chain	Delete in dependency order, dependents first (an inference before its events, an event before its raw record); source removals are soft-deletes
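The first row of the table, weighing `confidence`, `frequency`, and evidence count together, could look roughly like the triage function below. This is a sketch under assumptions: it treats `confidence` as a number between 0 and 1 and `frequency` as a count, and the thresholds and verdict labels are illustrative values, not Seyn defaults.

```typescript
interface RuleSignals { confidence: number; frequency: number; evidenceCount: number }
interface TriageThresholds { highConfidence: number; minEvidence: number; minFrequency: number }

// Illustrative thresholds only; tune them against your own rules.
const TRIAGE_DEFAULTS: TriageThresholds = { highConfidence: 0.9, minEvidence: 3, minFrequency: 3 };

type Verdict = "well-supported" | "confident-but-thin" | "low-confidence";

// Combine model certainty with evidence volume rather than trusting either alone.
function triageRule(signals: RuleSignals, t: TriageThresholds = TRIAGE_DEFAULTS): Verdict {
  const confident = signals.confidence >= t.highConfidence;
  const evidenced =
    signals.evidenceCount >= t.minEvidence && signals.frequency >= t.minFrequency;
  if (confident && evidenced) return "well-supported";
  if (confident) return "confident-but-thin";
  return "low-confidence";
}
```

A rule that comes back as `confident-but-thin` is the case the table describes: the model is certain, but only one or two events back it up.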

Related

Observability

The inference log and tracing that feed the chain.

Knowledge

Rules, libraries, and the review workflow that sits on top.