Knowledge - Seyn

Knowledge is what Seyn exists to produce. Its design follows one principle: extracted knowledge is a claim, not a fact. Every claim carries the machinery to be questioned: evidence, confidence, review state, and history. Everything Seyn knows fits into three layers: a substrate that records claims (assertions), projections that make them readable (rules, libraries), and indexes that make them findable (embeddings, full-text, the entity graph).

Assertions: the unit of knowledge

The atom of Seyn’s knowledge is an assertion: one claim, recorded append-only, with a precise shape. Every assertion is a subject (a process, a step, an actor, a tool, an entity) plus a typed predicate about it. Not prose; structure: “about the contract step of the deal pipeline, we claim: deals above $30M require CEO approval.”

Assertions are never edited. When understanding changes, a new assertion supersedes the old one, and both timelines are kept: when something was true in the business, and when Seyn came to believe it. That’s what makes questions like “what did we know on March 1st, and why did it change?” answerable instead of archaeological. Three properties do the heavy lifting:

Typed claims. The predicate is a typed statement: a policy, an SLA, a step ordering, a role, a tool usage, or a workaround (including whether it’s formal, documented, or tribal). The type determines what can be checked against live events later.
Source attribution, with reasoning. Every assertion records who asserted it (the extraction pipeline, a structured interview, a human edit, or an import) and why: extracted claims carry observed frequency and counter-example counts; interviewed claims carry how confident the interviewee sounded. Knowledge observed in the data and knowledge contributed by people are first-class peers, distinguishable forever.
Two timelines. Validity intervals plus supersession links mean the current view is just one cut through the history, not the only one.
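As a rough illustration of the shape described above, the sketch below models an assertion as an immutable record with both timelines and a supersession link. The field names, the dates, and the `known_as_of` helper are assumptions made for this example, not Seyn's actual schema or API.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)  # frozen: an assertion is never edited after it is recorded
class Assertion:
    id: str
    subject: str                      # e.g. a process step
    predicate_type: str               # e.g. "policy", "sla", "role"
    claim: str
    source: str                       # "extraction", "interview", "human_edit", "import"
    valid_from: date                  # when it became true in the business
    recorded_at: date                 # when the system came to believe it
    supersedes: Optional[str] = None  # id of the assertion this one replaces


def known_as_of(log: list[Assertion], subject: str, when: date) -> Optional[Assertion]:
    """Return the assertion about `subject` that was current in the log on `when`."""
    visible = [a for a in log if a.subject == subject and a.recorded_at <= when]
    superseded = {a.supersedes for a in visible if a.supersedes}
    current = [a for a in visible if a.id not in superseded]
    return max(current, key=lambda a: a.recorded_at, default=None)


# Hypothetical history: the threshold moved from $20M to $30M.
log = [
    Assertion("a1", "deal-pipeline/contract", "policy",
              "deals above $20M require CEO approval", "extraction",
              valid_from=date(2024, 1, 1), recorded_at=date(2024, 1, 15)),
    Assertion("a2", "deal-pipeline/contract", "policy",
              "deals above $30M require CEO approval", "human_edit",
              valid_from=date(2024, 4, 1), recorded_at=date(2024, 4, 10),
              supersedes="a1"),
]

march = known_as_of(log, "deal-pipeline/contract", date(2024, 3, 1))
may = known_as_of(log, "deal-pipeline/contract", date(2024, 5, 1))
```

Because the old record is still in the log, the March question is answered by filtering on `recorded_at` rather than by reconstructing lost state.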

Claims enter the substrate from more than synthesis: facts extracted directly from message and document content cite the exact text span they came from, and human corrections flow through knowledge edits, an audited propose-then-apply log that records the instruction, the proposed mutation, and exactly which assertions it touched. Even correcting Seyn leaves a receipt. You’ll mostly meet assertions indirectly: rules, libraries, chat answers, and query results are all views over the assertion substrate. The substrate is why those views can be versioned, reverted, and audited.
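The propose-then-apply flow for knowledge edits can be pictured roughly as follows. This is a minimal sketch under assumed names (`KnowledgeEdit`, `apply_edit`, and the dict-based substrate are all hypothetical); it only shows how an edit could record its instruction, its proposed mutation, and the assertions it touched while appending rather than rewriting.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeEdit:
    instruction: str                  # what the human asked for
    proposed_claim: str               # the proposed mutation
    target_ids: list[str]             # assertions the edit would supersede
    status: str = "proposed"
    touched_ids: list[str] = field(default_factory=list)


def apply_edit(edit: KnowledgeEdit, substrate: list[dict]) -> dict:
    """Apply a proposed edit by appending a superseding assertion."""
    if edit.status != "proposed":
        raise ValueError("edit has already been applied")
    new_assertion = {
        "id": f"a{len(substrate) + 1}",
        "claim": edit.proposed_claim,
        "source": "human_edit",
        "supersedes": list(edit.target_ids),
    }
    substrate.append(new_assertion)   # existing entries are left untouched
    edit.touched_ids = edit.target_ids + [new_assertion["id"]]
    edit.status = "applied"
    return new_assertion


substrate = [{"id": "a1", "claim": "deals above $20M require CEO approval",
              "source": "extraction", "supersedes": []}]
edit = KnowledgeEdit(
    instruction="Raise the CEO approval threshold to $30M",
    proposed_claim="deals above $30M require CEO approval",
    target_ids=["a1"],
)
apply_edit(edit, substrate)
```

After the call, the edit record itself is the receipt: it names the instruction, the new claim, and every assertion id involved.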

Processes and rules

A process is a named sequence of steps an entity flows through (deal-pipeline, loan-approval). A process rule is the readable unit you work with day to day: one statement, pinned to a process and optionally a step, projected from the underlying assertions. The full field reference is on Core Concepts.

Conditions: three kinds of logic

Each rule carries a structured condition, because not all business logic is the same shape:

Kind	What it expresses	Example
`deterministic`	Mechanically checkable logic	`deal.amount > 30_000_000`
`llm`	Judgment calls that need a model to evaluate	“the communication suggests finance has concerns”
`compound`	Boolean combinations of the above	deterministic threshold OR llm-evaluated flag

The distinction matters downstream: deterministic conditions can be checked against live events mechanically, which is what powers outcome tracking; llm conditions are flagged as requiring evaluation.
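One way to picture the three kinds is a small evaluator that returns `True`, `False`, or `None` when a model evaluation is needed but not available. The dict layout, the `evaluate` function, and the `llm_judge` callback are assumptions for illustration; the page does not specify how Seyn represents or evaluates conditions internally.

```python
from typing import Any, Callable, Optional

Condition = dict[str, Any]  # illustrative representation, not Seyn's schema
Judge = Callable[[str, dict], bool]


def evaluate(cond: Condition, event: dict, llm_judge: Optional[Judge] = None) -> Optional[bool]:
    """Return True or False, or None if the condition still requires evaluation."""
    kind = cond["kind"]
    if kind == "deterministic":
        return cond["check"](event)
    if kind == "llm":
        if llm_judge is None:
            return None  # flagged as requiring evaluation
        return llm_judge(cond["prompt"], event)
    if kind == "compound":
        results = [evaluate(c, event, llm_judge) for c in cond["operands"]]
        if cond["op"] == "or":
            if any(r is True for r in results):
                return True
            return None if any(r is None for r in results) else False
        if cond["op"] == "and":
            if any(r is False for r in results):
                return False
            return None if any(r is None for r in results) else True
    raise ValueError(f"unsupported condition: {cond!r}")


threshold = {"kind": "deterministic",
             "check": lambda e: e["deal"]["amount"] > 30_000_000}
concerns = {"kind": "llm",
            "prompt": "the communication suggests finance has concerns"}
rule = {"kind": "compound", "op": "or", "operands": [threshold, concerns]}

big_deal = evaluate(rule, {"deal": {"amount": 45_000_000}})
small_deal = evaluate(rule, {"deal": {"amount": 1_000_000}})
```

In this sketch a large deal is decided by the deterministic branch alone, while a small deal stays undecided until a judge is supplied.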

The review workflow

Every freshly extracted rule is inferred: the model produced it, no human has looked. Reviewers move rules to confirmed, modified (edited then accepted), or rejected.

Rejected rules are kept, not deleted. A rejection is information. It tunes what the org considers noise, and because the substrate is append-only, the audit trail of what the model claimed and a human overruled survives.
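The review states can be sketched as a small transition function. The `review` helper and the dict fields are hypothetical, and the sketch assumes that only `inferred` rules are reviewable, which the text above does not state either way.

```python
from typing import Optional

# Assumption: only freshly inferred rules can be reviewed in this sketch.
ALLOWED = {"inferred": {"confirmed", "modified", "rejected"}}


def review(rule: dict, decision: str, reviewer: str, new_text: Optional[str] = None) -> dict:
    """Return a reviewed copy of `rule`; the input rule is left unchanged."""
    if decision not in ALLOWED.get(rule["status"], set()):
        raise ValueError(f"cannot move a rule from {rule['status']} to {decision}")
    if decision == "modified" and not new_text:
        raise ValueError("a modified rule needs its edited text")
    updated = dict(rule)
    updated["status"] = decision
    if decision == "modified":
        updated["text"] = new_text
    # Keep a trail of who decided what, including rejections.
    updated["history"] = rule.get("history", []) + [
        {"from": rule["status"], "to": decision, "by": reviewer}
    ]
    return updated


inferred = {"text": "deals above $30M require CEO approval", "status": "inferred"}
rejected = review(inferred, "rejected", reviewer="alice")
```

The rejected rule keeps its original text and gains a history entry, so what the model claimed and who overruled it both remain visible.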

Libraries, versions, and time travel

A library is a versioned view of everything Seyn knows about your organisation. Each extraction run produces a new version; versions are monotonically increasing and old versions stay accessible. Libraries are draft, active, or archived; queries read the active one. Because libraries are projections over an append-only substrate, versioning goes further than snapshots:

As-of queries. Read the library as it stood at any timestamp.
Named versions. Tag a known-good state (“post-onboarding-v1”) and refer to it later.
Revert. Roll the active view back to any tagged state, atomically, without losing the history of what came after.

Last quarter’s “$20M needs approval” becoming this quarter’s “$30M” is a feature, and the history of that change is part of the knowledge.
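To make as-of queries, named versions, and revert concrete, here is a minimal in-memory sketch. The `LibraryHistory` class, its method names, and the dates are assumptions for illustration, and it assumes calls arrive in chronological order; it is not Seyn's API.

```python
from bisect import bisect_right
from datetime import date


class LibraryHistory:
    def __init__(self) -> None:
        self._versions: list[dict[str, str]] = []      # version n lives at index n - 1
        self._tags: dict[str, int] = {}
        self._active_log: list[tuple[date, int]] = []  # append-only: (when, active version)

    def publish(self, when: date, rules: dict[str, str]) -> int:
        """Store a new version and make it the active one."""
        self._versions.append(dict(rules))
        version = len(self._versions)
        self._active_log.append((when, version))
        return version

    def tag(self, name: str, version: int) -> None:
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"unknown version {version}")
        self._tags[name] = version

    def revert(self, name: str, when: date) -> int:
        """Point the active view at a tagged version; later versions are kept."""
        version = self._tags[name]
        self._active_log.append((when, version))
        return version

    def version(self, number: int) -> dict[str, str]:
        return self._versions[number - 1]

    def as_of(self, when: date) -> dict[str, str]:
        """Return the rules that were active at `when`."""
        times = [t for t, _ in self._active_log]
        i = bisect_right(times, when)
        if i == 0:
            raise LookupError("no library existed at that time")
        return self._versions[self._active_log[i - 1][1] - 1]


lib = LibraryHistory()
v1 = lib.publish(date(2024, 1, 15), {"ceo-approval": "deals above $20M need approval"})
lib.tag("post-onboarding-v1", v1)
v2 = lib.publish(date(2024, 4, 10), {"ceo-approval": "deals above $30M need approval"})
lib.revert("post-onboarding-v1", date(2024, 6, 1))
```

Revert here is just another append to the log of active versions, which is why version 2 and the period in which it was active both stay queryable.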

How knowledge is extracted

Extraction is structured as four staged LLM passes, each narrowing the problem for the next. Asking one model call to “find all the processes in 50,000 events” produces fiction; asking it to label one cluster of related events produces evidence.

Stage	Question it answers	Model tier
Cluster	Which events belong together?	Fast
Sequence	In what order do things happen?	Fast
Exceptions	When does reality deviate from the pattern?	Fast
Synthesis	What are the processes and rules?	Frontier

The economics are deliberate: the three high-volume stages run on a fast model tier, and only synthesis, the stage whose words people actually read, gets the frontier model. Prompts are versioned, every call is centrally logged with its input events (that’s the provenance link), and every run is fully traced. See Observability.

Real corpora are bigger than any context window, so extraction assumes scale: batched clustering with bounded concurrency, content-hash deduplication before the model ever sees repeats, prompt caching that cuts repeat-batch cost to roughly a tenth, checkpointed stages that resume instead of restarting, and per-org concurrency caps so one tenant’s run can’t starve another’s. Cost scales with new events, not corpus size. Re-running extraction on the same evidence updates assertions instead of duplicating them.
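Content-hash deduplication, one of the scale measures listed above, can be sketched with the standard library alone. Hashing the whole event as canonical JSON is an assumption made for this example; which fields Seyn actually hashes is not stated here, and `content_hash` and `new_events` are hypothetical names.

```python
import hashlib
import json


def content_hash(event: dict) -> str:
    """Hash a canonical JSON form so key order does not change the digest."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def new_events(events: list[dict], seen: set[str]) -> list[dict]:
    """Return only events whose content has not been seen, recording their hashes."""
    fresh = []
    for event in events:
        digest = content_hash(event)
        if digest not in seen:
            seen.add(digest)
            fresh.append(event)
    return fresh


seen: set[str] = set()
batch = [
    {"type": "email", "body": "please approve"},
    {"body": "please approve", "type": "email"},  # same content, different key order
    {"type": "ticket", "body": "escalate"},
]
first_run = new_events(batch, seen)
second_run = new_events(batch, seen)  # re-running on the same evidence
```

On the second run nothing is fresh, so no model call would be needed for that batch, which is the sense in which cost follows new events.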

What extraction doesn’t claim

It extracts what the data shows. Knowledge that lives only in people’s heads enters through the other assertion sources: interviews and human edits, recorded as such.
Fresh rules are inferred, not endorsed. Every extracted rule enters the human review workflow.
Confidence is a heuristic. Pair it with frequency and review status before acting on a rule.
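The last point can be read as a gate that never looks at confidence alone. The sketch below is one possible policy, not a recommendation from Seyn: the function name, the field names, and both thresholds are hypothetical.

```python
def safe_to_act_on(rule: dict, min_confidence: float = 0.8, min_frequency: int = 10) -> bool:
    """Combine review status, confidence, and observed frequency before acting."""
    status = rule["status"]
    if status == "rejected":
        return False  # a human overruled the model
    if status in {"confirmed", "modified"}:
        return True   # a human has endorsed the rule
    # Still inferred: require strong confidence and enough observations.
    return rule["confidence"] >= min_confidence and rule["frequency"] >= min_frequency


confident_but_rare = {"status": "inferred", "confidence": 0.95, "frequency": 2}
confident_and_common = {"status": "inferred", "confidence": 0.95, "frequency": 40}
reviewed = {"status": "confirmed", "confidence": 0.55, "frequency": 3}
```

A high-confidence rule seen only twice is held back in this policy, while a reviewed rule passes regardless of its score.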

The query-facing shape

When knowledge is written, it’s indexed four ways so querying can find it from different directions:

a literal embedding of what the claim says,
an inferred embedding of what the claim implies, so queries match meaning, not just wording,
a full-text vector for exact-term matching,
the entity graph: people, tools, and entities linked to the assertions that mention them, for “what’s connected to this?” traversal.
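The entity graph, the last of the four indexes, is the easiest to sketch without a model or a database. The example below assumes each assertion carries a `mentions` list and uses invented entity names; the helper functions are hypothetical, and the embedding and full-text indexes are not modelled.

```python
from collections import defaultdict


def build_entity_index(assertions: list[dict]) -> dict[str, set[str]]:
    """Map each entity to the ids of the assertions that mention it."""
    index: dict[str, set[str]] = defaultdict(set)
    for assertion in assertions:
        for entity in assertion["mentions"]:
            index[entity].add(assertion["id"])
    return index


def connected_to(entity: str, assertions: list[dict], index: dict[str, set[str]]) -> set[str]:
    """Return entities that share at least one assertion with `entity`."""
    by_id = {a["id"]: a for a in assertions}
    related: set[str] = set()
    for assertion_id in index.get(entity, ()):
        related.update(by_id[assertion_id]["mentions"])
    related.discard(entity)
    return related


assertions = [
    {"id": "a1", "mentions": ["CEO", "deal-pipeline"]},
    {"id": "a2", "mentions": ["deal-pipeline", "CRM"]},
    {"id": "a3", "mentions": ["loan-approval", "credit team"]},
]
index = build_entity_index(assertions)
neighbours = sorted(connected_to("deal-pipeline", assertions, index))
```

This is a one-hop traversal; answering broader "what's connected to this?" questions would repeat the same step from each neighbour.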

Related

Query

How the four indexes get searched and fused.

Provenance

How every claim stays auditable.