Seyn separates write-time work from query-time work. Ingestion and analysis run asynchronously in the background and can take minutes; querying is synchronous and reads only what’s already been extracted and indexed. Almost every “why don’t I see X?” question resolves to knowing which side of that line X is on.

The five layers

| Layer | What it does | Where it’s documented |
| --- | --- | --- |
| Connectors | Pull raw activity from client systems via delta sync, API polling, or upload. Read-only, always. | Connectors |
| Events | Transform heterogeneous payloads into one event schema; resolve actors to canonical identities. | Events |
| Extraction | Staged LLM analysis over events; produces a new knowledge library version per run. | Knowledge → Extraction |
| Knowledge | An append-only assertion substrate, projected into versioned rules and libraries, indexed four ways. | Knowledge |
| Querying | Hybrid retrieval serving chat, dashboard, MCP, and the public API. | Query |

Two cross-cutting systems run through all five layers: the provenance chain (every layer records where its outputs came from) and multi-tenancy (every row in every layer is organisation-scoped).

The write path

  1. A connector sync lands deduplicated raw records. Each sync produces a run record with per-stage counters you can watch in the dashboard.
  2. Normalizers fan raw records out into events (who did what to which entity, when) and resolve actor identities across systems.
  3. An extraction run executes the staged LLM analysis and writes a new knowledge library version, logging every model call as it goes.
  4. Indexes are generated for the new knowledge: a semantic embedding, a full-text vector, parent context references, and graph edges between related rules.
Each step is a separate durable background job with its own retry and checkpoint behaviour, so one failing stage doesn’t silently corrupt the layers below it.
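
The stage-by-stage behaviour described above can be illustrated with a minimal sketch. It is not Seyn's job runner: the stage names, `RunState`, and `run_write_path` are hypothetical, and the retry policy is an assumption. It only shows the general idea that a stage is checkpointed after it succeeds, and that a stage which exhausts its retries stops the run before later stages see partial output.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical stage names mirroring the four write-path steps.
STAGES = ["sync", "normalize", "extract", "index"]


@dataclass
class RunState:
    """Illustrative checkpoint record for one write-path run."""
    completed: list[str] = field(default_factory=list)
    attempts: dict[str, int] = field(default_factory=dict)


def run_write_path(
    state: RunState,
    handlers: dict[str, Callable[[], None]],
    max_retries: int = 2,
) -> bool:
    """Run stages in order, skipping any stage already checkpointed.

    Returns True if every stage completed, False if a stage failed
    on its first try and on all max_retries retries.
    """
    for stage in STAGES:
        if stage in state.completed:
            continue  # checkpointed by an earlier attempt at this run
        for _ in range(max_retries + 1):
            state.attempts[stage] = state.attempts.get(stage, 0) + 1
            try:
                handlers[stage]()
            except Exception:
                continue  # retry the same stage
            state.completed.append(stage)  # checkpoint only after success
            break
        else:
            return False  # stop here: later stages never run on partial data
    return True
```

Calling `run_write_path` again with the same `RunState` resumes from the first stage that has not been checkpointed, which is the property that keeps a failed stage from forcing earlier stages to rerun.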

The read path

A query, whether it comes from chat, the dashboard, MCP, or the API, hits the query pipeline: three retrieval signals run concurrently (structured, full-text, semantic), get fused, reranked, and optionally expanded through the knowledge graph and parent context. The read path never calls back into the write path. If knowledge isn’t in the active library version, no amount of querying will surface it; you need an extraction run.
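
As a rough illustration of the concurrent-then-fused shape described above, the sketch below runs three stand-in retrievers with `asyncio.gather` and merges their rankings. The page does not say which fusion method Seyn uses; reciprocal rank fusion (RRF) is swapped in here purely as a common example, and the retriever functions, rule IDs, and the constant `k=60` are all assumptions.

```python
import asyncio


# Stand-in retrievers returning hypothetical rule IDs, best match first.
async def structured_search(query: str) -> list[str]:
    return ["rule-7", "rule-2"]


async def full_text_search(query: str) -> list[str]:
    return ["rule-2", "rule-9", "rule-7"]


async def semantic_search(query: str) -> list[str]:
    return ["rule-2", "rule-4"]


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each list contributes 1 / (k + rank) per item."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


async def retrieve(query: str) -> list[str]:
    # The three signals run concurrently and are fused afterwards.
    rankings = await asyncio.gather(
        structured_search(query),
        full_text_search(query),
        semantic_search(query),
    )
    return rrf_fuse(list(rankings))


if __name__ == "__main__":
    print(asyncio.run(retrieve("who approves refunds?")))
    # ['rule-2', 'rule-7', 'rule-4', 'rule-9']
```

Nothing in this sketch writes anything or triggers extraction, which mirrors the point that the read path only ranks what the active library version already contains.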

Technology

| Concern | Choice | Why |
| --- | --- | --- |
| Database | PostgreSQL 16 + pgvector | One database for relational data, vectors, and full-text search. No sync drift between a search index and the source of truth. |
| LLM | Claude (Anthropic) | Fast models for high-volume stages, frontier model for synthesis. Every call goes through one central, logged function. |
| Reranking | Cohere Rerank 3.5 | With a passthrough fallback: querying degrades gracefully, never hard-fails. |
| Jobs | Durable background tasks | Per-org concurrency caps, checkpointed stages, cron watchdogs. |
| Auth | Clerk | Organisations map 1:1 to tenants; roles come from signed claims, never from request parameters. |
| Tracing | Self-hosted Langfuse | Full LLM traces without prompt data leaving Seyn infrastructure. |
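
The passthrough fallback mentioned for reranking can be sketched as follows. This is a hedged illustration rather than Seyn's code: the reranker is injected as a plain callable so that no real Cohere API call is assumed, and `rerank_with_fallback`, its parameters, and the "must return a full permutation" check are all hypothetical.

```python
from typing import Callable, Sequence

# Hypothetical reranker interface: given a query and candidates, return the
# candidate indices in best-first order.
Reranker = Callable[[str, Sequence[str]], list[int]]


def rerank_with_fallback(
    query: str,
    candidates: Sequence[str],
    reranker: Reranker,
    top_n: int = 5,
) -> list[str]:
    """Rerank candidates, or pass them through unchanged if reranking fails."""
    try:
        order = reranker(query, candidates)
        # Treat a malformed response the same way as an outage.
        if sorted(order) != list(range(len(candidates))):
            raise ValueError("reranker returned an invalid permutation")
        ranked = [candidates[i] for i in order]
    except Exception:
        ranked = list(candidates)  # passthrough: keep the fused order
    return ranked[:top_n]
```

With this shape, a timeout or bad response from the reranking service costs some ranking quality but still returns results, which is what "degrades gracefully" implies.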

When something looks wrong, walk the chain in order

  1. Sync first. Did the connector sync actually complete? Check the sync-run status and counters on the connector detail page. A sync that’s running or failed means the data never arrived.
  2. Extraction second. Has an extraction run completed since that sync? New events don’t affect answers until a run produces a new library version.
  3. Library third. Is the library version you expect actually active? Queries read the active version, not drafts.
  4. Query last. Use explain mode to see exactly which signals matched and how results were ranked. If a rule exists but doesn’t surface, this shows you why.
This order catches almost every “I connected a source but chat doesn’t know about it” debugging session. It’s nearly always step 1 or 2, not querying.
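
The same ordering can be written down as a small decision function. This is only a sketch of the reasoning, not a Seyn API: `PipelineStatus` and its fields are hypothetical stand-ins for values you would read off the dashboard, and the status strings are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineStatus:
    """Hypothetical snapshot of the facts the four checks rely on."""
    sync_status: str                        # e.g. "completed", "running", "failed"
    sync_finished_at: Optional[float]       # timestamp, None if never finished
    extraction_finished_at: Optional[float]
    expected_library_version: int
    active_library_version: int


def first_step_to_check(s: PipelineStatus) -> str:
    """Walk the chain in order and return the first step that looks wrong."""
    if s.sync_status != "completed" or s.sync_finished_at is None:
        return "sync"  # the data never arrived
    if (
        s.extraction_finished_at is None
        or s.extraction_finished_at < s.sync_finished_at
    ):
        return "extraction"  # no run has completed since that sync
    if s.active_library_version != s.expected_library_version:
        return "library"  # queries read the active version, not drafts
    return "query"  # upstream looks healthy; use explain mode
```

For example, a status with a completed sync at time 200 and the last extraction at time 150 returns `"extraction"`, matching the case where new events exist but no run has produced a library version from them yet.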

Events

The common schema everything is analysed in.

Query

The read path in full detail.