Query - Seyn

Every question asked of Seyn, whether in chat, through the API, or via MCP, runs through one query pipeline. Its design principle: no single search signal is trustworthy alone. Semantic similarity misses exact terms (“error 4032”, “Acme Corp”); keyword search misses paraphrase; structured filters miss everything they weren’t told about; and none of them know whether a rule actually holds in practice. Seyn runs them all and lets them vote. This is the part to be precise about: Seyn is not a vector database with a chat wrapper. Vector similarity is one signal among up to five, fused by rank, re-scored by a cross-encoder, and expanded through the entity and knowledge graph, over knowledge that was extracted, structured, and human-reviewed before any query ever touched it.

The pipeline

Step	What it does	Why it’s there
Classify	A heuristic (deliberately not an LLM) picks a strategy and extracts filters	Classification runs on every query; cheap and deterministic beats smart and slow here
Expand	The query is reformulated into variants, each embedded	One phrasing of a question shouldn’t decide what’s findable
Structured	SQL conditions on rule fields	Exact filters: process, step, confidence, review status
Full-text	BM25 ranking	Catches literal tokens semantics miss: IDs, names, error codes
Literal semantic	Vector similarity against what each claim says	Catches paraphrase keyword search misses
Inferred semantic	Vector similarity against what each claim implies	“Who can sign this?” should find an approval rule that never uses the word “sign”
Outcome signal	Boosts candidates whose rules observably hold in live events	Knowledge that’s followed in practice outranks knowledge that’s merely written down
Rank fusion	Merges the rank streams by position, not by score	Scores from different signals aren’t comparable; ranks are
Rerank	A cross-encoder re-scores query↔result pairs	Fusion gets the right candidates into the pool; reranking gets the best ones to the top
Graph + parent expansion	Walks related knowledge and attaches surrounding process context	A rule without its process context is hard to interpret; the related rule one hop away is often the actual answer

The signal set is adaptive: simple lookups don’t pay for expansion and inference, broad questions do. Everything runs inside one PostgreSQL instance: vectors, full-text, and relational filters live in the same database as the source of truth. That rules out drift between a separate search index and the data it mirrors, and it’s a large part of why retrieval stays sub-second.
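
The rank fusion step in the table above can be illustrated with reciprocal rank fusion (RRF), a common position-based method. This page does not say which formula Seyn uses, so RRF itself, the constant `k=60`, the stream names, and the rule IDs below are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(streams, k=60):
    """Fuse ranked ID lists by position only; raw scores never enter.

    `streams` maps a signal name to its ranked list of IDs (best first).
    `k` damps the advantage of top positions; 60 is a conventional value,
    not a documented Seyn setting.
    """
    fused = defaultdict(float)
    for ranked_ids in streams.values():
        for rank, item_id in enumerate(ranked_ids, start=1):
            fused[item_id] += 1.0 / (k + rank)
    # Highest fused value first; ties broken by ID so output is deterministic.
    return sorted(fused, key=lambda item_id: (-fused[item_id], item_id))


# Hypothetical rank streams from three signals.
streams = {
    "full_text": ["rule-7", "rule-2", "rule-9"],
    "literal_semantic": ["rule-2", "rule-7", "rule-4"],
    "inferred_semantic": ["rule-2", "rule-4"],
}
fused_order = reciprocal_rank_fusion(streams)
print(fused_order)
```

A candidate that appears near the top of several streams (`rule-2` here) rises above one that tops a single stream, which is the "voting" behaviour described at the start of this page.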

Degrades, never hard-fails

Querying degrades gracefully when optional dependencies are missing:

Missing	Behaviour
Embedding provider	Semantic signals contribute nothing; full-text and structured signals still answer
Reranker	A passthrough preserves fusion order
Outcome data	The outcome signal contributes nothing until rules have observations

A degraded answer beats an error page, and the explain output tells you when you’re getting one.
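
The degradation behaviour in the table above can be sketched as follows. This is a minimal illustration under stated assumptions, not Seyn's implementation: the function names, the convention of marking a missing dependency with `None`, and the `degraded` list are all hypothetical.

```python
def collect_streams(query, signals):
    """Run every available signal; skip the ones whose dependency is missing.

    `signals` maps a signal name to a callable returning ranked IDs,
    or to None when the dependency (e.g. an embedding provider) is absent.
    """
    streams = {}
    degraded = []
    for name, signal in signals.items():
        if signal is None:
            # Contributes nothing, but is recorded so the caller can report it.
            degraded.append(name)
            continue
        streams[name] = signal(query)
    return streams, degraded


def passthrough_rerank(query, candidates):
    """Stand-in used when no reranker is configured: keep fusion order."""
    return list(candidates)


def rerank(query, candidates, reranker=None):
    """Use the configured reranker if there is one, else the passthrough."""
    return (reranker or passthrough_rerank)(query, candidates)


# Hypothetical setup: no embedding provider, no reranker.
signals = {
    "full_text": lambda q: ["rule-7", "rule-2"],
    "literal_semantic": None,
    "inferred_semantic": None,
}
streams, degraded = collect_streams("error 4032", signals)
final = rerank("error 4032", streams["full_text"])
print(final, degraded)
```

The query still returns the full-text results in their original order, and the `degraded` list carries the kind of information the explain output is described as surfacing.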

Choosing a strategy

The classifier picks automatically (`auto`), but API callers can force one:

Strategy	Use when	Tradeoff
`structured`	The query maps to known fields (“rules in the loan process with confidence above 0.8”)	Precise, but blind to semantics
`hybrid`	Natural-language questions. The default workhorse	Balanced; costs a rerank call
`graph`	Exploring connections (“what’s related to this approval rule?”)	Surfaces structure; weaker for direct Q&A
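
To make the idea of a heuristic, non-LLM classifier concrete, here is a toy sketch. The regular expressions, the filter names `min_confidence` and `process`, and the trigger phrases are invented for illustration; Seyn's actual rules are not described on this page.

```python
import re


def classify(query):
    """Pick a strategy and extract filters with cheap, deterministic rules."""
    q = query.lower()
    filters = {}

    # "confidence above 0.8" -> numeric threshold filter (hypothetical name).
    match = re.search(r"confidence (?:above|over|>) ?(\d*\.?\d+)", q)
    if match:
        filters["min_confidence"] = float(match.group(1))

    # "in the loan process" -> process filter (hypothetical name).
    match = re.search(r"in the (\w+) process", q)
    if match:
        filters["process"] = match.group(1)

    # Connection-style questions go to graph exploration.
    if re.search(r"\b(?:related|connected) to\b", q):
        return "graph", filters
    # Anything that mapped cleanly onto fields is a structured lookup.
    if filters:
        return "structured", filters
    # Everything else is a natural-language question.
    return "hybrid", filters


print(classify("rules in the loan process with confidence above 0.8"))
print(classify("what's related to this approval rule?"))
print(classify("Who can sign this?"))
```

Because the rules are plain pattern matches, the same query always gets the same strategy, and the cost per query is negligible compared with a model call.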

Explain mode

Pass `explain=true` (or use the dashboard’s Query Explorer) and every result carries its attribution: which signals matched it, at what ranks, and what fusion and reranking did to its position. When someone asks “why did chat say that?”, the explain output is the answer. It’s the query-side analogue of the provenance chain.
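
As a rough picture of what per-result attribution involves, the sketch below derives it from the rank streams, the fused order, and the final reranked order. The record shape and the field names (`signals`, `fused_rank`, `final_rank`) are assumptions, not the documented explain format.

```python
def build_attribution(streams, fused_order, final_order):
    """For each final result, record which signals matched it and at what rank."""
    report = []
    for final_rank, item_id in enumerate(final_order, start=1):
        # 1-based rank of this result in every stream that contains it.
        matched = {
            name: ranked_ids.index(item_id) + 1
            for name, ranked_ids in streams.items()
            if item_id in ranked_ids
        }
        report.append({
            "id": item_id,
            "signals": matched,
            "fused_rank": fused_order.index(item_id) + 1,
            "final_rank": final_rank,
        })
    return report


# Hypothetical data: the reranker promoted rule-7 over rule-2.
streams = {
    "full_text": ["rule-7", "rule-2"],
    "literal_semantic": ["rule-2", "rule-4"],
}
fused_order = ["rule-2", "rule-7", "rule-4"]
final_order = ["rule-7", "rule-2", "rule-4"]

report = build_attribution(streams, fused_order, final_order)
for row in report:
    print(row)
```

Reading a row such as `rule-7` (matched by full-text only, fused rank 2, final rank 1) shows at a glance that reranking, not fusion, put it on top.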

Tuning heuristics

Start with `auto`. Force `hybrid` only when the classifier visibly picks the wrong strategy; check the explain output first.
Queries with literal tokens (IDs, error codes, project names) lean on full-text matching. If such queries underperform, confirm the term actually appears in rule text rather than only in raw records.
`topK` defaults are conservative. Raise it toward 20–50 for synthesis-style consumers (an agent summarising a topic); keep it low for direct Q&A.
Filter by `reviewStatus` for production integrations. Querying ranks by relevance, not by whether a human approved the rule.
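
These heuristics could be folded into a small helper that assembles query parameters. The names `topK`, `reviewStatus`, `explain`, and the strategy values come from this page; everything else is an assumption: the payload shape, the `query` and `filters` keys, the `"approved"` status value, and the specific numbers 5 and 30. Check the API reference before relying on any of them.

```python
def build_query(text, consumer="qa", strategy="auto", explain=False):
    """Assemble a hypothetical query payload following the heuristics above."""
    # Synthesis-style consumers want breadth; direct Q&A wants a short list.
    top_k = 30 if consumer == "synthesis" else 5
    return {
        "query": text,                  # assumed key name
        "strategy": strategy,           # start with auto
        "topK": top_k,
        "explain": explain,
        # Assumed filter shape and value: only human-approved rules.
        "filters": {"reviewStatus": "approved"},
    }


qa_payload = build_query("Who can sign this?")
agent_payload = build_query("Summarise the loan process", consumer="synthesis")
print(qa_payload)
print(agent_payload)
```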

Common mistakes

Symptom	Cause	Fix
A rule you can see in the dashboard never surfaces	It’s in a `draft` or `archived` library, or its indexes postdate your query	Check the active library version; re-run extraction to refresh indexes
Exact ID search returns fuzzy results	Semantic signals dominate short queries	Use the `structured` strategy or a field filter for lookups
Scores look “low” across the board	Scores are relative ranking values, not absolute confidence; the top hit is the best available, not “90% certain”	Compare within a result set, never across queries
Chat answers ignore obviously relevant knowledge	Chat queries before generating; if the query misses, generation can’t recover	Debug with Query Explorer’s explain mode, not by re-phrasing chat prompts

Related

Chat

The biggest consumer of this pipeline.

Knowledge

The substrate and indexes this pipeline searches.
