The Nuance Glossary

BDS surfaces a lot of small signals — channels, similarity scores, critic verdicts, citation marks. Each one means something specific, and a few of them are easy to misread. This page defines every term our research notes use, once, in plain language.

The signals tell you where to look. You decide what they mean.

Reading The Search Results Table

Channel

Which retriever(s) found a result. It is a confidence signal, not a rank — a both row can sit below a bm25 row. Read it as “how many independent methods agreed,” not “how high this will appear.”

| Value | Meaning | Trust |
| --- | --- | --- |
| both | Found by both vector (semantic) and BM25 (keyword). | Strongest: the chunk reads like the query and contains its terms. |
| vector | Semantic-only: reads like the query even when the exact words differ. | Strong for meaning-level match. |
| bm25 | Keyword-only: shares query terms but no semantic agreement. | Weakest: verify that the snippet actually relates to your question. |
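To make the Channel logic concrete, here is a minimal sketch of how a label could be derived from the two retrievers' hit sets. The function name `channel` and the inputs `vector_hits` and `bm25_hits` are hypothetical, chosen for illustration; this is not BDS's actual code.

```python
def channel(chunk_id, vector_hits, bm25_hits):
    """Label a chunk by which retriever(s) returned it (illustrative only)."""
    in_vector = chunk_id in vector_hits
    in_bm25 = chunk_id in bm25_hits
    if in_vector and in_bm25:
        return "both"
    if in_vector:
        return "vector"
    if in_bm25:
        return "bm25"
    return None  # neither retriever returned this chunk


# Hypothetical hit sets for one query.
vector_hits = {"c1", "c2"}
bm25_hits = {"c2", "c3"}
labels = {cid: channel(cid, vector_hits, bm25_hits) for cid in ["c1", "c2", "c3"]}
print(labels)
```

Note that nothing here touches rank. The label records agreement between methods, which is why a both row can still sit below a bm25 row.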

Similarity %

Context-dependent by row. For vector / both rows it's real cosine similarity (100% = identical meaning). For bm25 rows it's a keyword score on an absolute scale.

Why it matters: a 100% bm25 row can be a school-bus chunk for an aviation query. High keyword score ≠ addresses your question.
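For the vector and both rows, the number is cosine similarity shown as a percentage. A minimal sketch, assuming the percentage is simply 100 times the cosine (how BDS rounds or clamps the value is not stated here, and `cosine_percent` is a hypothetical name):

```python
import math


def cosine_percent(a, b):
    """Cosine similarity of two equal-length vectors, as a percentage."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat a zero vector as "no similarity"
    return 100.0 * dot / (norm_a * norm_b)


# Same direction, different length: cosine ignores magnitude.
same_direction = cosine_percent([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
# Orthogonal vectors share nothing.
orthogonal = cosine_percent([1.0, 0.0], [0.0, 1.0])
```

A bm25 score is computed from term overlap rather than from vectors, so it should not be compared against these numbers.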

RAG column (✓ / · )

Whether the synthesis model used a chunk. ✓ = cited in the answer. · = seen but not cited. Blank = RAG not run yet.

Why it matters: a high-rank chunk marked · means retrieval ranked it highly but the model found it useless; a low-rank chunk marked ✓ means the real answer was there. Retrieval rank ≠ answer utility.

Quality

A cosmetic 0–5 scale derived from Similarity %. Per-row it duplicates the Similarity column.

Why it matters: it doesn't add information. Safe to ignore.

Domain

Per-chunk classification from ingestion: legal, technical, financial, academic, business, medical, or general.

Why it matters: use the Domain filter to narrow a noisy result set. “General” means “no strong specialized match,” not “unclassified.”

Trust Signals

Result Critic

A heuristic engine — no LLM call — runs after every search and raises an orange banner when the retrieval shape looks untrustworthy. It is conservative: it speaks only when a pattern is unambiguous. No banner doesn't mean “perfect”; a banner means “read carefully.”

| Verdict | Trigger | What it means |
| --- | --- | --- |
| WEAK | Top match below 65%, or all results keyword-only. | The model may still answer; verify against the citations. |
| NARROW | All results come from a single file. | Limited corpus breadth for this question. |
| CORPUS_GAP | Empty results, or a genuinely weak top result. | The model likely can't answer; automatic synthesis is skipped. |
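The triggers above are simple enough to sketch as a rule-based function with no model call. Several things here are assumptions, not documented behavior: the `Result` type, the `gap_threshold` value of 40 (the glossary says only "genuinely weak"), the choice to return a single verdict, and the precedence CORPUS_GAP, then WEAK, then NARROW.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    file: str          # source file of the chunk
    channel: str       # "both", "vector", or "bm25"
    similarity: float  # percent, 0-100


def critic_verdict(results: List[Result],
                   weak_threshold: float = 65.0,
                   gap_threshold: float = 40.0) -> Optional[str]:
    """Return a verdict string, or None when no banner should be shown."""
    if not results:
        return "CORPUS_GAP"
    top = results[0].similarity  # assumes results are already sorted by rank
    if top < gap_threshold:      # hypothetical cutoff for "genuinely weak"
        return "CORPUS_GAP"
    if top < weak_threshold or all(r.channel == "bm25" for r in results):
        return "WEAK"
    if len({r.file for r in results}) == 1:
        return "NARROW"
    return None


keyword_only = [Result("a.pdf", "bm25", 90.0), Result("b.pdf", "bm25", 80.0)]
print(critic_verdict(keyword_only))
```

Returning None for the quiet case mirrors the rule that no banner does not mean "perfect": the function only says that none of its patterns matched.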

How Retrieval Works

The machinery behind the columns.

Reciprocal Rank Fusion (RRF)

BDS retrieves with two methods at once, vector (semantic) and BM25 (keyword), and fuses their ranked lists with Reciprocal Rank Fusion: a chunk's score is 1/(60 + vector_rank) + 1/(60 + bm25_rank).

Why it matters: neither method alone is sufficient. Keyword search misses paraphrase; semantic search misses exact identifiers. Fusing them is why the Channel column exists.
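The fusion formula can be sketched in a few lines. This assumes ranks start at 1 and that a chunk missing from one list simply gets no contribution from that list, which is the usual RRF convention but is not spelled out for BDS. The name `rrf_fuse` is hypothetical.

```python
def rrf_fuse(vector_ranked, bm25_ranked, k=60):
    """Fuse two ranked lists of chunk ids with Reciprocal Rank Fusion."""
    scores = {}
    for ranked in (vector_ranked, bm25_ranked):
        for rank, chunk_id in enumerate(ranked, start=1):
            # Each list contributes 1 / (k + rank) for the chunks it contains.
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_ranked = ["a", "b", "c"]
bm25_ranked = ["b", "d", "a"]
fused = rrf_fuse(vector_ranked, bm25_ranked)
print([chunk_id for chunk_id, _ in fused])
```

In this toy input, "b" (ranks 2 and 1) edges out "a" (ranks 1 and 3), and both beat the chunks found by only one method. Agreement between lists matters more than a single top placement.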

Cross-encoder re-ranker

A second-stage model (default-on) re-orders the fused pool by scoring each (query, chunk) pair jointly — by how well the chunk answers the question.

Why it matters: this is why a high-similarity or both chunk can rank below one that simply answers better. Rank, Channel, and Similarity are three different axes.
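The two-stage shape can be sketched with a pluggable pair scorer. The scorer below is a toy word-overlap function used only so the example runs on its own; it is not a cross-encoder, and a real second stage would replace it with a model that reads the query and chunk together. All names are hypothetical.

```python
def rerank(query, chunks, score_pair, top_n=None):
    """Re-order a fused pool by a joint (query, chunk) score, best first."""
    scored = [(chunk, score_pair(query, chunk)) for chunk in chunks]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored if top_n is None else scored[:top_n]


def toy_overlap_scorer(query, chunk):
    """Stand-in scorer, NOT a cross-encoder: share of query words in the chunk."""
    query_words = set(query.lower().split())
    chunk_words = set(chunk.lower().split())
    return len(query_words & chunk_words) / len(query_words) if query_words else 0.0


pool = [
    "the budget was approved in march",
    "a faulty sensor triggered the stall warning",
]
reranked = rerank("what triggered the stall warning", pool, toy_overlap_scorer)
print(reranked[0][0])
```

The point is the interface: the first stage supplies a pool, and the second stage is free to reorder it using a score that first-stage Similarity never saw.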

BM25

The keyword-ranking algorithm behind the bm25 channel. It scores how well a chunk's words overlap the query, weighting rare terms more heavily.

Why it matters: it does most of the heavy lifting in keyword-rich corpora (filings, reports) — but keyword overlap is not comprehension.
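For readers who want to see the rare-term weighting, here is a compact sketch of the standard Okapi BM25 formula. The parameters k1 = 1.5 and b = 0.75 are common defaults, and whitespace tokenization is a simplification; neither is confirmed as what BDS uses.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs if n_docs else 0.0
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count documents containing each term
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms (low document frequency) get a larger idf weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * term_freq[term] * (k1 + 1) / (term_freq[term] + length_norm)
        scores.append(score)
    return scores


docs = ["the mcas system activated", "the school bus schedule", "the bus system"]
scores = bm25_scores("mcas system", docs)
```

Here "mcas" appears in one document and "system" in two, so a match on "mcas" counts for more. The scorer has no idea what either word means, which is the sense in which keyword overlap is not comprehension.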

Vector / embedding

A numeric representation of a chunk's meaning, produced by an embedding model (BDS uses BGE-large). Two passages with similar meaning sit close together regardless of wording.

Why it matters: embeddings are what let a search match “anti-stall system” to text that only says “MCAS.” The vector and both channels come from here.

Entities & The Vault

From documents to a knowledge graph.

Entity card

A generated profile of a person, company, agency, product, or other entity found across the corpus — a summary, key facts, mention count, source list, and wikilinks to related entities. Together the cards form an Obsidian-compatible vault.

Why it matters: cards are grounded in the source documents (no fabrication), but an entity's type or boundaries can be imperfect for thinly-mentioned entities.

Subject anchor

An investigation target you name before indexing (e.g. {"name":"737 MAX","type":"PRODUCT"}). Anchors are protected from dedup so the central subject of a case can never silently vanish.

Why it matters: a configured anchor with under 50 mentions uses your declared type (too few mentions for reliable inference). A configured anchor that isn't found becomes a visible 0-mention placeholder, not a silent absence.
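The two rules above can be sketched as a small resolver. The function name, the returned dictionary shape, and the `inferred_type` input are hypothetical. The sketch also assumes that an anchor with 50 or more mentions takes the inferred type, which the text implies but does not state.

```python
def resolve_anchor(anchor, mention_count, inferred_type=None, min_mentions=50):
    """Decide the card type for a configured subject anchor (illustrative)."""
    if mention_count == 0:
        # Not found in the corpus: keep a visible placeholder, never drop it.
        return {"name": anchor["name"], "type": anchor["type"],
                "mentions": 0, "placeholder": True}
    if mention_count < min_mentions or inferred_type is None:
        entity_type = anchor["type"]  # too few mentions: trust the declared type
    else:
        entity_type = inferred_type   # assumption: inference wins when well supported
    return {"name": anchor["name"], "type": entity_type,
            "mentions": mention_count, "placeholder": False}


anchor = {"name": "737 MAX", "type": "PRODUCT"}
print(resolve_anchor(anchor, 12, inferred_type="ORG"))
```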

related_anchors

A card frontmatter field listing other subject anchors that share the same investigation: entities that co-occur, not aliases.

Why it matters: Boeing's card listing FAA and NTSB under related_anchors means they appear together in the case, not that they're the same entity.

The share of [[wikilinks]] between cards that resolve to a real target after the quality pass. A QA stage repairs links that point at near-miss or surface-variant names.

Why it matters: it's our single best structural-integrity metric for a vault. A high value means the entity-to-card mapping is consistent and the graph is navigable.
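A minimal sketch of how such a share could be measured over a vault held in memory. It assumes a link resolves when its target exactly matches a card name (case-sensitive), and that `[[Target|alias]]` and `[[Target#heading]]` forms point at `Target`. The names `WIKILINK` and `resolution_rate` are hypothetical.

```python
import re

# Captures the target of [[Target]], [[Target|alias]], or [[Target#heading]].
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:[#|][^\]]*)?\]\]")


def resolution_rate(cards):
    """cards maps card name -> body text; returns resolved / total links."""
    names = set(cards)
    total = 0
    resolved = 0
    for body in cards.values():
        for match in WIKILINK.finditer(body):
            total += 1
            if match.group(1).strip() in names:
                resolved += 1
    return resolved / total if total else None  # None: the vault has no links


vault = {
    "Boeing": "Regulated by [[FAA]] and investigated by [[NTSB]].",
    "FAA": "Certified aircraft built by [[Boeing|the manufacturer]] and [[Boing Co]].",
}
rate = resolution_rate(vault)
print(rate)
```

In this toy vault, `[[NTSB]]` has no card and `[[Boing Co]]` is a surface variant, so half the links resolve. The second kind is what a repair stage would be expected to fix.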

Synthesis (RAG)

The answer panel, and how to read it.

RAG / synthesized answer

Retrieval-Augmented Generation: the model writes a prose answer from the retrieved chunks only, citing the ones it used (the ✓ marks).

Why it matters: RAG is convenience, not authority. The prose can sound confident while drawing on weak context — always read the source citations.

Auto Synth

A default-on checkbox that runs RAG automatically after each search. It auto-skips when the Result Critic returns CORPUS_GAP — the model couldn't ground an answer anyway.

Why it matters: uncheck it for fast exploratory scanning when you only want the result table.

Language drift

A known limitation of the Qwen synthesis model: on corpora with multilingual content it can switch into another language (usually Chinese) mid-answer. It is stochastic — the same query drifts on one run and not the next — not a function of corpus size. (The behavior was characterized on the earlier 7B configuration; the production text model is now a larger 14B, with the same guards in place.)

Why it matters: BDS instructs the model to answer in English and runs a post-generation language check that regenerates the answer if drift is detected, so it rarely reaches you. When it does, the facts are usually still correct — a presentation glitch, not a content error.
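The check-and-regenerate loop can be sketched as follows. This is an illustration under stated assumptions, not the production guard: drift is detected by the share of CJK ideographs among letters, and the 5% threshold, the retry cap of 3, and the `generate` callable are all hypothetical.

```python
def cjk_ratio(text):
    """Share of alphabetic characters that are CJK unified ideographs."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    cjk = sum(1 for ch in letters if "\u4e00" <= ch <= "\u9fff")
    return cjk / len(letters)


def answer_with_drift_guard(generate, max_attempts=3, max_cjk_ratio=0.05):
    """Call generate() until the answer passes the language check."""
    answer = ""
    for _ in range(max_attempts):
        answer = generate()
        if cjk_ratio(answer) <= max_cjk_ratio:
            return answer
    return answer  # every attempt drifted; the last one is returned as-is


# Fake generator: drifts on the first call, answers in English on the second.
attempts = iter(["The sensor 故障导致了系统激活", "The sensor fault triggered the system."])
result = answer_with_drift_guard(lambda: next(attempts))
print(result)
```

Because the retry count is bounded, a drifted answer can still get through when every attempt drifts, which is consistent with drift being rare rather than impossible.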

Corpus-shape trap

The defining property of retrieval: BDS answers from what the corpus contains, not from what the answer should be. When your question isn't covered, search still returns its closest chunks — which can look relevant and aren't.

Why it matters: it's the most common reason an analyst loses trust in a system that's working correctly.

One Page, Every Term

Bookmark this. The research notes link back to it.

Every lab note on this blog uses these terms in their precise sense. When a post says a result is all bm25, or that the Critic fired CORPUS_GAP, this is what it means — and why it's worth reading carefully.