
From Confident Falsehoods to Honest Abstention: Pre-Ingestion Verification for RAG Pipelines

RAG systems inherit the epistemic quality of their source documents. If a document presents assumptions as facts or omits uncertainty markers, downstream LLMs will confidently reproduce these errors regardless of retrieval accuracy or prompt engineering.

This talk presents Clarity Gate, an open-source pre-ingestion verification protocol that checks documents for epistemic quality before they enter RAG knowledge bases. The protocol applies 9 verification points and produces intermediate documents with inline uncertainty metadata.
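To make the mechanism concrete, here is a minimal sketch of what such a gate could look like. The two checks, their names, and the `<!-- cg:uncertain ... -->` annotation syntax are invented for illustration; they are not the actual Clarity Gate specification or its nine verification points.

```python
import re

# Illustrative sketch only: the check names and the inline annotation syntax
# are hypothetical stand-ins, not the published Clarity Gate / .cgd.md format.

def check_unhedged_causation(line: str) -> str | None:
    """Flag causal verbs stated without hedging (toy verification point)."""
    if re.search(r"\b(causes|proves|guarantees)\b", line, re.IGNORECASE):
        return "causal claim stated without hedging or evidence"
    return None

def check_unsourced_number(line: str) -> str | None:
    """Flag percentage claims with no citation marker (toy verification point)."""
    if re.search(r"\b\d+(\.\d+)?%", line) and "[source:" not in line:
        return "quantitative claim lacks a cited source"
    return None

# Two toy checks standing in for the protocol's nine verification points.
CHECKS = {
    "unhedged-causation": check_unhedged_causation,
    "unsourced-number": check_unsourced_number,
}

def annotate(document: str) -> str:
    """Return an intermediate document with inline uncertainty metadata."""
    out = []
    for line in document.splitlines():
        out.append(line)
        for name, check in CHECKS.items():
            note = check(line)
            if note:
                out.append(f'<!-- cg:uncertain check={name} note="{note}" -->')
    return "\n".join(out)

if __name__ == "__main__":
    print(annotate("Caching causes a 40% latency reduction in all workloads."))
```

Downstream LLMs then see the uncertainty markers alongside the claims they qualify, rather than depending on per-session prompt instructions.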

We benchmarked the protocol across 6 LLMs using a document containing 39 deliberate epistemic traps. Top-tier models achieved 100% detection with or without annotations. Mid-tier models improved measurably: Gemini Flash from 75% to 100% (+25 points), GPT-5 Mini from 81% to 100% (+19 points).
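Detection rates like those above reduce to flagged traps over total traps. The scoring sketch below is a hypothetical reconstruction with placeholder trap IDs, not the talk's benchmark code.

```python
# Hypothetical scoring sketch: trap IDs and the flagged set are placeholders,
# not data from the actual benchmark runs.
def detection_rate(flagged: set[str], traps: set[str]) -> float:
    """Fraction of the planted epistemic traps a model flagged."""
    return len(flagged & traps) / len(traps)

traps = {f"trap-{i:02d}" for i in range(39)}    # the 39 deliberate traps
flagged = {f"trap-{i:02d}" for i in range(29)}  # an illustrative mid-tier run
print(f"{detection_rate(flagged, traps):.0%}")  # prints 74%
```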

Honest confound: system prompt instructions alone achieved similar results on failing models. However, annotations persist across sessions, work without downstream prompt control, and scale across users.

Live on GitHub with a unified .cgd.md specification. Relevant to anyone building RAG pipelines that need auditability, epistemic traceability, or EU AI Act compliance.

github.com/frmoretto/clarity-gate

