Pipeline
Pipeline (dataclass)
Pipeline(parser: Parser, chunker: Chunker, embedder: Embedder, indexer: HybridIndex, generator: Generator, reranker: Reranker | None = None, verifier: Verifier | None = None, strictness: Strictness = 'balanced', top_k_retrieve: int = 20, top_k_rerank: int = 5)
Orchestrates the full verifiable RAG pipeline.
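Swap any component by passing a different implementation of the corresponding Protocol. The components without defaults (parser, chunker, embedder, indexer, generator) are required; there are no silent no-ops. Per the signature, reranker and verifier default to None.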
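As a rough illustration of swapping a component, the sketch below defines a structural Protocol and a toy class that satisfies it without inheriting from it. The names `EmbedderLike`, `LengthEmbedder`, and the `embed` method are hypothetical stand-ins chosen for this example; the actual Protocols in verifiable_rag may declare different methods.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class EmbedderLike(Protocol):
    """Hypothetical stand-in for the library's Embedder Protocol."""

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class LengthEmbedder:
    """Toy implementation: satisfies EmbedderLike purely by its method shape."""

    def embed(self, texts: list[str]) -> list[list[float]]:
        # Two made-up features per text: character count and space count.
        return [[float(len(t)), float(t.count(" "))] for t in texts]


def run_embedder(embedder: EmbedderLike) -> list[list[float]]:
    # Any object with a compatible embed() method is accepted here.
    return embedder.embed(["hello world", "rag"])


vectors = run_embedder(LengthEmbedder())
```

Because the check is structural, a replacement component only needs to match the method signatures of the Protocol it stands in for.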
from_yaml (classmethod)
Build a Pipeline from a YAML config. See verifiable_rag.config for schema.
Source code in src/verifiable_rag/pipeline.py
ingest
ingest(path: str | Path) -> Document
Parse, chunk, embed, and index a document.
Returns the parsed Document so callers can inspect sentence IDs.
prepare_ingest
prepare_ingest(path: str | Path) -> tuple[Document, list[Any], list[list[float]]]
Parse, chunk, and embed path without touching the shared index.
Split out from ingest() so the slow work (parse + embed) can run concurrently across threads, while commit_ingest() is serialised behind a lock to keep the index consistent.
commit_ingest
commit_ingest(document: Document, chunks: list[Any], embeddings: list[list[float]]) -> None
Atomically write a prepared ingest to the shared index and doc map.
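The prepare/commit split might be used along the following lines: the slow preparation runs in a thread pool, and each commit happens under a lock. This is a minimal sketch, not the library's own code. `StubPipeline` is a hypothetical stand-in that only mimics the two documented method signatures, `ingest_many` is an illustrative helper, and whether the real Pipeline takes the lock internally is not stated here, so the sketch holds one in the caller.

```python
from __future__ import annotations

import threading
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any


class StubPipeline:
    """Hypothetical stand-in exposing prepare_ingest/commit_ingest only."""

    def __init__(self) -> None:
        self.index: list[tuple[str, list[Any], list[list[float]]]] = []

    def prepare_ingest(
        self, path: str | Path
    ) -> tuple[str, list[Any], list[list[float]]]:
        # Fake "parse, chunk, embed": no shared state is touched here.
        text = str(path)
        return text, [text], [[float(len(text))]]

    def commit_ingest(
        self, document: str, chunks: list[Any], embeddings: list[list[float]]
    ) -> None:
        # Mutates shared state, so callers serialise access to it.
        self.index.append((document, chunks, embeddings))


def ingest_many(
    pipeline: StubPipeline, paths: list[str], max_workers: int = 4
) -> list[str]:
    lock = threading.Lock()

    def worker(path: str) -> str:
        prepared = pipeline.prepare_ingest(path)  # runs concurrently
        with lock:  # one commit at a time
            pipeline.commit_ingest(*prepared)
        return prepared[0]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, whatever the commit order.
        return list(pool.map(worker, paths))


pipeline = StubPipeline()
docs = ingest_many(pipeline, ["a.pdf", "bb.pdf", "ccc.pdf"])
```

The returned list follows the input order, while the order of entries in the index depends on which worker reaches the lock first.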
ask
ask(query: str) -> Answer
Run the full pipeline for query and return a verified Answer.