Presets
presets
Preset pipeline factories — sensible component combos for common use cases.
Each factory returns a fully-wired :class:Pipeline. Imports are lazy
inside each function so the presets module itself doesn't pull in
torch / transformers / cohere unless a preset that uses them is actually
called — keeping import verifiable_rag fast.
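The lazy-import pattern described above can be sketched with the standard library alone. Here `colorsys` stands in for a heavy dependency such as torch, and `demo_preset` is a hypothetical name, not part of verifiable_rag.

```python
import sys


def demo_preset() -> tuple[float, float, float]:
    """Hypothetical preset factory; `colorsys` stands in for a heavy dependency."""
    # The import runs only when the factory is called, so importing the
    # module that defines demo_preset never pays for it.
    import colorsys

    return colorsys.rgb_to_hsv(1.0, 0.0, 0.0)


result = demo_preset()
# After the first call the dependency is cached in sys.modules,
# so later calls skip the import cost.
loaded_after_call = "colorsys" in sys.modules
```

The trade-off is that a missing optional dependency surfaces when the preset is called rather than at import time.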
When to use which preset
+---------------------+---------------------------------------------+-------------+
| Preset              | Use case                                    | API keys    |
+=====================+=============================================+=============+
| local_minimal       | Hobbyist / academic, no verification        | generator   |
+---------------------+---------------------------------------------+-------------+
| local_verified      | Local everything + HHEM NLI verification    | generator   |
+---------------------+---------------------------------------------+-------------+
| hybrid_balanced     | Recommended. Cohere retrieval + Dual NLI    | generator + |
| (RECOMMENDED)       | + constrained Haiku generator               | Cohere      |
+---------------------+---------------------------------------------+-------------+
| hybrid_strict       | Same as balanced, refuses on borderline     | generator + |
|                     | faithfulness                                | Cohere      |
+---------------------+---------------------------------------------+-------------+
| hybrid_paranoid     | Sonnet generation, Dual NLI, hard refusal   | generator + |
|                     | unless extremely confident                  | Cohere      |
+---------------------+---------------------------------------------+-------------+
| llm_judge_verified  | LLM-as-judge verifier (Sonnet 4.6 default). | generator + |
|                     | Strictest single-model verifier; offline    | Cohere      |
|                     | "ceiling" reference. ~250x cost per call.   |             |
+---------------------+---------------------------------------------+-------------+
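To check credentials before building a pipeline, the API keys column can be turned into a small lookup. This is a sketch under two assumptions: the "generator" key is ANTHROPIC_API_KEY (which holds only while generator_model stays on an Anthropic model), and `missing_keys` is a hypothetical helper rather than part of the library.

```python
import os
from collections.abc import Mapping

# Assumption: "generator" in the table means ANTHROPIC_API_KEY, which holds
# only while generator_model is left at its Anthropic default.
REQUIRED_KEYS = {
    "local_minimal": ("ANTHROPIC_API_KEY",),
    "local_verified": ("ANTHROPIC_API_KEY",),
    "hybrid_balanced": ("ANTHROPIC_API_KEY", "COHERE_API_KEY"),
    "hybrid_strict": ("ANTHROPIC_API_KEY", "COHERE_API_KEY"),
    "hybrid_paranoid": ("ANTHROPIC_API_KEY", "COHERE_API_KEY"),
    "llm_judge_verified": ("ANTHROPIC_API_KEY", "COHERE_API_KEY"),
}


def missing_keys(preset: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the environment variables a preset needs that are unset or empty."""
    return [name for name in REQUIRED_KEYS[preset] if not env.get(name)]


# Example with an explicit environment mapping instead of the real one.
missing = missing_keys("hybrid_balanced", {"ANTHROPIC_API_KEY": "set"})
```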
For full customization, use :func:build_pipeline with explicit knobs,
or load a YAML config via Pipeline.from_yaml(path).
Every preset accepts index_dir so the user can keep separate indexes
per experiment / per corpus without rebuilding.
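One way to keep indexes apart is to derive the directory from the corpus and experiment names. A minimal sketch; `index_dir_for` and the `indexes` base directory are hypothetical and not the library's defaults.

```python
from pathlib import Path


def index_dir_for(base: Path, corpus: str, experiment: str) -> Path:
    """Hypothetical helper: one index directory per (corpus, experiment) pair."""
    return base / corpus / experiment


base = Path("indexes")  # assumed location, not the library's default
litqa_dir = index_dir_for(base, "litqa2", "baseline")
ragtruth_dir = index_dir_for(base, "ragtruth", "baseline")

# Each directory would then be passed to a preset, for example:
#   pipeline = hybrid_balanced(index_dir=litqa_dir)
```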
hybrid_balanced
hybrid_balanced(*, generator_model: str = _DEFAULT_HAIKU, index_dir: Path = _DEFAULT_INDEX_DIR) -> 'Pipeline'
Recommended default. Cohere retrieval + Dual NLI + constrained Haiku.
- Parser: Docling primary, PyMuPDF fallback (caching wrapper)
- Chunker: ParentChildChunker
- Embedder: Cohere embed-english-v3.0 (hosted)
- Reranker: Cohere rerank-v3 (hosted)
- Generator: ConstrainedCitedGenerator (schema-forced cites, ReClaim-style)
- Verifier: DualNLIVerifier(HHEM + MiniCheck), min aggregation
- Strictness: "balanced"
Matches the published verifiable-rag baseline that hits 0.875 mc_acc on LitQA2 and 0.844 AUROC on RAGTruth. The defaults that "just work."
Requires ANTHROPIC_API_KEY and COHERE_API_KEY.
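One reading of "min aggregation" is that the combined faithfulness score is the lower of the two NLI scores, so the more skeptical model decides. A minimal sketch of that idea, assuming both models emit a score in [0, 1]; `aggregate_min` is a hypothetical name and not the DualNLIVerifier API.

```python
def aggregate_min(hhem_score: float, minicheck_score: float) -> float:
    """Combine two faithfulness scores by keeping the lower one."""
    for score in (hhem_score, minicheck_score):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
    return min(hhem_score, minicheck_score)


combined = aggregate_min(0.92, 0.61)  # the more skeptical model decides
```

Taking the minimum is conservative: an answer passes only if both verifiers rate it as faithful.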
Source code in src/verifiable_rag/presets.py
hybrid_strict
hybrid_strict(*, generator_model: str = _DEFAULT_HAIKU, index_dir: Path = _DEFAULT_INDEX_DIR) -> 'Pipeline'
Same components as :func:hybrid_balanced, but strictness="strict".
Refuses any answer whose Dual NLI faithfulness score is below the strict threshold (0.7). The refusal rate is higher; only confident answers pass.
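The refusal rule can be sketched as a threshold check. The 0.7 and 0.9 values are the strict and paranoid thresholds quoted on this page; thresholds for the other strictness levels are not stated here, so the sketch leaves them out. `should_refuse` is a hypothetical helper, and accepting a score that sits exactly on the threshold is an assumption.

```python
# Thresholds quoted in the preset descriptions on this page.
THRESHOLDS = {"strict": 0.7, "paranoid": 0.9}


def should_refuse(faithfulness: float, strictness: str) -> bool:
    """Refuse when the faithfulness score falls below the level's threshold."""
    try:
        threshold = THRESHOLDS[strictness]
    except KeyError:
        raise ValueError(f"no documented threshold for {strictness!r}") from None
    return faithfulness < threshold


borderline = 0.8
refused_strict = should_refuse(borderline, "strict")      # 0.8 is not below 0.7
refused_paranoid = should_refuse(borderline, "paranoid")  # 0.8 is below 0.9
```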
Source code in src/verifiable_rag/presets.py
hybrid_paranoid
hybrid_paranoid(*, generator_model: str = _DEFAULT_SONNET, index_dir: Path = _DEFAULT_INDEX_DIR) -> 'Pipeline'
Maximum-strictness preset: Sonnet generation + Dual NLI + paranoid threshold.
Components identical to :func:hybrid_balanced except:
* Generator: Sonnet 4.6 (stronger structured-output behavior)
* Strictness: "paranoid" (refuse below faithfulness 0.9)
Use for high-trust use cases (legal, medical, scientific verification). Cost per query is ~5x balanced because of the Sonnet swap.
Source code in src/verifiable_rag/presets.py
local_minimal
local_minimal(*, generator_model: str = _DEFAULT_HAIKU, index_dir: Path = _DEFAULT_INDEX_DIR) -> 'Pipeline'
All-local pipeline, no verifier. Cheapest path to "it works."
- Parser: PyMuPDF (fast, text-only — no layout fidelity, no OCR)
- Chunker: ParentChildChunker (default sizing)
- Embedder: BGE-small-en-v1.5 (local, 384-dim)
- Index: HybridIndex (LanceDB + BM25)
- Reranker: none
- Generator: Prompted (works with any LiteLLM model)
- Verifier: none, strictness="loose"
Generator still hits an LLM API (Haiku 4.5 by default), so you need
ANTHROPIC_API_KEY. For fully air-gapped, swap generator_model
to a local Ollama / vLLM endpoint via LiteLLM.
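For the air-gapped case, LiteLLM selects a provider from a prefix in the model string. A sketch, assuming an Ollama server is already running locally and that the preset forwards generator_model to LiteLLM unchanged; the model name `llama3` and the helper `ollama_model` are illustrative only.

```python
def ollama_model(name: str) -> str:
    """Build a LiteLLM-style model string for a locally served Ollama model."""
    if not name or "/" in name:
        raise ValueError("expected a bare model name such as 'llama3'")
    return f"ollama/{name}"


generator_model = ollama_model("llama3")  # illustrative model name

# The string would then replace the Haiku default, for example:
#   pipeline = local_minimal(generator_model=generator_model)
```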
Source code in src/verifiable_rag/presets.py
local_verified
local_verified(*, generator_model: str = _DEFAULT_HAIKU, index_dir: Path = _DEFAULT_INDEX_DIR, strictness: 'Strictness' = 'balanced') -> 'Pipeline'
All-local + HHEM NLI verification (~600M, downloads on first use).
Same retrieval stack as :func:local_minimal plus:
* Reranker: BGE rerank-v2-m3 (local, ~568M)
* Verifier: HHEM-2.1-open with default threshold
Good middle ground: NLI verification with no API costs beyond the generator LLM.
Source code in src/verifiable_rag/presets.py
build_pipeline
build_pipeline(*, retrieval: _RetrievalTier = 'hybrid', verifier: _VerifierName = 'dual_nli', generator: _GeneratorName = 'constrained', strictness: 'Strictness' = 'balanced', generator_model: str = _DEFAULT_HAIKU, index_dir: Path = _DEFAULT_INDEX_DIR, top_k_retrieve: int = 100, top_k_rerank: int = 10) -> 'Pipeline'
Parametric pipeline factory — pick your axes, get a wired Pipeline.
Use this when you want to mix and match outside the named presets. For example::
    from verifiable_rag.presets import build_pipeline

    pipeline = build_pipeline(
        retrieval="local",
        verifier="dual_nli",
        generator="constrained",
        strictness="strict",
    )
For more custom needs (e.g. using ContextualChunker or swapping in your
own Embedder), instantiate :class:Pipeline directly, or load a config
from YAML via :meth:Pipeline.from_yaml.
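The top_k_retrieve and top_k_rerank defaults suggest the usual two-stage funnel: a cheap first pass keeps 100 candidates, then a more expensive reranker keeps the best 10. The sketch below illustrates that pattern with toy scoring functions. It is an assumption about how the two parameters interact, not the library's retrieval code, and every name in it is hypothetical.

```python
from collections.abc import Callable, Sequence


def retrieve_then_rerank(
    query: str,
    docs: Sequence[str],
    first_stage: Callable[[str, str], float],
    reranker: Callable[[str, str], float],
    top_k_retrieve: int = 100,
    top_k_rerank: int = 10,
) -> list[str]:
    """Keep top_k_retrieve docs by the cheap score, then top_k_rerank by the reranker."""
    candidates = sorted(docs, key=lambda d: first_stage(query, d), reverse=True)
    candidates = candidates[:top_k_retrieve]
    reranked = sorted(candidates, key=lambda d: reranker(query, d), reverse=True)
    return reranked[:top_k_rerank]


def word_overlap(query: str, doc: str) -> float:
    """Toy first-stage score: number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def brevity(query: str, doc: str) -> float:
    """Toy reranker: prefer shorter documents."""
    return -float(len(doc))


docs = [
    "nli verification of claims",
    "nli",
    "cooking pasta at home",
    "claims and nli checks",
]
top = retrieve_then_rerank(
    "nli claims", docs, word_overlap, brevity, top_k_retrieve=3, top_k_rerank=2
)
```

Raising top_k_retrieve widens recall at the cost of more reranker calls; top_k_rerank bounds how much context reaches the generator.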
Source code in src/verifiable_rag/presets.py