Strictness & refusal
How the library decides when to return a partial answer, a clean answer, or a refusal.
The strictness slider
Pipeline.strictness is a four-step slider that maps to a faithfulness threshold:
| Strictness | Threshold | Behavior |
|---|---|---|
| loose | 0.0 | Never refuse. Verifier output is informational only. |
| balanced ⭐ | 0.5 | Refuse if faithfulness score < 0.5 after surgical correction. Default. |
| strict | 0.7 | Refuse if faithfulness score < 0.7. Only confident answers slip through. |
| paranoid | 0.9 | Refuse if faithfulness score < 0.9. High refusal rate; for high-trust use cases. |
You can pass the strictness either to a preset (hybrid_strict()) or directly to the Pipeline:
```python
from verifiable_rag import hybrid_balanced

pipeline = hybrid_balanced()
pipeline.strictness = "strict"
```
Or wire it from scratch:
What "faithfulness score" actually is
After verification runs, the Pipeline computes a single scalar faithfulness_score ∈ [0, 1] that summarizes how trustworthy the answer is. It's a blend of three signals:
```python
faithfulness_score = combine(
    retrieval_score,     # avg retrieval / rerank score across used chunks
    nli_score,           # avg NLI score across cited sentences
    generation_logprob,  # generator-side confidence (when available)
)
```
The exact combination is implementation-controlled; the components are also exposed on answer.faithfulness_components for users who want to look at them separately.
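Because the combination itself is not documented here, the following is only an illustrative sketch of one plausible blend: a weighted average that skips the generator-side term when it is unavailable. The name combine_sketch, the weights, and the assumption that every input is already scaled to [0, 1] are hypothetical and not taken from the library.

```python
from typing import Optional


def combine_sketch(
    retrieval_score: float,
    nli_score: float,
    generation_confidence: Optional[float] = None,
    weights: tuple = (0.3, 0.5, 0.2),  # hypothetical weights, not the library's
) -> float:
    """Weighted average of the available signals, clamped to [0, 1]."""
    pairs = [(retrieval_score, weights[0]), (nli_score, weights[1])]
    if generation_confidence is not None:
        # Only include the generator-side term when the backend provides it.
        pairs.append((generation_confidence, weights[2]))
    total_weight = sum(w for _, w in pairs)
    blended = sum(s * w for s, w in pairs) / total_weight
    return max(0.0, min(1.0, blended))
```

Renormalizing by the weights actually used keeps the result on the same [0, 1] scale whether or not the generator exposes a confidence signal, so the same threshold stays meaningful in both cases.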
Surgical correction vs. hard refusal
When the verifier flags sentences as unsupported, the Pipeline has two options:
- Surgical correction (default in balanced+): drop just the flagged sentences from the answer and return the rest. The dropped texts land on answer.unsupported_claims.
- Hard refusal: drop everything and return an empty answer with answer.was_refused = True and a refusal_reason.
The decision is based on the post-correction faithfulness score:
```text
verify → flag unsupported sentences
       → surgical correction (drop flagged sentences)
       → recompute faithfulness
       → IF faithfulness < strictness_threshold:
             hard refusal
         ELSE:
             return surgically corrected answer
```
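The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the library's implementation: SketchAnswer and decide are hypothetical names, the scoring function is passed in by the caller, and whether a refused answer still carries its dropped sentences is a guess. Only the thresholds and the attribute names come from this page.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Thresholds from the strictness table. "loose" (0.0) never refuses and
# treats verifier output as informational, so it is not modeled here.
THRESHOLDS = {"balanced": 0.5, "strict": 0.7, "paranoid": 0.9}


@dataclass
class SketchAnswer:
    """Hypothetical stand-in for the library's answer object."""
    text: str
    was_refused: bool = False
    refusal_reason: Optional[str] = None
    unsupported_claims: List[str] = field(default_factory=list)


def decide(
    sentences: List[str],
    flagged: List[bool],
    score_fn: Callable[[List[str]], float],
    strictness: str = "balanced",
) -> SketchAnswer:
    """Drop flagged sentences, rescore what remains, then refuse or return."""
    kept = [s for s, bad in zip(sentences, flagged) if not bad]
    dropped = [s for s, bad in zip(sentences, flagged) if bad]
    score = score_fn(kept)  # faithfulness recomputed after correction
    threshold = THRESHOLDS[strictness]
    if score < threshold:
        # Hard refusal: empty text plus a reason (message format is assumed).
        return SketchAnswer(
            text="",
            was_refused=True,
            refusal_reason=f"faithfulness {score:.2f} < {threshold}",
            unsupported_claims=dropped,
        )
    return SketchAnswer(text=" ".join(kept), unsupported_claims=dropped)
```

The point of the ordering is that the threshold is applied to the corrected answer, so a response with one overreaching sentence can still pass once that sentence is removed.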
What an answer looks like in each mode
Same query, same document, four strictness levels. The query is "Did the authors prove a causal mechanism?", a hard question because the document provides only correlational evidence. The balanced case is shown here:
```python
from verifiable_rag import hybrid_balanced

query = "Did the authors prove a causal mechanism?"

pipeline = hybrid_balanced()
answer = pipeline.ask(query)
# answer.text = "The authors observed a correlation between..."
# answer.was_refused = False
# answer.unsupported_claims = ["The authors demonstrated a causal link"]
# Verifier flagged the overreaching sentence; surgical correction kept the rest.
```
Why this matters
Most "chat with your documents" products use prompt-conditioned refusal: they tell the LLM "if you don't know, say you don't know." This fails for the same reason every prompt-conditioned safety measure fails — the model is sometimes confidently wrong about what it knows.
The library's strictness slider is architectural: the refusal decision is made after the verifier has actually checked the citations against the source. The model can't talk its way past a faithfulness threshold, because the score compared against that threshold is computed from verifier-side signals such as NLI scores, not from the model's own account of what it knows.
This is also why strict and paranoid modes require a verifier:
A strict-mode answer that didn't actually run verification is a bug, not a feature.
If you wire verifier=None and set strictness="strict", the Pipeline raises. The library refuses to claim it verified something it didn't.
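A guard of this kind can be sketched as a constructor-time check. The class below is a hypothetical stand-in and not the library's Pipeline; the exception type and message are assumptions, since this page only says that the Pipeline raises.

```python
from typing import Optional

# Levels that this page says require a verifier.
REQUIRES_VERIFIER = {"strict", "paranoid"}


class SketchPipeline:
    """Hypothetical stand-in that only demonstrates the configuration check."""

    def __init__(self, verifier: Optional[object] = None, strictness: str = "balanced"):
        if strictness in REQUIRES_VERIFIER and verifier is None:
            # ValueError is an assumption; the real exception type is not documented here.
            raise ValueError(
                f"strictness={strictness!r} requires a verifier, but verifier=None"
            )
        self.verifier = verifier
        self.strictness = strictness
```

Failing at construction time rather than at the first query means a misconfigured deployment cannot serve any answer labeled as strict.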
Programmatic detection
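```python
import logging

log = logging.getLogger(__name__)


def ask_or_none(pipeline, query):
    """Return the answer text, or None if the Pipeline refused."""
    answer = pipeline.ask(query)
    if answer.was_refused:
        log.warning(f"refused: {answer.refusal_reason}")
        # Optionally: retry with a looser strictness, or surface to user
        return None
    if answer.unsupported_claims:
        log.info(f"answer included corrections; dropped: {answer.unsupported_claims}")
    return answer.text
```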
For aggregate observability (refusal rate, average faithfulness, etc.), every answer has answer.audit_trail() — a JSON-serializable dict ready for emission to your metrics stack. See Observability.
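For a quick aggregate without a metrics stack, the answer attributes already shown on this page are enough to compute a refusal rate. This sketch relies only on was_refused and unsupported_claims; the function name and the returned keys are hypothetical, and nothing is assumed about the contents of audit_trail().

```python
from types import SimpleNamespace
from typing import Dict, Iterable


def summarize(answers: Iterable) -> Dict[str, float]:
    """Return refusal and correction rates over a batch of answers."""
    answers = list(answers)
    total = len(answers)
    if total == 0:
        return {"refusal_rate": 0.0, "correction_rate": 0.0}
    refused = sum(1 for a in answers if a.was_refused)
    # Count answers that were returned, but with at least one sentence dropped.
    corrected = sum(
        1 for a in answers if not a.was_refused and a.unsupported_claims
    )
    return {"refusal_rate": refused / total, "correction_rate": corrected / total}


# Stand-in objects with the same attribute names as the library's answers.
batch = [
    SimpleNamespace(was_refused=False, unsupported_claims=[]),
    SimpleNamespace(was_refused=False, unsupported_claims=["overreaching claim"]),
    SimpleNamespace(was_refused=True, unsupported_claims=[]),
    SimpleNamespace(was_refused=False, unsupported_claims=[]),
]
stats = summarize(batch)
```

Tracking the correction rate alongside the refusal rate helps distinguish a verifier that trims answers from one that rejects them outright.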
Choosing the right strictness
| If your use case is... | Pick... |
|---|---|
| Internal chatbot, low-stakes Q&A | loose or balanced |
| User-facing product, citation matters | balanced (default) |
| Legal / medical / scientific verification | strict |
| Compliance reporting, audit-grade | paranoid |
If you find yourself getting too many refusals in strict or paranoid, the answer is usually one of:
- Improve retrieval: most refusals are caused by the right passage not making it to the generator. Tune top_k_retrieve and top_k_rerank upward.
- Recalibrate the verifier on your domain: the default threshold (0.0562 for Dual NLI) was fit on RAGTruth. Your domain may need a different value. See Calibrate threshold.
- Switch from paranoid to strict: the 0.9 threshold is intentionally aggressive; only use it when truly necessary.