Render the HTML audit report
The HTML audit report is the headline differentiator: a self-contained page showing every query's full audit trail.
The simplest path
import verifiable_rag
from verifiable_rag.demo import sample_paper_path

verifiable_rag.ask(
    "What is the mechanism of action of penicillin?",
    docs=sample_paper_path(),
    output_html="audit.html",
)
Open audit.html in any browser. That's it.
What's in the report
The HTML page shows, top to bottom:
- Header with strictness, refusal status, and overall faithfulness score
- Query in a callout box
- Refusal banner (only when the answer was refused)
- Answer body with each sentence color-coded by verification:
  - Default background: supported (verifier passed)
  - Red-tinted with dashed underline: unsupported (verifier flagged)
  - Citations rendered as superscript [chunk-id] links; clicking one jumps to the source passage card
- Faithfulness card row: overall score plus the retrieval / NLI / generation components
- Per-sentence verification table: every sentence with its NLI score and supported/unsupported badge
- Reranked passages: every chunk the generator actually saw, anchored so citations link in
It's a single HTML file with inline CSS, no JavaScript, no external dependencies. Open it on any device with a browser.
From a pre-built Pipeline
With a pre-built pipeline, Answer.to_html() returns the same report as a string, which you write to disk yourself:
from pathlib import Path

from verifiable_rag import hybrid_balanced

pipeline = hybrid_balanced()
pipeline.ingest("paper.pdf")
answer = pipeline.ask("What did the authors find?")

# to_html() returns the report as a string; write it as UTF-8 explicitly
Path("audit.html").write_text(answer.to_html(), encoding="utf-8")
Custom title:
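If you only have the rendered string, one library-agnostic option is to rewrite the title element after the fact. This is a sketch under assumptions: it presumes the report contains a standard <title> tag, and with_title is a hypothetical helper, not part of verifiable_rag. The library may well expose its own title option.

```python
import html
import re


def with_title(report_html: str, title: str) -> str:
    """Return report_html with the text of its first <title> element replaced."""
    safe = html.escape(title)  # escape &, <, > and quotes before inserting
    # A function replacement avoids treating backslashes in the title as regex escapes.
    return re.sub(
        r"(<title>).*?(</title>)",
        lambda m: m.group(1) + safe + m.group(2),
        report_html,
        count=1,
        flags=re.IGNORECASE | re.DOTALL,
    )


# Stand-in for the string returned by answer.to_html()
page = "<html><head><title>Audit</title></head><body></body></html>"
retitled = with_title(page, "Penicillin review & notes")
```

If the input has no <title> element, the string comes back unchanged.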
Serve the report in a web app
The HTML is just a string. Drop it into any web framework:
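As a framework-neutral sketch, here is a minimal WSGI app built on the standard library's wsgiref. The names make_report_app and serve are hypothetical, and the HTML literal stands in for whatever answer.to_html() returns; most Python web frameworks accept a string response in much the same way.

```python
from wsgiref.simple_server import make_server


def make_report_app(report_html: str):
    """Build a WSGI app that serves one pre-rendered HTML report."""
    body = report_html.encode("utf-8")

    def app(environ, start_response):
        headers = [
            ("Content-Type", "text/html; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ]
        start_response("200 OK", headers)
        return [body]

    return app


def serve(app, port: int = 8000) -> None:
    """Blocking development server; call this yourself when you want to serve."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()


# Stand-in for the string returned by answer.to_html()
app = make_report_app("<!doctype html><title>Audit</title><p>report body</p>")
```

Calling serve(app) makes the report available at http://127.0.0.1:8000/ until you interrupt it. Because the report has no external dependencies, no static-file routes are needed.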
Use it as your debugging surface
The audit report is also the fastest way to debug a failing pipeline. When the answer looks wrong:
- Generate the HTML report
- Look at the reranked passages section — did the right passage make it through retrieval?
- Check the per-sentence verification table — which sentences got flagged?
- Look at the citation links — do they point at the right source spans?
Three failure modes show up immediately:
| What you see in the report | Likely cause |
|---|---|
| Right passage missing from "Reranked passages" | Retrieval issue — tune top_k_retrieve, swap embedder, add Contextual Retrieval |
| Right passage present, but generator cited a different one | Generator confused by similar chunks — switch to ConstrainedCitedGenerator |
| All sentences flagged unsupported | Threshold too high, or NLI model isn't trained on your domain — see Calibrate threshold |
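The table can also be read as a small decision procedure. The sketch below encodes it over plain data that you would copy out of a report by hand or from your own logs; ReportSnapshot and likely_cause are hypothetical names, not verifiable_rag types, and the checks run in the same order as the table rows.

```python
from dataclasses import dataclass


@dataclass
class ReportSnapshot:
    """Hypothetical plain-data view of one report (not a verifiable_rag type)."""
    expected_chunk_id: str         # the passage you know contains the answer
    reranked_chunk_ids: list[str]  # chunks listed under "Reranked passages"
    cited_chunk_ids: set[str]      # chunks the answer actually cites
    supported: list[bool]          # one flag per row of the verification table


def likely_cause(snap: ReportSnapshot) -> str:
    # Row 1: the right passage never reached the generator.
    if snap.expected_chunk_id not in snap.reranked_chunk_ids:
        return "retrieval"
    # Row 2: the passage was available, but the citations point elsewhere.
    if snap.cited_chunk_ids and snap.expected_chunk_id not in snap.cited_chunk_ids:
        return "generator"
    # Row 3: every sentence was flagged unsupported.
    if snap.supported and not any(snap.supported):
        return "verifier"
    return "unmatched"
```

A result of "unmatched" only means none of the three patterns applies; the report itself is still the place to look.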
Programmatic access to the same data
The HTML view is one surface. The same data is on the Answer object for code that needs it programmatically:
answer.sentences # list[CitedSentence]
answer.verification_results # list[VerificationResult]
answer.retrieved_chunks # the reranked passages (same as in the HTML)
answer.unsupported_sentences # filtered: only the flagged ones
answer.cited_sentence_ids # frozenset of all sentence IDs cited
answer.audit_trail() # JSON-serializable dict for logging / metrics
See Observability for production logging patterns.
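As a minimal sketch of the logging case, a JSON-serializable dict can be emitted as one compact line per query with the standard library. The logger name, the helper names, and the keys in the sample dict are all made up for illustration; the real structure of audit_trail() may differ.

```python
import json
import logging

logger = logging.getLogger("rag.audit")  # hypothetical logger name


def audit_log_line(trail: dict) -> str:
    """Serialize one audit trail as a single compact JSON line."""
    # sort_keys keeps the key order stable across runs, which helps diffing.
    return json.dumps(trail, sort_keys=True, separators=(",", ":"))


def log_audit(trail: dict) -> None:
    """Emit the trail at INFO level; handlers and levels are up to the caller."""
    logger.info(audit_log_line(trail))


# Stand-in for answer.audit_trail(); these keys are invented for the example.
sample = {"query": "What did the authors find?", "refused": False, "unsupported": 1}
log_audit(sample)
```

One line per query keeps the output friendly to line-oriented log shippers and to tools like grep and jq.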
Customizing the report
The HTML report is rendered by verifiable_rag.report.to_html, which is a pure function. For substantial customization, copy it into your project and modify — there are no plugins or callbacks.
Common tweaks:
- Custom CSS: edit the _CSS constant in report.py
- Hide sections: wrap the section-rendering call in a feature flag
- Add your branding: modify the <header> block in to_html()
- Multi-language: replace the labels in the section renderers (they're plain strings)
If you do something interesting, open a PR — we'd like to hear the use case.
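As an illustration of the "hide sections" tweak, here is one way a copied renderer could gate sections behind flags. This is a sketch, not the structure of report.py: ReportFlags, render_report, and the section keys are hypothetical, and the sections are assumed to be already-rendered HTML fragments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportFlags:
    """Hypothetical switches for a copied renderer."""
    show_verification_table: bool = True
    show_passages: bool = True


def render_report(sections: dict[str, str], flags: ReportFlags = ReportFlags()) -> str:
    """Join pre-rendered section fragments, skipping any that are switched off."""
    parts = [sections["answer"]]
    if flags.show_verification_table:
        parts.append(sections["verification_table"])
    if flags.show_passages:
        parts.append(sections["passages"])
    return "\n".join(parts)


# Stand-in fragments; a real renderer would build these from the Answer object.
sections = {
    "answer": "<section id='answer'>...</section>",
    "verification_table": "<section id='verification'>...</section>",
    "passages": "<section id='passages'>...</section>",
}
compact = render_report(sections, ReportFlags(show_passages=False))
```

Keeping the flags in a frozen dataclass leaves the renderer a pure function: the same inputs and flags always produce the same HTML. Note that hiding the passages section would leave citation links without a target, so the two are best toggled together.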