
About
>
agentskills.io compliant frontmatter
name: clarity-gate risk: unknown source: community version: 2.1.3 description: > Pre-ingestion verification for epistemic quality in RAG systems. Ensures documents are properly qualified before entering knowledge bases. Produces CGD (Clarity-Gated Documents) and validates SOT (Source of Truth) files. author: Francesco Marinoni Moretto license: CC-BY-4.0 repository: https://github.com/frmoretto/clarity-gate triggers:
- clarity gate
- check for hallucination risks
- can an LLM read this safely
- review for equivocation
- verify document clarity
- pre-ingestion check
- cgd verify
- sot verify capabilities:
- document-verification
- epistemic-quality
- rag-preparation
- cgd-generation
- sot-validation outputs:
- type: cgd extension: .cgd.md spec: docs/CLARITY_GATE_FORMAT_SPEC.md spec_version: "2.1"
Clarity Gate v2.1
Purpose: Pre-ingestion verification system that enforces epistemic quality before documents enter RAG knowledge bases. Produces Clarity-Gated Documents (CGD) compliant with the Clarity Gate Format Specification v2.1.
Core Question: "If another LLM reads this document, will it mistake assumptions for facts?"
Core Principle: "Detection finds what is; enforcement ensures what should be. In practice: find the missing uncertainty markers before they become confident hallucinations."
What's New in v2.1
| Feature | Description |
|---------|-------------|
| Claim Completion Status | PENDING/VERIFIED determined by field presence (no explicit status field) |
| Source Field Semantics | Actionable source (PENDING) vs. what-was-found (VERIFIED) |
| Claim ID Format Guidance | Hash-based IDs preferred, collision analysis for scale |
| Body Structure Requirements | HITL Verification Record section mandatory when claims exist |
| New Validation Codes | E-ST10, W-ST11, W-HC01, W-HC02, E-SC06 (FORMAT_SPEC); E-TB01-07 (SOT validation) |
| Bundled Scripts | claim_id.py and document_hash.py for deterministic computations |
Specifications
This skill implements and references:
| Specification | Version | Location | |---------------|---------|----------| | Clarity Gate Format (Unified) | v2.1 | docs/CLARITY_GATE_FORMAT_SPEC.md |
Note: v2.0 unifies CGD and SOT into a single .cgd.md format. SOT is now a CGD with an optional tier: block.
Validation Codes
Clarity Gate defines validation codes for structural and semantic checks per FORMAT_SPEC v2.1:
HITL Claim Validation (§1.3.2-1.3.3)
| Code | Check | Severity |
|------|-------|----------|
| W-HC01 | Partial confirmed-by/confirmed-date fields | WARNING |
| W-HC02 | Vague source (e.g., "industry reports", "TBD") | WARNING |
| E-SC06 | Schema error in hitl-claims structure | ERROR |
Body Structure (§1.2.1)
| Code | Check | Severity |
|------|-------|----------|
| E-ST10 | Missing ## HITL Verification Record when claims exist | ERROR |
| W-ST11 | Table rows don't match hitl-claims count | WARNING |
SOT Table Validation (§3.1)
| Code | Check | Severity |
|------|-------|----------|
| E-TB01 | No ## Verified Claims section | ERROR |
| E-TB02 | Table has no data rows | ERROR |
| E-TB03 | Required columns missing | ERROR |
| E-TB04 | Column order wrong | ERROR |
| E-TB05 | Empty cell in required column | ERROR |
| E-TB06 | Invalid date format in Verified column | ERROR |
| E-TB07 | Verified date in future (beyond 24h grace) | ERROR |
Note: Additional validation codes may be defined in RFC-001 (clarification document) but are not part of the normative FORMAT_SPEC.
Bundled Scripts
This skill includes Python scripts for deterministic computations per FORMAT_SPEC.
scripts/claim_id.py
Computes stable, hash-based claim IDs for HITL tracking (per §1.3.4).
# Generate claim ID
python scripts/claim_id.py "Base price is $99/mo" "api-pricing/1"
# Output: claim-75fb137a
# Run test vectors
python scripts/claim_id.py --test
Algorithm:
- Normalize text (strip + collapse whitespace)
- Concatenate with location using pipe delimiter
- SHA-256 hash, take first 8 hex chars
- Prefix with "claim-"
Test vectors:
claim_id("Base price is $99/mo", "api-pricing/1")→claim-75fb137aclaim_id("The API supports GraphQL", "features/1")→claim-eb357742
scripts/document_hash.py
Computes document SHA-256 hash per FORMAT_SPEC §2.2-2.4 with full canonicalization.
# Compute hash
python scripts/document_hash.py my-doc.cgd.md
# Output: 7d865e959b2466918c9863afca942d0fb89d7c9ac0c99bafc3749504ded97730
# Verify existing hash
python scripts/document_hash.py --verify my-doc.cgd.md
# Output: PASS: Hash verified: 7d865e...
# Run normalization tests
python scripts/document_hash.py --test
Algorithm (per §2.2-2.4):
- Extract content between opening
---\nand<!-- CLARITY_GATE_END --> - Remove `