Use when validating golden dataset quality. Runs schema checks, duplicate detection, and coverage analysis to ensure dataset integrity for AI evaluation.
npx skill4agent add yonatangross/orchestkit golden-dataset-validation

{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"required": ["id", "title", "source_url", "content_type", "sections"],
"properties": {
"id": {
"type": "string",
"pattern": "^[a-z0-9-]+$",
"description": "Unique kebab-case identifier"
},
"title": {
"type": "string",
"minLength": 10,
"maxLength": 200
},
"source_url": {
"type": "string",
"format": "uri",
"description": "Canonical source URL (NOT placeholder)"
},
"content_type": {
"type": "string",
"enum": ["article", "tutorial", "research_paper", "documentation", "video_transcript", "code_repository"]
},
"bucket": {
"type": "string",
"enum": ["short", "long"]
},
"tags": {
"type": "array",
"items": {"type": "string"},
"minItems": 2,
"maxItems": 10
},
"sections": {
"type": "array",
"minItems": 1,
"items": {
"type": "object",
"required": ["id", "title", "content"],
"properties": {
"id": {"type": "string", "pattern": "^[a-z0-9-/]+$"},
"title": {"type": "string"},
"content": {"type": "string", "minLength": 50},
"granularity": {"enum": ["coarse", "fine", "summary"]}
}
}
}
}
}

{
"type": "object",
"required": ["id", "query", "difficulty", "expected_chunks", "min_score"],
"properties": {
"id": {"type": "string", "pattern": "^q-[a-z0-9-]+$"},
"query": {"type": "string", "minLength": 5, "maxLength": 500},
"modes": {"type": "array", "items": {"enum": ["semantic", "keyword", "hybrid"]}},
"category": {"enum": ["specific", "broad", "negative", "edge", "coarse-to-fine"]},
"difficulty": {"enum": ["trivial", "easy", "medium", "hard", "adversarial"]},
"expected_chunks": {"type": "array", "items": {"type": "string"}, "minItems": 1},
"min_score": {"type": "number", "minimum": 0, "maximum": 1}
}
}

| Rule | Purpose | Severity |
|---|---|---|
| No Placeholder URLs | Ensure real canonical URLs | Error |
| Unique Identifiers | No duplicate doc/query/section IDs | Error |
| Referential Integrity | Query chunks reference valid sections | Error |
| Content Quality | Title/content length, tag count | Warning |
| Difficulty Distribution | Balanced query difficulty levels | Warning |
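
The three Error-severity rules in the table above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `check_error_rules`, the `PLACEHOLDER_HOSTS` set, and the choice to track document, section, and query IDs in separate namespaces are all assumptions. Documents and queries are taken to be dicts shaped like the two schemas.

```python
from urllib.parse import urlparse

# Hypothetical placeholder hosts; the source does not say which URLs count as placeholders.
PLACEHOLDER_HOSTS = {"example.com", "example.org", "localhost"}


def _add_unique(item_id, seen, kind, errors):
    """Record item_id, reporting an error if it was already seen."""
    if item_id in seen:
        errors.append(f"duplicate {kind} id: {item_id}")
    seen.add(item_id)


def check_error_rules(documents, queries):
    """Return error messages for the three Error-severity rules."""
    errors = []
    doc_ids, section_ids, query_ids = set(), set(), set()

    for doc in documents:
        # Rule: No Placeholder URLs
        host = urlparse(doc["source_url"]).hostname or ""
        if host in PLACEHOLDER_HOSTS:
            errors.append(f"placeholder URL in document {doc['id']}")
        # Rule: Unique Identifiers (documents and sections)
        _add_unique(doc["id"], doc_ids, "document", errors)
        for section in doc["sections"]:
            _add_unique(section["id"], section_ids, "section", errors)

    for query in queries:
        # Rule: Unique Identifiers (queries)
        _add_unique(query["id"], query_ids, "query", errors)
        # Rule: Referential Integrity
        for chunk in query["expected_chunks"]:
            if chunk not in section_ids:
                errors.append(
                    f"query {query['id']} references unknown section {chunk}"
                )

    return errors
```

An empty return value means all three Error-severity rules passed; the Warning-severity rules are not covered by this sketch.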

| Similarity | Action |
|---|---|
| >= 0.90 | Block - Content too similar |
| >= 0.85 | Warn - High similarity detected |
| >= 0.80 | Note - Similar content exists |
| < 0.80 | Allow - Sufficiently unique |
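
The similarity thresholds above map directly onto a small lookup function. The sketch below assumes the similarity score has already been computed elsewhere and lies in the range 0 to 1; the source does not state how the score is produced, and the name `similarity_action` is hypothetical.

```python
def similarity_action(score: float) -> str:
    """Map a similarity score to the action listed in the threshold table."""
    if score >= 0.90:
        return "block"   # Content too similar
    if score >= 0.85:
        return "warn"    # High similarity detected
    if score >= 0.80:
        return "note"    # Similar content exists
    return "allow"       # Sufficiently unique
```

Checking the thresholds from highest to lowest means each score gets the strictest action it qualifies for.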

| Metric | Minimum |
|---|---|
| Tutorials | >= 15% of documents |
| Research papers | >= 5% of documents |
| Domain coverage | >= 5 docs per expected domain |
| Hard queries | >= 10% of queries |
| Adversarial queries | >= 5% of queries |
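
A coverage check against these minimums might look like the sketch below. Several details are assumptions rather than anything the source specifies: the function name `coverage_gaps`, treating a document as covering a domain when the domain name appears in its `tags`, and flagging an empty dataset as below every percentage minimum.

```python
from collections import Counter


def _below(count, total, percent):
    """True if count is under percent% of total (integer math avoids float rounding)."""
    return total == 0 or count * 100 < percent * total


def coverage_gaps(documents, queries, expected_domains):
    """Return a description of each coverage minimum that is not met."""
    gaps = []
    types = Counter(d["content_type"] for d in documents)
    levels = Counter(q["difficulty"] for q in queries)

    if _below(types["tutorial"], len(documents), 15):
        gaps.append("tutorials below 15% of documents")
    if _below(types["research_paper"], len(documents), 5):
        gaps.append("research papers below 5% of documents")

    for domain in expected_domains:
        # Assumption: a document belongs to a domain if the domain is one of its tags.
        n = sum(1 for d in documents if domain in d.get("tags", []))
        if n < 5:
            gaps.append(f"domain {domain!r} has {n} docs, needs 5")

    if _below(levels["hard"], len(queries), 10):
        gaps.append("hard queries below 10% of queries")
    if _below(levels["adversarial"], len(queries), 5):
        gaps.append("adversarial queries below 5% of queries")

    return gaps
```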

| Level | Minimum Count |
|---|---|
| trivial | 3 |
| easy | 3 |
| medium | 5 |
| hard | 3 |
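
The per-level minimum counts can be checked with a simple tally. In this sketch the names `MIN_COUNTS` and `difficulty_shortfalls` are hypothetical, and queries are assumed to be dicts with a `difficulty` field as in the query schema.

```python
from collections import Counter

# Minimum number of queries per difficulty level, copied from the table above.
MIN_COUNTS = {"trivial": 3, "easy": 3, "medium": 5, "hard": 3}


def difficulty_shortfalls(queries):
    """Return {level: number of queries still missing} for levels under their minimum."""
    counts = Counter(q["difficulty"] for q in queries)
    return {
        level: minimum - counts[level]
        for level, minimum in MIN_COUNTS.items()
        if counts[level] < minimum
    }
```

An empty dict means every listed level meets its minimum. The table gives no minimum for `adversarial`, so this sketch does not check it.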

- references/validation-rules.md
- references/quality-metrics.md
- golden-dataset-curation
- golden-dataset-management
- pgvector-search