/add-golden https://example.com/article
/add-golden https://arxiv.org/abs/2312.xxxxx
---

| Phase | Activities | Output |
|---|---|---|
| 1. Input Collection | Get URL, detect content type | Document metadata |
| 2. Fetch and Extract | Parse document structure | Structured content |
| 3. Quality Analysis | 4 parallel agents evaluate | Raw scores |
| 4. Quality Explanation | Explain WHY each score | Score rationale |
| 5. Bias Detection | Check for bias in content | Bias report |
| 6. Diversity Check | Assess dataset balance | Diversity metrics |
| 7. Validation | Schema, duplicates, gates | Validation status |
| 8. Silver-to-Gold | Promote or mark as silver | Classification |
| 9. Version Tracking | Track changes, rollback | Version entry |

run_in_background=True

| Agent | Focus | Output |
|---|---|---|
| code-quality-reviewer | Accuracy, coherence, depth, relevance | Quality scores |
| workflow-architect | Keyword directness, paraphrase, reasoning | Difficulty level |
| data-pipeline-engineer | Primary/secondary domains, skill level | Tags |
| test-generator | Direct, paraphrased, multi-hop queries | Test queries |
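
The four agents in the table run in parallel during Phase 3. A minimal sketch of that fan-out, assuming a thread pool stands in for whatever mechanism `run_in_background=True` actually triggers; `run_agent` and `analyze_in_parallel` are hypothetical names, and the agent body is a placeholder rather than a real evaluation:

```python
from concurrent.futures import ThreadPoolExecutor

# Agent names and output kinds are taken from the table above.
AGENTS = {
    "code-quality-reviewer": "Quality scores",
    "workflow-architect": "Difficulty level",
    "data-pipeline-engineer": "Tags",
    "test-generator": "Test queries",
}


def run_agent(name: str, document: str) -> dict:
    """Hypothetical placeholder: a real agent would analyze `document` here."""
    return {"agent": name, "output_kind": AGENTS[name], "chars_seen": len(document)}


def analyze_in_parallel(document: str) -> dict:
    """Submit all four agents at once and collect results keyed by agent name."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {name: pool.submit(run_agent, name, document) for name in AGENTS}
        # result() blocks until each agent finishes and re-raises its exceptions.
        return {name: future.result() for name, future in futures.items()}
```

Submitting every agent before waiting on any result is what makes the run parallel; calling `result()` inside the submit loop would serialize the agents.
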
---

| Bias Score | Action |
|---|---|
| 0-2 | Proceed normally |
| 3-5 | Add disclaimer |
| 6-8 | Require user review |
| 9-10 | Recommend against |
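
The score bands above translate directly into a lookup. A minimal sketch, assuming the bias score is an integer from 0 to 10; the function name `bias_action` is hypothetical:

```python
def bias_action(bias_score: int) -> str:
    """Map an integer bias score (0-10) to the action listed in the table."""
    if not 0 <= bias_score <= 10:
        raise ValueError(f"bias score must be in 0..10, got {bias_score}")
    if bias_score <= 2:
        return "Proceed normally"
    if bias_score <= 5:
        return "Add disclaimer"
    if bias_score <= 8:
        return "Require user review"
    return "Recommend against"
```
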
| Status | Criteria | Action |
|---|---|---|
| GOLD | Score >= 0.75, no bias | Add to main dataset |
| SILVER | 0.55 <= Score < 0.75 | Add to silver dataset, track |
| REJECT | Score < 0.55 | Do not add |
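
The thresholds in this table can be expressed as a small classifier. This is a sketch under two assumptions that the table does not state: bias is reduced to a boolean `has_bias` flag, and a document that scores 0.75 or higher but shows bias is held at SILVER rather than rejected. The name `classify` is hypothetical.

```python
def classify(quality_score: float, has_bias: bool) -> str:
    """Return GOLD, SILVER, or REJECT using the thresholds from the table."""
    if quality_score < 0.55:
        return "REJECT"
    if quality_score >= 0.75 and not has_bias:
        return "GOLD"
    # Assumption: a high-scoring but biased document stays at SILVER;
    # the table does not specify this case.
    return "SILVER"
```
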
{
  "version": "1.2.3",
  "change_type": "ADD|UPDATE|REMOVE|PROMOTE",
  "document_id": "doc-123",
  "quality_score": 0.82,
  "rollback_available": true
}

| Update Type | Version Bump |
|---|---|
| Add/Update document | Patch (0.0.X) |
| Remove document | Minor (0.X.0) |
| Schema change | Major (X.0.0) |
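
The bump rules above can be sketched as a function over a `MAJOR.MINOR.PATCH` string. Two assumptions are made here: `"SCHEMA"` is a hypothetical label for a schema change (it is not one of the `change_type` values in the version entry), and `PROMOTE` is left unhandled because the table does not assign it a bump. Resetting the lower components to zero follows the usual semantic versioning convention.

```python
def bump_version(version: str, change_type: str) -> str:
    """Apply the version bump rule for a change type to a 'X.Y.Z' string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change_type in ("ADD", "UPDATE"):
        return f"{major}.{minor}.{patch + 1}"  # patch bump
    if change_type == "REMOVE":
        return f"{major}.{minor + 1}.0"  # minor bump, patch resets
    if change_type == "SCHEMA":  # hypothetical label, see the note above
        return f"{major + 1}.0.0"  # major bump, minor and patch reset
    # PROMOTE and anything else: the table gives no rule, so refuse to guess.
    raise ValueError(f"no version bump rule for change type {change_type!r}")
```
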
| Dimension | Weight |
|---|---|
| Accuracy | 0.25 |
| Coherence | 0.20 |
| Depth | 0.25 |
| Relevance | 0.30 |

quality_score = accuracy*0.25 + coherence*0.20 + depth*0.25 + relevance*0.30
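
The weighted sum can be computed as follows. This is a minimal sketch that assumes each dimension score lies in the range 0 to 1, which is consistent with the 0.55 and 0.75 thresholds; the names `WEIGHTS` and `quality_score` are illustrative.

```python
# Weights from the Dimension/Weight table; they sum to 1.0.
WEIGHTS = {"accuracy": 0.25, "coherence": 0.20, "depth": 0.25, "relevance": 0.30}


def quality_score(scores: dict) -> float:
    """Combine per-dimension scores into one weighted quality score."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing dimension scores: {sorted(missing)}")
    return sum(scores[dimension] * weight for dimension, weight in WEIGHTS.items())
```

For example, scores of 0.9 (accuracy), 0.8 (coherence), 0.7 (depth), and 0.85 (relevance) give 0.225 + 0.16 + 0.175 + 0.255 = 0.815, which clears the 0.75 threshold for GOLD.
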

| Decision | Choice | Rationale |
|---|---|---|
| Score explanation | Required | Transparency, actionable feedback |
| Bias detection | Dedicated agent | Prevent dataset contamination |
| Two-tier system | Silver + Gold | Allow docs time to mature |
| Version tracking | Semantic versioning | Clear history, safe rollbacks |

golden-dataset-validation, llm-evaluation, test-data-management