document-quality-standards

Document Quality Standards

Patterns that complement the official document-skills plugin. Apply these alongside the xlsx, pdf, docx, and pptx skills.

Visual-First Verification

Core principle: Text extraction misses critical details. Always verify visually.
"Only do python printing as a last resort because you will miss important details with text extraction (e.g. figures, tables, diagrams)."

The Render-Inspect-Fix Loop

For ANY document operation (create, edit, convert):
1. Generate/modify document
2. Convert to PNG:
   pdftoppm -png -r 150 document.pdf output
3. Visually inspect the PNG at 100% zoom
4. Fix any issues found
5. REPEAT until clean
Never deliver a document without PNG verification. This catches:
  • Clipped or overlapping text
  • Broken tables
  • Missing figures
  • Formatting inconsistencies
  • Orphans/widows
  • Unreadable characters
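The loop above can be sketched as a small driver. This is a minimal illustration, not part of the official skill: `render_inspect_fix`, its three callbacks, and the `max_rounds` safety cap are hypothetical names and assumptions introduced here. The source only says to repeat until clean.

```python
from typing import Callable, List


def render_inspect_fix(
    render: Callable[[], List[str]],
    inspect: Callable[[List[str]], List[str]],
    fix: Callable[[List[str]], None],
    max_rounds: int = 5,  # assumption: a cap so a stuck loop fails loudly
) -> int:
    """Repeat render -> inspect -> fix until inspect reports no issues.

    Returns the number of rounds used. Raises RuntimeError if the
    document is still not clean after max_rounds.
    """
    for round_number in range(1, max_rounds + 1):
        pages = render()         # e.g. run pdftoppm and collect the PNG paths
        issues = inspect(pages)  # visual review; returns a list of problems
        if not issues:
            return round_number  # clean: safe to deliver
        fix(issues)              # edit the source document, then go around again
    raise RuntimeError(f"document still has issues after {max_rounds} rounds")
```

In practice `inspect` is the step where the PNGs are actually looked at; the driver only enforces that nothing is delivered while the issue list is non-empty.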

Quick Conversion Commands

   # DOCX → PDF → PNG
   soffice --headless --convert-to pdf document.docx
   pdftoppm -png -r 150 document.pdf page

   # PDF → PNG directly
   pdftoppm -png -r 150 document.pdf page

   # PPTX → PDF → PNG
   soffice --headless --convert-to pdf presentation.pptx
   pdftoppm -png -r 150 presentation.pdf slide
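When the commands above need to be scripted, one option is to build the argument lists from the file extension. The sketch below is illustrative only: `conversion_commands` is a hypothetical helper, and it assumes `soffice` writes the converted PDF into the current working directory under the same base name.

```python
from pathlib import Path
from typing import List

OFFICE_SUFFIXES = {".docx", ".pptx"}


def conversion_commands(source: str, prefix: str = "page", dpi: int = 150) -> List[List[str]]:
    """Return the commands needed to turn `source` into PNG pages."""
    path = Path(source)
    suffix = path.suffix.lower()
    commands: List[List[str]] = []
    if suffix in OFFICE_SUFFIXES:
        # Office formats go through PDF first.
        commands.append(["soffice", "--headless", "--convert-to", "pdf", str(path)])
        # Assumption: the PDF lands in the current directory with the same stem.
        pdf = path.with_suffix(".pdf").name
    elif suffix == ".pdf":
        pdf = str(path)
    else:
        raise ValueError(f"unsupported document type: {suffix!r}")
    commands.append(["pdftoppm", "-png", "-r", str(dpi), pdf, prefix])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` in order, so a failed conversion stops the pipeline before any stale PNG is inspected.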

Typography Hygiene

Hyphen Safety

Never use non-breaking hyphens (U+2011). In many viewers they render as boxes or break the layout.

   # WRONG - may render as boxes or break layouts
   text = "co‑author"  # U+2011 non-breaking hyphen

   # CORRECT - always use ASCII hyphen
   text = "co-author"  # U+002D standard hyphen-minus

Detection and fix:

   # Find problematic hyphens
   if '\u2011' in text:
       text = text.replace('\u2011', '-')

   # Also watch for other non-ASCII dashes
   text = text.replace('\u2013', '-')  # en-dash
   text = text.replace('\u2014', '-')  # em-dash (if hyphen intended)
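Before replacing anything, it can help to see where the suspect characters are. The following is a small sketch using only the standard library; `find_suspect_dashes` and the `SUSPECT_DASHES` set are hypothetical names, and the set covers just the three code points named above.

```python
import unicodedata
from typing import List, Tuple

# U+2011 non-breaking hyphen, U+2013 en dash, U+2014 em dash
SUSPECT_DASHES = "\u2011\u2013\u2014"


def find_suspect_dashes(text: str) -> List[Tuple[int, str]]:
    """Return (index, Unicode name) for every suspect dash in `text`."""
    return [
        (index, unicodedata.name(char))
        for index, char in enumerate(text)
        if char in SUSPECT_DASHES
    ]
```

Reporting positions rather than replacing blindly leaves room to keep an en dash or em dash where one was actually intended.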

Citation Format

All citations must be human-readable in standard scholarly format:
  • No internal tool tokens (e.g., 【4:2†source】)
  • No malformed references
  • Include: Author, Title, Source, Date, URL (if applicable)

WRONG:
   See source 【4:2†source】 for details.

CORRECT:
   See Smith (2024), "Document Standards," Journal of Tech, p. 45.
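Leftover tool tokens are easy to miss by eye, so a quick scan can back up the visual check. This sketch assumes tokens always follow the shape of the example above (lenticular brackets containing a dagger); `find_tool_tokens` is a hypothetical helper, and other token formats would need their own patterns.

```python
import re
from typing import List

# Assumption: tokens look like 【4:2†source】, i.e. 【 ... † ... 】
TOOL_TOKEN = re.compile(r"【[^】]*†[^】]*】")


def find_tool_tokens(text: str) -> List[str]:
    """Return every internal tool token found in `text`."""
    return TOOL_TOKEN.findall(text)
```

An empty result does not prove the citations are well formed; it only rules out this one class of leak.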

Spreadsheet Formula Patterns

Complements the xlsx skill's color conventions with additional patterns.

Extended Color Codes

Beyond the standard 5 colors (blue inputs, black formulas, green cross-sheet, red external, yellow assumptions):

| Color | Meaning | Use Case |
| --- | --- | --- |
| Gray text | Static constants | Values that never change (tax rates, conversion factors) |
| Orange background | Review/caution | Cells needing verification or approval |
| Light red background | Errors/issues | Known problems to fix |

Formula Simplicity

Use helper cells instead of complex nested formulas.

WRONG - hard to debug, audit, or modify:
   =IF(AND(B5>100,C5<50),B5*1.1*IF(D5="A",1.2,1),B5*0.9)

CORRECT - use helper columns:
   E5: =B5>100                            (Threshold check)
   F5: =C5<50                             (Secondary check)
   G5: =IF(D5="A",1.2,1)                  (Category multiplier)
   H5: =IF(AND(E5,F5),B5*1.1*G5,B5*0.9)   (Final calculation)

Benefits:
  • Each step is auditable
  • Errors are easier to trace
  • Business logic is visible
  • Modifications are safer
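One way to gain confidence in a helper-cell refactor is to mirror both versions outside the spreadsheet and compare them. The sketch below restates the two formulas as Python functions; `nested` and `with_helpers` are hypothetical names, and the comparison on `d` assumes the category is stored in upper case (Excel's `=` on text ignores case, Python's `==` does not).

```python
def nested(b: float, c: float, d: str) -> float:
    # =IF(AND(B5>100,C5<50),B5*1.1*IF(D5="A",1.2,1),B5*0.9)
    return b * 1.1 * (1.2 if d == "A" else 1) if (b > 100 and c < 50) else b * 0.9


def with_helpers(b: float, c: float, d: str) -> float:
    e = b > 100                   # E5: threshold check
    f = c < 50                    # F5: secondary check
    g = 1.2 if d == "A" else 1    # G5: category multiplier
    return b * 1.1 * g if (e and f) else b * 0.9  # H5: final calculation


def formulas_agree() -> bool:
    """Check both versions on a small grid around the thresholds."""
    return all(
        nested(b, c, d) == with_helpers(b, c, d)
        for b in (50, 100, 101, 200)
        for c in (10, 49, 50, 80)
        for d in ("A", "B")
    )
```

The grid deliberately includes the boundary values 100 and 50, since off-by-one comparisons are the most common slip when a formula is split up.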

Avoid Dynamic Array Functions

For maximum compatibility, avoid:
  • FILTER() - not supported in older Excel
  • XLOOKUP() - Excel 2021 and Microsoft 365 only
  • SORT() - dynamic array function
  • SEQUENCE() - dynamic array function
  • UNIQUE() - dynamic array function
Use classic equivalents:
  • FILTER() → INDEX/MATCH with helper columns
  • XLOOKUP() → INDEX/MATCH
  • SORT() → manual sorting or helper columns
  • SEQUENCE() → manually entered row numbers
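A workbook's formulas can be scanned for the functions listed above before delivery. This is a rough sketch: `find_dynamic_array_calls` is a hypothetical helper that works on formula strings, however they were read, and it will also flag a function name that appears inside a quoted string literal.

```python
import re
from typing import List

# The five functions named in the list above; \b keeps SORTBY( from matching SORT.
DYNAMIC_CALL = re.compile(r"\b(FILTER|XLOOKUP|SORT|SEQUENCE|UNIQUE)\s*\(", re.IGNORECASE)


def find_dynamic_array_calls(formula: str) -> List[str]:
    """Return the flagged function names used in `formula`, upper-cased."""
    return [name.upper() for name in DYNAMIC_CALL.findall(formula)]
```

For example, `find_dynamic_array_calls("=XLOOKUP(A1,B:B,C:C)")` reports `XLOOKUP`, while the classic `=INDEX(C:C,MATCH(A1,B:B,0))` passes cleanly.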

Finance-Specific Formatting

In addition to the xlsx skill standards:

   # Hide gridlines for cleaner appearance
   sheet.sheet_view.showGridLines = False

   # Add borders above totals (not around every cell)
   from openpyxl.styles import Border, Side
   thin_top = Border(top=Side(style='thin'))
   total_cell.border = thin_top

   # Cite sources in cell comments, not adjacent cells
   from openpyxl.comments import Comment
   cell.comment = Comment("Source: 10-K FY2024, p.45", "Analyst")

Quality Checklist

Before delivering any document:
  • PNG verification completed at 100% zoom
  • No clipped or overlapping text
  • Tables render correctly
  • Figures/images display properly
  • No U+2011 or problematic Unicode
  • Citations are human-readable
  • Formulas use helper cells where complex
  • No Excel formula errors (#REF!, #DIV/0!, etc.)
  • Professional, client-ready appearance
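The formula-error item on this checklist lends itself to an automated pre-check. The sketch below assumes the cached cell values have already been read into a plain mapping of cell address to value; `find_error_cells` and `EXCEL_ERRORS` are hypothetical names, and the set lists Excel's standard error literals.

```python
from typing import Dict, List

EXCEL_ERRORS = {"#NULL!", "#DIV/0!", "#VALUE!", "#REF!", "#NAME?", "#NUM!", "#N/A"}


def find_error_cells(values: Dict[str, object]) -> List[str]:
    """Return the addresses whose value is an Excel error literal."""
    return [
        address
        for address, value in values.items()
        if isinstance(value, str) and value.strip() in EXCEL_ERRORS
    ]
```

This complements the PNG check rather than replacing it, since a cell can show a wrong number without showing an error.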

Integration with Official Skills

This skill adds patterns on top of the document-skills plugin:

| Official Skill | This Skill Adds |
| --- | --- |
| xlsx | Helper cells, extended colors, dynamic array warnings |
| pdf | Visual-first philosophy, render-inspect-fix loop |
| docx | Typography hygiene, PNG verification emphasis |
| pptx | Same verification workflow |

Always read both this skill AND the relevant official skill when working with documents.