Humanize: AI Pattern Detection and Removal
Remove AI-generated writing patterns from text. Produce natural, human-sounding output that preserves meaning.
This is not a generic rewriter. It targets specific, documented AI-writing patterns catalogued by Wikipedia's WikiProject AI Cleanup from thousands of observed instances.
Workflow
Five phases. Each phase has a clear input, transformation, and output. Do not skip phases.
Phase 1: Detection Scan
Read the input text. Load references/detection-patterns.md. Scan for two categories of signals:
A. Lexical patterns (the 24 catalogued AI-writing patterns):
| Category | Patterns | Priority |
|---|---|---|
| Content inflation | Significance puffing, notability claims, superficial -ing analyses, promotional language, vague attributions, formulaic challenges sections | HIGH — loudest AI tells |
| Vocabulary | AI-frequency words, copula avoidance, filler phrases, excessive hedging | HIGH — statistically detectable |
| Structure | Rule of three, negative parallelisms, elegant variation, false ranges, inline-header lists | MEDIUM — structural fingerprints |
| Style | Em dash overuse, boldface overuse, title case headings, emoji decoration, curly quotes | MEDIUM — formatting tells |
| Communication | Chatbot artifacts, knowledge-cutoff disclaimers, sycophantic tone, generic conclusions | LOW — obvious, usually caught by author |
B. Statistical regularity signals (see references/statistical-signals.md):
| Signal | What to look for |
|---|---|
| Sentence length uniformity | Sentences clustering within a narrow word-count range |
| Low clause density variation | Every sentence has the same number of clauses |
| Flat information density | Every sentence carries roughly the same amount of detail |
| High-frequency phrase templates | Stock collocations and common bigrams/trigrams dominating the text |
| Excessive transition markers | Formal connectives appearing more than 8 per 1,000 words |
| Structural symmetry | Paragraphs and sentences following balanced, mirror-like patterns |
| Uniform inter-sentence cohesion | Every sentence tightly follows the previous with no topic shifts or digressions |
| Generic function word usage | Connectors and prepositions used in textbook-standard distribution with no personal tendencies |
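Two of these signals lend themselves to a quick mechanical check. The sketch below is illustrative only: it assumes a naive regex sentence splitter and a small stand-in connective list, whereas the authoritative word lists and thresholds live in references/statistical-signals.md.

```python
import re
import statistics

# Stand-in connective list for illustration; the real list would come
# from references/statistical-signals.md.
TRANSITIONS = {"moreover", "furthermore", "additionally", "however",
               "consequently", "therefore", "nevertheless", "thus"}

def sentence_lengths(text):
    """Split text into sentences (naively) and return word counts per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]

def length_uniformity(text):
    """Standard deviation of sentence lengths; low values suggest AI smoothness."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return None  # not enough signal for a variance estimate
    return statistics.stdev(lengths)

def transition_rate(text):
    """Formal connectives per 1,000 words (flag when above 8)."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in TRANSITIONS)
    return hits / len(words) * 1000
```

A scan would run both functions per document and flag low `length_uniformity` or a `transition_rate` above the 8-per-1,000 threshold stated in the table.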
Output a detection report using the detection report template (see Output Format).
Instance severity rating:
| Severity | Criteria |
|---|---|
| HIGH | 3+ patterns co-occurring in a single paragraph, or any paragraph saturated with AI vocabulary (5+ signal words) |
| MEDIUM | 1-2 patterns in a paragraph, or a statistical signal present across 3+ consecutive sentences |
| LOW | Isolated single instance of any pattern, or a borderline statistical signal |
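The severity criteria above can be applied mechanically. This is a minimal sketch: the thresholds are taken directly from the table, but the parameter names are illustrative and a real scan would also distinguish a single pattern co-located in one paragraph from one isolated across the whole text.

```python
def rate_severity(patterns_in_paragraph, ai_signal_words, consecutive_stat_sentences):
    """Apply the severity table: 3+ co-occurring patterns or 5+ AI signal
    words is HIGH; any pattern in the paragraph or a statistical signal
    across 3+ consecutive sentences is MEDIUM; everything else is LOW."""
    if patterns_in_paragraph >= 3 or ai_signal_words >= 5:
        return "HIGH"
    if patterns_in_paragraph >= 1 or consecutive_stat_sentences >= 3:
        return "MEDIUM"
    return "LOW"
```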
Phase 2: Structural Rewrite
Transform document structure to break AI-typical organization:
- Convert uniform paragraph lengths to varied blocks
- Merge or split sentences to break rhythmic uniformity
- Reorder clauses where meaning permits
- Convert formulaic list structures to narrative where appropriate
- Remove tripartite constructions unless the content genuinely has three parts
Do not change factual content. Do not add information. Do not remove cited sources, data, or technical terms.
Phase 3: Vocabulary and Style Pass
Apply pattern-specific rewrites from the detection report:
- Replace AI-frequency vocabulary with natural alternatives
- Restore simple copulas (is/are/has) where the text uses elaborate substitutes
- Remove filler phrases and excessive hedging
- Cut promotional language and significance inflation
- Replace vague attributions with specific ones (or remove if no source exists)
Load the appropriate style profile from references/style-guide.md based on the target domain. Apply domain-specific voice calibration.
Phase 4: Entropy and Variation
Human writing has burstiness — irregular rhythm, varied sentence lengths, uneven information density. AI text is statistically smooth. This phase breaks that smoothness.
Load references/statistical-signals.md for target ranges. Apply:
- Sentence length variance: mix short declarative with longer explanatory. Target visible variance across any 5-sentence window.
- Clause density variation: alternate simple sentences (one clause) with compound/complex (2-3 clauses). Do not settle on a uniform clause count.
- Information density variation: let some sentences carry heavy detail while others are light — a summary statement, a reaction, a pivot. Uniform density reads as generated.
- Phrase template breaking: replace stock collocations with specific phrasings. "Play a role in" -> name the specific action. "In terms of" -> delete or restructure.
- Inter-sentence cohesion variation: not every sentence should tightly follow the previous. Allow small topic expansions, brief asides, or contextual jumps that a thinking human would make.
- Function word personalization: vary connector usage. Use "but" in one place, "still" in another, nothing in a third. Do not default to the same conjunction pattern throughout.
- Paragraph length variance: mix single-sentence paragraphs with 4-5 sentence blocks.
- Controlled imperfection: fragments at impact positions, parenthetical asides, concessive turns. Sparingly — seasoning, not structure.
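The 5-sentence-window variance target above can be spot-checked after the pass. This sketch assumes the same naive sentence splitter as Phase 1; the minimum standard deviation of 4.0 words is an illustrative guess, not a value from references/statistical-signals.md.

```python
import re
import statistics

def window_variance_ok(text, window=5, min_stdev=4.0):
    """Return the start indices of every 5-sentence window whose word-count
    standard deviation is too low (i.e., still reads statistically smooth).
    Texts shorter than one window return an empty list by construction."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    flagged = []
    for i in range(len(lengths) - window + 1):
        chunk = lengths[i:i + window]
        if statistics.stdev(chunk) < min_stdev:
            flagged.append(i)  # window starting at sentence i is too uniform
    return flagged
```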
Phase 5: Validation and Output
Two checks before delivering:
Semantic check: Compare rewrite against original. Every factual claim, data point, argument, and technical term in the original must be present in the rewrite. If anything was lost, restore it.
Self-audit: Ask internally: "What still sounds AI-generated about this text?" If residual patterns remain, fix them. One pass only — do not loop indefinitely.
Output the final text followed by a brief changes summary.
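Part of the semantic check can be automated: hard data tokens from the original must reappear in the rewrite. The sketch below is a rough proxy, assuming simple regexes for numbers and URLs; it will not catch reworded factual claims, which still need a manual comparison.

```python
import re

def data_tokens(text):
    """Extract tokens that must survive a rewrite: numbers (with optional
    decimal part or percent sign) and URLs."""
    numbers = re.findall(r"\d+(?:\.\d+)?%?", text)
    urls = re.findall(r"https?://\S+", text)
    return set(numbers) | set(urls)

def lost_data(original, rewrite):
    """Tokens present in the original but missing from the rewrite.
    Anything returned here must be restored before delivery."""
    return data_tokens(original) - data_tokens(rewrite)
```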
Output Format
Full Rewrite / Targeted Fix / Style Shift
[Humanized text]
---
Changes: [2-4 bullet summary of what was changed and why]
Patterns detected: [list of pattern numbers/names found]
Domain: [detected or specified domain]
For short texts (under 100 words), skip the changes summary unless the user requests it.
Detection Only
Detection Report
Domain: [detected or specified]
Overall severity: [HIGH / MEDIUM / LOW]
Patterns found: [count]
Findings
| Location | Pattern | Severity | Evidence |
|---|---|---|---|
| Para 1 | #7 AI vocabulary | HIGH | "delve", "intricate", "pivotal" in same sentence |
| Para 2 | #8 Copula avoidance | MEDIUM | "serves as" instead of "is" |
| Para 1-4 | Sentence length uniformity | MEDIUM | All sentences 18-22 words, SD < 3 |
| ... | ... | ... | ... |
Statistical Signals
| Signal | Status | Detail |
|---|---|---|
| Sentence length variance | FLAG | SD ~3 words (human typical: 7-15) |
| Transition frequency | OK | 5 per 1,000 words |
| ... | ... | ... |
Summary
[1-2 sentences: overall assessment and highest-priority patterns to fix first]
Reference Files
| File | Purpose | Load When |
|---|---|---|
| references/detection-patterns.md | 24 AI-writing patterns with examples | Always (Phase 1) |
| references/statistical-signals.md | 12 statistical regularity signals with target ranges | Phase 1 (scan) and Phase 4 (targets) |
| references/style-guide.md | Domain-specific voice profiles and calibration rules | Phase 3 (match to domain) |
|  | Structural rewrite strategies and entropy techniques | Phase 2 and Phase 4 |
|  | Before/after pairs for academic writing | When domain is academic |
|  | Before/after pairs for blog/casual writing | When domain is blog or social |
|  | Before/after pairs for professional/business writing | When domain is professional |
Domain Detection
If the user does not specify a domain, infer from:
- Vocabulary density and jargon type
- Citation patterns
- Sentence complexity
- Register (formal/informal markers)
Default to professional if ambiguous.
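The inference dimensions above can be approximated with cue scoring. This is a minimal sketch under stated assumptions: the cue lists below are invented for illustration, and a real implementation would derive them from references/style-guide.md and also weigh citation patterns, sentence complexity, and register markers.

```python
# Hypothetical cue lists for illustration only.
DOMAIN_CUES = {
    "academic": ["et al.", "hypothesis", "methodology", "cf."],
    "technical": ["API", "config", "deploy", "runtime"],
    "blog": ["I think", "honestly", "my take"],
    "marketing": ["unlock", "boost", "game-changing"],
}

def infer_domain(text):
    """Score each domain by cue hits; default to professional when no cue
    fires or when the top score is tied between domains."""
    scores = {d: sum(text.count(c) for c in cues) for d, cues in DOMAIN_CUES.items()}
    best = max(scores, key=scores.get)
    top = scores[best]
    if top == 0 or sum(1 for v in scores.values() if v == top) > 1:
        return "professional"  # ambiguous, per the default rule
    return best
```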
Supported domains: academic, technical, blog, social, professional, marketing
Behavioral Constraints
- Never fabricate. Do not add facts, citations, quotes, statistics, or claims not in the original.
- Never remove data. Numbers, dates, names, URLs, and cited sources must survive the rewrite.
- Preserve argument structure. If the original makes points A, B, C in that order with that logic, the rewrite must preserve the logical flow.
- Do not over-humanize. Some text is meant to be neutral and informational. A technical specification does not need personality. Match the appropriate register.
- Respect code blocks and structured data. Do not humanize code, tables, JSON, YAML, or any structured/machine-readable content. Pass these through unchanged.
- One pass through the pipeline. Do not run the 5-phase pipeline recursively. If the output still has tells after Phase 5, note them in the changes summary rather than looping.
Scope Modes
| Mode | Trigger | Behavior |
|---|---|---|
| Full rewrite | "humanize this", "rewrite naturally" | Run all 5 phases |
| Detection only | "check for AI patterns", "does this sound AI" | Run Phase 1 only, output detection report |
| Targeted fix | "fix the AI-sounding parts", "just clean up the obvious stuff" | Run Phase 1, then apply fixes only to HIGH-priority patterns |
| Style shift | "make this more casual/academic/professional" | Run Phases 3-4 with specified domain profile |
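The trigger column maps to a simple keyword dispatch. A sketch, assuming literal substring matching on the example triggers from the table; real requests are paraphrased freely, so this is a fallback heuristic rather than a complete router.

```python
# Trigger phrases taken from the scope-modes table; matching is substring-based.
MODE_TRIGGERS = {
    "detection_only": ["check for ai patterns", "does this sound ai"],
    "targeted_fix": ["fix the ai-sounding parts", "clean up the obvious"],
    "style_shift": ["more casual", "more academic", "more professional"],
}

def select_mode(request):
    """Map a user request to a scope mode; full rewrite is the default."""
    r = request.lower()
    for mode, triggers in MODE_TRIGGERS.items():
        if any(t in r for t in triggers):
            return mode
    return "full_rewrite"
```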
Error Handling
| Problem | Cause | Resolution |
|---|---|---|
| Input under 20 words | Insufficient signal for pattern detection | Report: "Text too short for reliable pattern detection." Apply vocabulary fixes only (Phase 3) if obvious patterns are present. Skip statistical signal analysis. |
| Input is entirely code/structured data | No prose to humanize | Report: "Input is structured data — no humanization applicable." Return input unchanged. |
| Mixed human + AI text | Partial AI generation or human-edited AI output | Run Phase 1 on full text. Flag only paragraphs/sections with detected patterns. Apply Phases 2-4 selectively to flagged sections. Leave clean sections untouched. |
| Domain ambiguous after detection | Input mixes registers (e.g., academic citations in a blog post) | Default to professional. Note the ambiguity in the output: "Domain defaulted to professional — specify if another profile is preferred." |
| Semantic drift detected in Phase 5 | Rewrite altered meaning during structural/vocabulary changes | Restore the drifted factual claim from the original. Do not re-run the full pipeline. Note the restoration in the changes summary. |
| Input contains fabricated citations | Original text has hallucinated sources | Not detectable — this skill humanizes style, not factual accuracy. Pass through unchanged. Note in limitations if the user asks about accuracy. |
| All patterns are LOW severity | Text is mostly human-written with minor tells | In targeted fix mode, report findings but recommend no changes. In full rewrite mode, apply light-touch fixes only — do not over-edit clean text. |
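The first two rows of the table amount to a preflight check before the pipeline runs. The sketch below is a rough stand-in: the structured-data heuristic only looks at the leading character, whereas a real check would also detect code blocks, tables, and YAML.

```python
def preflight(text):
    """Route inputs the pipeline cannot fully process: structured data is
    returned unchanged, and texts under 20 words skip statistical analysis."""
    stripped = text.strip()
    if stripped.startswith(("{", "[", "<")):
        return "structured_data"  # no prose to humanize
    if len(stripped.split()) < 20:
        return "too_short"  # vocabulary fixes only, per the error table
    return "ok"
```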
Integration Point
Other writing skills can import references/detection-patterns.md as a pattern library for their own anti-pattern sweeps. The detection patterns are the shared asset; the pipeline is this skill's domain.
Limitations
- Cannot verify factual accuracy of the original text. Garbage in, humanized garbage out.
- Effectiveness depends on input length. Very short texts (under 20 words) have insufficient signal for pattern detection.
- Style profiles are guidelines, not voice cloning. The output will sound natural but will not match a specific author's voice without additional calibration.
- Does not interact with external AI-detection APIs. Assessment is heuristic, not benchmark-verified.