humanize


Humanize: AI Pattern Detection and Removal


Remove AI-generated writing patterns from text. Produce natural, human-sounding output that preserves meaning.
This is not a generic rewriter. It targets specific, documented AI-writing patterns catalogued by Wikipedia's WikiProject AI Cleanup from thousands of observed instances.

Workflow


Five phases. Each phase has a clear input, transformation, and output. Do not skip phases.

Phase 1: Detection Scan


Read the input text. Load `references/detection-patterns.md`. Scan for two categories of signals:

A. Lexical patterns (the 24 catalogued AI-writing patterns):

| Category | Patterns | Priority |
| --- | --- | --- |
| Content inflation | Significance puffing, notability claims, superficial -ing analyses, promotional language, vague attributions, formulaic challenges sections | HIGH — loudest AI tells |
| Vocabulary | AI-frequency words, copula avoidance, filler phrases, excessive hedging | HIGH — statistically detectable |
| Structure | Rule of three, negative parallelisms, elegant variation, false ranges, inline-header lists | MEDIUM — structural fingerprints |
| Style | Em dash overuse, boldface overuse, title case headings, emoji decoration, curly quotes | MEDIUM — formatting tells |
| Communication | Chatbot artifacts, knowledge-cutoff disclaimers, sycophantic tone, generic conclusions | LOW — obvious, usually caught by author |

B. Statistical regularity signals (see `references/statistical-signals.md`):

| Signal | What to look for |
| --- | --- |
| Sentence length uniformity | Sentences clustering within a narrow word-count range |
| Low clause density variation | Every sentence has the same number of clauses |
| Flat information density | Every sentence carries roughly the same amount of detail |
| High-frequency phrase templates | Stock collocations and common bigrams/trigrams dominating the text |
| Excessive transition markers | Formal connectives appearing more than 8 per 1,000 words |
| Structural symmetry | Paragraphs and sentences following balanced, mirror-like patterns |
| Uniform inter-sentence cohesion | Every sentence tightly follows the previous with no topic shifts or digressions |
| Generic function word usage | Connectors and prepositions used in textbook-standard distribution with no personal tendencies |

Output a detection report using the detection report template (see Output Format).

Instance severity rating:

| Severity | Criteria |
| --- | --- |
| HIGH | 3+ patterns co-occurring in a single paragraph, or any paragraph saturated with AI vocabulary (5+ signal words) |
| MEDIUM | 1-2 patterns in a paragraph, or a statistical signal present across 3+ consecutive sentences |
| LOW | Isolated single instance of any pattern, or a borderline statistical signal |
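The two statistical signals that lend themselves most directly to automation, sentence-length uniformity and transition-marker frequency, can be sketched in code. This is an illustrative sketch, not part of the skill itself: the regex sentence splitter and the connective list are simplified assumptions, and real inputs would need a proper tokenizer.

```python
import re
import statistics

# Illustrative subset of formal connectives; the full list lives in
# references/statistical-signals.md.
CONNECTIVES = {"however", "moreover", "furthermore", "additionally",
               "consequently", "therefore", "nevertheless", "thus"}

def scan(text):
    # Naive sentence split on terminal punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    lengths = [len(s.split()) for s in sentences]
    total_words = sum(lengths)
    # Sentence length uniformity: a low standard deviation flags
    # AI-typical smoothness.
    sd = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    # Excessive transition markers: the table above flags more than
    # 8 per 1,000 words.
    markers = sum(1 for s in sentences for w in s.lower().split()
                  if w.strip(",.;:") in CONNECTIVES)
    per_1000 = markers / total_words * 1000 if total_words else 0.0
    return {"sentence_sd": sd, "transitions_per_1000": per_1000}

report = scan("However, the system works. Moreover, it scales. "
              "Furthermore, it is fast. Therefore, we ship it.")
```

A text this saturated with connectives lands far above the 8-per-1,000 threshold, so both checks fire on it.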

Phase 2: Structural Rewrite


Transform document structure to break AI-typical organization:
  • Convert uniform paragraph lengths to varied blocks
  • Merge or split sentences to break rhythmic uniformity
  • Reorder clauses where meaning permits
  • Convert formulaic list structures to narrative where appropriate
  • Remove tripartite constructions unless the content genuinely has three parts
Do not change factual content. Do not add information. Do not remove cited sources, data, or technical terms.

Phase 3: Vocabulary and Style Pass


Apply pattern-specific rewrites from the detection report:
  • Replace AI-frequency vocabulary with natural alternatives
  • Restore simple copulas (is/are/has) where the text uses elaborate substitutes
  • Remove filler phrases and excessive hedging
  • Cut promotional language and significance inflation
  • Replace vague attributions with specific ones (or remove if no source exists)
Load the appropriate style profile from `references/style-guide.md` based on the target domain. Apply domain-specific voice calibration.

Phase 4: Entropy and Variation


Human writing has burstiness — irregular rhythm, varied sentence lengths, uneven information density. AI text is statistically smooth. This phase breaks that smoothness.
Load `references/statistical-signals.md` for target ranges. Apply:
  • Sentence length variance: mix short declarative with longer explanatory. Target visible variance across any 5-sentence window.
  • Clause density variation: alternate simple sentences (one clause) with compound/complex (2-3 clauses). Do not settle on a uniform clause count.
  • Information density variation: let some sentences carry heavy detail while others are light — a summary statement, a reaction, a pivot. Uniform density reads as generated.
  • Phrase template breaking: replace stock collocations with specific phrasings. "Play a role in" -> name the specific action. "In terms of" -> delete or restructure.
  • Inter-sentence cohesion variation: not every sentence should tightly follow the previous. Allow small topic expansions, brief asides, or contextual jumps that a thinking human would make.
  • Function word personalization: vary connector usage. Use "but" in one place, "still" in another, nothing in a third. Do not default to the same conjunction pattern throughout.
  • Paragraph length variance: mix single-sentence paragraphs with 4-5 sentence blocks.
  • Controlled imperfection: fragments at impact positions, parenthetical asides, concessive turns. Sparingly — seasoning, not structure.
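The sentence-length-variance target above can be checked mechanically. A minimal sketch, assuming a naive regex sentence splitter and an illustrative minimum spread of 8 words per 5-sentence window; the skill itself does not fix a numeric threshold:

```python
import re

def window_variance_ok(text, window=5, min_spread=8):
    # Phase 4 target: visible length variance in any 5-sentence window.
    # min_spread=8 is an assumption made for this sketch.
    lengths = [len(s.split())
               for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if len(lengths) < window:
        return True  # too short to evaluate
    return all(max(w) - min(w) >= min_spread
               for w in (lengths[i:i + window]
                         for i in range(len(lengths) - window + 1)))

smooth = "One two three four five. " * 6  # every sentence 5 words long
bursty = ("No. The longer explanatory sentence carries the detail here, "
          "spelling things out. Then a pivot. Another long compound sentence "
          "follows with several clauses and extra context packed in. "
          "Short again.")
```

The smooth sample fails every window (spread 0), while the bursty sample mixes 1-word and 13-word sentences and passes.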

Phase 5: Validation and Output


Two checks before delivering:
Semantic check: Compare rewrite against original. Every factual claim, data point, argument, and technical term in the original must be present in the rewrite. If anything was lost, restore it.
Self-audit: Ask internally: "What still sounds AI-generated about this text?" If residual patterns remain, fix them. One pass only — do not loop indefinitely.
Output the final text followed by a brief changes summary.
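One slice of the semantic check is automatable: numbers are easy to extract and must survive the rewrite verbatim. A sketch covering only numeric data points, since claims, arguments, and technical terms still need judgment:

```python
import re

def data_preserved(original, rewrite):
    # Returns the numbers from the original that are missing from the
    # rewrite. Empty list means the numeric slice of the semantic
    # check passes; anything returned must be restored.
    numbers = re.findall(r"\d+(?:\.\d+)?%?", original)
    return [n for n in numbers if n not in rewrite]

original = "Revenue grew 14% in 2023, reaching $2.1 million."
good = "In 2023 revenue was up 14%, hitting $2.1 million."
bad = "Revenue grew substantially last year, reaching about two million."
```

Here `data_preserved(original, good)` comes back empty, while the `bad` rewrite drops all three data points and would trigger restoration.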

Output Format


Full Rewrite / Targeted Fix / Style Shift


[Humanized text]

---
Changes: [2-4 bullet summary of what was changed and why]
Patterns detected: [list of pattern numbers/names found]
Domain: [detected or specified domain]
For short texts (under 100 words), skip the changes summary unless the user requests it.

Detection Only



Detection Report


Domain: [detected or specified]
Overall severity: [HIGH / MEDIUM / LOW]
Patterns found: [count]

Findings


| Location | Pattern | Severity | Evidence |
| --- | --- | --- | --- |
| Para 1 | #7 AI vocabulary | HIGH | "delve", "intricate", "pivotal" in same sentence |
| Para 2 | #8 Copula avoidance | MEDIUM | "serves as" instead of "is" |
| Para 1-4 | Sentence length uniformity | MEDIUM | All sentences 18-22 words, SD < 3 |
| ... | ... | ... | ... |

Statistical Signals


| Signal | Status | Detail |
| --- | --- | --- |
| Sentence length variance | FLAG | SD ~3 words (human typical: 7-15) |
| Transition frequency | OK | 5 per 1,000 words |
| ... | ... | ... |

Summary


[1-2 sentences: overall assessment and highest-priority patterns to fix first]

Reference Files


| File | Purpose | Load When |
| --- | --- | --- |
| `references/detection-patterns.md` | 24 AI-writing patterns with examples | Always (Phase 1) |
| `references/statistical-signals.md` | 12 statistical regularity signals with target ranges | Phase 1 (scan) and Phase 4 (targets) |
| `references/style-guide.md` | Domain-specific voice profiles and calibration rules | Phase 3 (match to domain) |
| `references/transformation-rules.md` | Structural rewrite strategies and entropy techniques | Phase 2 and Phase 4 |
| `examples/academic.md` | Before/after pairs for academic writing | When domain is academic |
| `examples/blog.md` | Before/after pairs for blog/casual writing | When domain is blog or social |
| `examples/professional.md` | Before/after pairs for professional/business writing | When domain is professional |

Domain Detection


If the user does not specify a domain, infer from:
  1. Vocabulary density and jargon type
  2. Citation patterns
  3. Sentence complexity
  4. Register (formal/informal markers)
Default to professional if ambiguous.
Supported domains: `academic`, `technical`, `blog`, `social`, `professional`, `marketing`
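As a rough illustration, the inference steps above could be approximated with keyword scoring. The cue lists here are invented for the example; the skill's real calibration lives in `references/style-guide.md`:

```python
# Illustrative cue lists only, not the skill's actual criteria.
DOMAIN_CUES = {
    "academic": ["et al.", "hypothesis", "methodology", "literature"],
    "technical": ["API", "config", "runtime", "deploy"],
    "blog": ["I think", "honestly", "my take"],
    "marketing": ["unlock", "boost", "transform your", "sign up"],
}

def infer_domain(text):
    scores = {domain: sum(text.count(cue) for cue in cues)
              for domain, cues in DOMAIN_CUES.items()}
    best = max(scores, key=scores.get)
    # Default to professional when no cue list clearly wins.
    return best if scores[best] > 0 else "professional"
```

A sentence like "We deploy the API with a config change." scores highest for `technical`; cue-free text falls back to `professional`, matching the default above.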

Behavioral Constraints


  1. Never fabricate. Do not add facts, citations, quotes, statistics, or claims not in the original.
  2. Never remove data. Numbers, dates, names, URLs, and cited sources must survive the rewrite.
  3. Preserve argument structure. If the original makes points A, B, C in that order with that logic, the rewrite must preserve the logical flow.
  4. Do not over-humanize. Some text is meant to be neutral and informational. A technical specification does not need personality. Match the appropriate register.
  5. Respect code blocks and structured data. Do not humanize code, tables, JSON, YAML, or any structured/machine-readable content. Pass these through unchanged.
  6. One pass through the pipeline. Do not run the 5-phase pipeline recursively. If the output still has tells after Phase 5, note them in the changes summary rather than looping.

Scope Modes


| Mode | Trigger | Behavior |
| --- | --- | --- |
| Full rewrite | "humanize this", "rewrite naturally" | Run all 5 phases |
| Detection only | "check for AI patterns", "does this sound AI" | Run Phase 1 only, output detection report |
| Targeted fix | "fix the AI-sounding parts", "just clean up the obvious stuff" | Run Phase 1, then apply fixes only to HIGH-priority patterns |
| Style shift | "make this more casual/academic/professional" | Run Phases 3-4 with specified domain profile |

Error Handling


| Problem | Cause | Resolution |
| --- | --- | --- |
| Input under 20 words | Insufficient signal for pattern detection | Report: "Text too short for reliable pattern detection." Apply vocabulary fixes only (Phase 3) if obvious patterns are present. Skip statistical signal analysis. |
| Input is entirely code/structured data | No prose to humanize | Report: "Input is structured data — no humanization applicable." Return input unchanged. |
| Mixed human + AI text | Partial AI generation or human-edited AI output | Run Phase 1 on full text. Flag only paragraphs/sections with detected patterns. Apply Phases 2-4 selectively to flagged sections. Leave clean sections untouched. |
| Domain ambiguous after detection | Input mixes registers (e.g., academic citations in a blog post) | Default to professional. Note the ambiguity in the output: "Domain defaulted to professional — specify if another profile is preferred." |
| Semantic drift detected in Phase 5 | Rewrite altered meaning during structural/vocabulary changes | Restore the drifted factual claim from the original. Do not re-run the full pipeline. Note the restoration in the changes summary. |
| Input contains fabricated citations | Original text has hallucinated sources | Not detectable — this skill humanizes style, not factual accuracy. Pass through unchanged. Note in limitations if the user asks about accuracy. |
| All patterns are LOW severity | Text is mostly human-written with minor tells | In targeted fix mode, report findings but recommend no changes. In full rewrite mode, apply light-touch fixes only — do not over-edit clean text. |

Integration Point


Other writing skills can import `references/detection-patterns.md` as a pattern library for their own anti-pattern sweeps. The detection patterns are the shared asset; the pipeline is this skill's domain.

Limitations


  • Cannot verify factual accuracy of the original text. Garbage in, humanized garbage out.
  • Effectiveness depends on input length. Very short texts (under 20 words) have insufficient signal for pattern detection.
  • Style profiles are guidelines, not voice cloning. The output will sound natural but will not match a specific author's voice without additional calibration.
  • Does not interact with external AI-detection APIs. Assessment is heuristic, not benchmark-verified.