blog-audit

Blog Audit -- Full-Site Health Assessment

Performs a comprehensive blog health assessment across all posts in the project. Scans for quality scores, orphan pages, topic cannibalization, stale content, and AI citation readiness. Uses parallel subagents for efficient analysis and produces a prioritized action queue.

Audit Process

Step 1: Discover Blog Files

Scan the project for all blog content files:
  • Glob for *.md, *.mdx, and *.html in common blog directories
  • Common paths to check:
    • content/
    • posts/
    • blog/
    • src/content/
    • _posts/
    • pages/blog/
    • articles/
    • src/pages/blog/
  • Filter out non-blog files: README, CHANGELOG, LICENSE, config files, SKILL.md, package.json, node_modules
  • Report: "Found N blog files in [directories]"
If no blog files are found in standard locations, search the entire project root for markdown files with blog-like frontmatter (title, date, description).
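The discovery pass above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the function name `discover_blog_files` and the constant names are hypothetical, and the fallback search for blog-like frontmatter is not covered.

```python
from pathlib import Path

# Directories and extensions taken from the list above.
BLOG_DIRS = ["content", "posts", "blog", "src/content", "_posts",
             "pages/blog", "articles", "src/pages/blog"]
EXTENSIONS = {".md", ".mdx", ".html"}
# File stems treated as non-blog files (compared case-insensitively).
EXCLUDED_STEMS = {"README", "CHANGELOG", "LICENSE", "SKILL"}


def discover_blog_files(root):
    """Return sorted blog file paths found under the common blog directories."""
    root = Path(root)
    found = set()
    for rel in BLOG_DIRS:
        base = root / rel
        if not base.is_dir():
            continue
        for path in base.rglob("*"):
            if not path.is_file() or path.suffix.lower() not in EXTENSIONS:
                continue
            if "node_modules" in path.parts:
                continue
            if path.stem.upper() in EXCLUDED_STEMS:
                continue
            found.add(path)
    return sorted(found)
```

The "Found N blog files in [directories]" report line could then be built from `len(files)` and the distinct parent directories of the returned paths.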

Step 2: Parallel Analysis

Spawn subagents via the Task tool for parallel processing across all discovered blog files:

Content Quality Agent

  • Score each post on the 30-point content quality scale
  • Check paragraph length (target 40-80 words, hard limit 150)
  • Check sentence length (target 15-20 words)
  • Evaluate heading structure and question-format headings
  • Assess readability (Flesch Reading Ease 60-70 target)
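The paragraph and sentence length checks could look something like the sketch below. It assumes plain text with blank-line paragraph breaks and a naive sentence splitter; headings and list items are counted as paragraphs, and the Flesch Reading Ease score is not computed here. All function names are hypothetical.

```python
import re


def paragraph_word_counts(text):
    """Word count of each blank-line-separated paragraph."""
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [len(p.split()) for p in paragraphs]


def average_sentence_length(text):
    """Mean words per sentence, splitting naively on ., ! and ?."""
    parts = re.split(r"[.!?]+\s+|[.!?]+$", text.strip())
    sentences = [s for s in parts if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


def check_paragraphs(text, target=(40, 80), hard_limit=150):
    """Return (paragraph number, word count, problem) for each flagged paragraph."""
    issues = []
    for number, count in enumerate(paragraph_word_counts(text), start=1):
        if count > hard_limit:
            issues.append((number, count, "over hard limit"))
        elif not target[0] <= count <= target[1]:
            issues.append((number, count, "outside target"))
    return issues
```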

SEO Optimization Agent

  • Check on-page SEO elements per post:
    • Title tag length (50-60 chars)
    • Meta description (150-160 chars, includes statistic)
    • H1 presence and uniqueness
    • Image alt text coverage
    • Internal and external link counts
    • URL slug quality
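As a rough illustration of the two length checks, the sketch below validates a title and meta description against the ranges listed above. The function name is hypothetical, and treating "contains a digit" as a proxy for "includes a statistic" is an assumption of this sketch rather than a rule from the audit.

```python
import re


def check_title_and_description(title, description):
    """Return a list of human-readable issues for the title and meta description."""
    issues = []
    if not 50 <= len(title) <= 60:
        issues.append(f"title length {len(title)} outside 50-60 chars")
    if not 150 <= len(description) <= 160:
        issues.append(f"meta description length {len(description)} outside 150-160 chars")
    # Assumption: a statistic shows up as at least one digit in the description.
    if not re.search(r"\d", description):
        issues.append("meta description has no digit, so probably no statistic")
    return issues
```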

Schema Validation Agent

  • Detect structured data across all posts
  • Validate BlogPosting schema completeness
  • Check FAQ schema presence and format
  • Verify dateModified matches lastUpdated frontmatter
  • Flag missing or malformed schema
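One way to approach these checks is sketched below, assuming the schema is embedded as JSON-LD in `<script type="application/ld+json">` tags of an HTML page. The set of fields treated as required is an assumption of this sketch (the audit does not define "completeness" here), `validate_blog_posting` is a hypothetical name, and `@graph` wrappers are not handled.

```python
import json
import re

LD_JSON = re.compile(
    r"""<script[^>]*type=["']application/ld\+json["'][^>]*>(.*?)</script>""",
    re.DOTALL | re.IGNORECASE,
)
# Assumption: fields this sketch treats as required for a complete BlogPosting.
ASSUMED_REQUIRED = ("headline", "datePublished", "dateModified", "author")


def validate_blog_posting(html, last_updated):
    """Return schema issues for one page; last_updated is the frontmatter date string."""
    issues = []
    blocks = []
    for raw in LD_JSON.findall(html):
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            issues.append("malformed JSON-LD block")
            continue
        blocks.extend(data if isinstance(data, list) else [data])
    posting = next(
        (b for b in blocks if isinstance(b, dict) and b.get("@type") == "BlogPosting"),
        None,
    )
    if posting is None:
        issues.append("missing BlogPosting schema")
        return issues
    for field in ASSUMED_REQUIRED:
        if field not in posting:
            issues.append(f"BlogPosting missing {field}")
    # Compare only the date part (YYYY-MM-DD) of both values.
    modified = str(posting.get("dateModified", ""))[:10]
    expected = str(last_updated or "")[:10]
    if modified and expected and modified != expected:
        issues.append(f"dateModified {modified} does not match lastUpdated {expected}")
    return issues
```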

Link Health Agent

  • Map internal links across all posts
  • Build a directed link graph
  • Detect orphan pages (zero inbound internal links)
  • Detect dead-end pages (zero outbound internal links)
  • Check for broken internal link targets
  • Recommend bidirectional link opportunities

Freshness Check Agent

  • Read lastUpdated or dateModified from each post's frontmatter
  • Calculate days since last update
  • Flag posts not updated in 90+ days
  • Categorize by refresh priority

AI Readiness Agent

  • Score each post for AI citation readiness
  • Check passage-level citability (120-180 word sections)
  • Evaluate Q&A formatting and entity clarity
  • Check for TL;DR boxes and citation capsules
  • Assess AI crawler accessibility
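The passage-level check (sections of 120-180 words) might be approximated as below. This sketch assumes Markdown input with ATX headings (`#` through `######`) and counts words per section; it does not skip fenced code blocks, so a `#` comment inside code would be misread as a heading. Function names are hypothetical.

```python
import re


def section_word_counts(markdown):
    """Map each heading to the word count of the body text that follows it."""
    counts = {}
    heading = "(intro)"  # text that appears before the first heading
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*)", line)
        if match:
            heading = match.group(1).strip()
            counts.setdefault(heading, 0)
        else:
            counts[heading] = counts.get(heading, 0) + len(line.split())
    return counts


def citable_sections(markdown, low=120, high=180):
    """Headings whose sections fall inside the citable word range."""
    return [h for h, n in section_word_counts(markdown).items() if low <= n <= high]
```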

Step 3: Topic Cannibalization Detection

Analyze across all posts for keyword competition:
  1. Extract primary keyword/topic from each post:
    • Title text
    • H1 heading
    • Meta description
    • First paragraph
  2. Normalize keywords (lowercase, remove stop words)
  3. Detect multiple posts targeting the same primary keyword
  4. Flag competing posts with one of these recommendations:
    • Merge: Combine two weak posts into one strong post
    • Redirect: 301 redirect the weaker post to the stronger one
    • Differentiate: Adjust focus so posts target distinct intents
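Steps 2 and 3 above can be sketched as a simple grouping pass. The stop-word list is a small illustrative sample, the function names are hypothetical, and sorting the remaining words (so that "Blog Audit" and "Audit a Blog" collide) is a design choice of this sketch, not something the audit prescribes. Choosing between Merge, Redirect, and Differentiate is left to a later judgment step.

```python
import re
from collections import defaultdict

# Illustrative sample only; a real run would use a fuller stop-word list.
STOP_WORDS = {"a", "an", "the", "to", "for", "of", "in", "on",
              "and", "or", "how", "your", "with", "is"}


def normalize_keyword(text):
    """Lowercase, drop stop words, and sort the remaining words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return " ".join(sorted(w for w in words if w not in STOP_WORDS))


def find_cannibalization(primary_keywords):
    """Group posts by normalized keyword; keep keywords shared by 2+ posts."""
    groups = defaultdict(list)
    for post, keyword in primary_keywords.items():
        groups[normalize_keyword(keyword)].append(post)
    return {k: sorted(posts) for k, posts in groups.items() if k and len(posts) > 1}
```

For example, passing `{"a.md": "How to Audit a Blog", "b.md": "Blog Audit"}` groups both posts under the normalized keyword `audit blog`.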

Step 4: Orphan Page Detection

Build and analyze the internal link graph:
  1. For each blog post, extract all internal links (relative and absolute)
  2. Build an adjacency map: { page -> [pages it links to] }
  3. Build a reverse map: { page -> [pages linking to it] }
  4. Identify orphan pages: posts with zero inbound internal links
  5. Identify dead-end pages: posts with zero outbound internal links
  6. For each orphan, recommend 2-3 existing posts that should link to it based on topic relevance
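A minimal sketch of steps 2 through 5 follows. It assumes link extraction has already produced a mapping from each page to the internal targets it links to, with targets resolved to the same identifiers used as keys. Links to unknown pages and self-links are ignored here, which is an assumption of this sketch; the function names are hypothetical.

```python
def build_link_graph(links_by_page):
    """Return (outbound, inbound) adjacency maps restricted to known pages."""
    pages = set(links_by_page)
    outbound = {
        page: sorted({t for t in targets if t in pages and t != page})
        for page, targets in links_by_page.items()
    }
    inbound = {page: [] for page in pages}
    for source, targets in outbound.items():
        for target in targets:
            inbound[target].append(source)
    return outbound, inbound


def find_orphans_and_dead_ends(links_by_page):
    """Orphans have no inbound internal links; dead ends have no outbound ones."""
    outbound, inbound = build_link_graph(links_by_page)
    orphans = sorted(p for p, sources in inbound.items() if not sources)
    dead_ends = sorted(p for p, targets in outbound.items() if not targets)
    return orphans, dead_ends
```

Recommending link sources for each orphan (step 6) depends on topic relevance and is not attempted in this sketch.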

Step 5: Stale Content Detection

Audit content freshness across all posts:
  1. Read frontmatter fields: lastUpdated, dateModified, date, updated
  2. Calculate days since last update for each post
  3. Categorize by refresh priority:
    • High (>180 days): Likely outdated, statistics may be stale
    • Medium (90-180 days): Review for accuracy, update statistics
    • Low (<90 days): Recently updated, no immediate action
  4. Estimate refresh effort per post:
    • Light refresh: Update statistics, check links (1-2 hours)
    • Moderate refresh: Rewrite sections, add new data (3-4 hours)
    • Heavy refresh: Full rewrite recommended (5+ hours)
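The date handling and priority thresholds above could be sketched as follows. It assumes frontmatter has already been parsed into a dict with ISO-formatted dates (YYYY-MM-DD, optionally followed by a time). The field precedence, which prefers explicit update fields over `date`, is an assumption of this sketch, and the function names are hypothetical. Effort estimation is not covered.

```python
from datetime import date

# Assumption: prefer explicit "updated" fields over the original publish date.
DATE_FIELDS = ("lastUpdated", "dateModified", "updated", "date")


def last_update(frontmatter):
    """Return the most relevant date from the frontmatter, or None if absent."""
    for field in DATE_FIELDS:
        value = frontmatter.get(field)
        if value:
            return date.fromisoformat(str(value)[:10])
    return None


def days_stale(frontmatter, today):
    """Days since the last update, or None when no date field is present."""
    updated = last_update(frontmatter)
    return None if updated is None else (today - updated).days


def refresh_priority(days):
    """Map days since last update to the priority bands listed above."""
    if days > 180:
        return "High"
    if days >= 90:
        return "Medium"
    return "Low"
```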

Step 6: Generate Site-Wide Report

Aggregate all results into a comprehensive report:

Summary Dashboard

Blog Audit Report

Audit Date: [date] Total Posts: N Average Score: XX/100

Health Overview

| Metric | Count |
|---|---|
| Posts Scoring 90+ (Excellent) | N |
| Posts Scoring 70-89 (Good) | N |
| Posts Scoring 50-69 (Needs Work) | N |
| Posts Scoring <50 (Poor) | N |
| Orphan Pages | N |
| Dead-End Pages | N |
| Cannibalization Issues | N |
| Stale Content (90+ days) | N |

Per-Post Table

Per-Post Scores

| Post | Score | Content | SEO | E-E-A-T | Technical | AI Citation | Issues |
|---|---|---|---|---|---|---|---|
| [filename] | XX/100 | X/25 | X/20 | X/20 | X/15 | X/20 | [count] |

Prioritized Action Queue

Prioritized Action Queue (Lowest Score First)

| Priority | Post | Score | Top Issue | Recommended Action |
|---|---|---|---|---|
| 1 | [file] | XX | [issue] | [action] |
| 2 | [file] | XX | [issue] | [action] |

Cannibalization Report

Topic Cannibalization

| Keyword | Competing Posts | Recommendation |
|---|---|---|
| [keyword] | post-a.md, post-b.md | Merge / Redirect / Differentiate |

Orphan Pages

Orphan Pages (No Inbound Links)

| Page | Inbound Links | Recommended Link Sources |
|---|---|---|
| [file] | 0 | post-a.md, post-b.md, post-c.md |

Stale Content

Stale Content

| Post | Last Updated | Days Stale | Priority | Refresh Effort |
|---|---|---|---|---|
| [file] | [date] | [N] | High/Med/Low | Light/Moderate/Heavy |

Step 7: Save Report

Save the complete report to blog-audit-report.md in the project root.
After saving, inform the user:
  • Report location: [project-root]/blog-audit-report.md
  • Summary of findings (total posts, average score, critical issues count)
  • Suggest running /blog analyze <file> on the lowest-scoring post first
  • Suggest running /blog geo <file> for AI citation optimization on key posts