geo-audit

GEO Audit Orchestration Skill

Purpose

This skill performs a comprehensive Generative Engine Optimization (GEO) audit of any website. GEO is the practice of optimizing web content so that AI systems (ChatGPT, Claude, Perplexity, Gemini, etc.) can discover, understand, cite, and recommend it. This audit measures how well a site performs across all GEO dimensions and produces an actionable improvement plan.

Key Insight

Traditional SEO optimizes for search engine rankings. GEO optimizes for AI citation and recommendation. Sites that score high on GEO metrics see 30-115% more visibility in AI-generated responses (Georgia Tech / Princeton / IIT Delhi 2024 study). The two disciplines overlap but have distinct requirements.

Audit Workflow

Phase 1: Discovery and Reconnaissance

Step 1: Fetch Homepage and Detect Business Type

Use WebFetch to retrieve the homepage at the provided URL.
Extract the following signals:
- Page title, meta description, H1 heading
- Navigation menu items (reveals site structure)
- Footer content (reveals business info, location, legal pages)
- Schema.org markup on homepage (Organization, LocalBusiness, etc.)
- Pricing page link (SaaS indicator)
- Product listing patterns (E-commerce indicator)
- Blog/resource section (Publisher indicator)
- Service pages (Agency indicator)
- Address/phone/Google Maps embed (Local business indicator)
Classify the business type using these patterns:

Business Type	Detection Signals
SaaS	Pricing page, "Sign up" / "Free trial" CTAs, app.domain.com subdomain, feature comparison tables, integration pages
Local Business	Physical address on homepage, Google Maps embed, "Near me" content, LocalBusiness schema, service area pages
E-commerce	Product listings, shopping cart, product schema, category pages, price displays, "Add to cart" buttons
Publisher	Blog-heavy navigation, article schema, author pages, date-based archives, RSS feeds, high content volume
Agency/Services	Case studies, portfolio, "Our Work" section, team page, client logos, service descriptions
Hybrid	Combination of above signals -- classify by dominant pattern
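
The detection table can be read as a simple signal-counting rule. The sketch below is one possible interpretation, not the skill's actual classifier: the signal names (`pricing_page`, `maps_embed`, and so on) are hypothetical labels invented for illustration, and the "two or more signals from a second type means hybrid" threshold is an assumption.

```python
# Hypothetical signal names, loosely following the detection table above.
SIGNALS = {
    "SaaS": {"pricing_page", "free_trial_cta", "app_subdomain",
             "feature_comparison", "integration_pages"},
    "Local Business": {"physical_address", "maps_embed", "near_me_content",
                       "localbusiness_schema", "service_area_pages"},
    "E-commerce": {"product_listings", "shopping_cart", "product_schema",
                   "category_pages", "add_to_cart"},
    "Publisher": {"blog_navigation", "article_schema", "author_pages",
                  "date_archives", "rss_feed"},
    "Agency/Services": {"case_studies", "portfolio", "team_page",
                        "client_logos", "service_descriptions"},
}


def classify_business_type(detected: set[str]) -> tuple[str, bool]:
    """Return (dominant type, is_hybrid) for a set of detected signals."""
    scores = {btype: len(signals & detected) for btype, signals in SIGNALS.items()}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (top, top_score), (_, second_score) = ranked[0], ranked[1]
    if top_score == 0:
        return "Unknown", False
    # Assumption: a second type matching at least two signals marks a hybrid site.
    return top, second_score >= 2
```

A hybrid site is still reported under its dominant pattern, which matches the last row of the table.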

Step 2: Crawl Sitemap and Internal Links

Attempt to fetch `/sitemap.xml` and `/sitemap_index.xml`.
If a sitemap exists, extract up to 50 unique page URLs prioritized by:
- Homepage (always include)
- Top-level navigation pages
- High-value pages (pricing, about, contact, key service/product pages)
- Blog posts (sample 5-10 most recent)
- Category/landing pages
If no sitemap exists, crawl internal links from the homepage:
- Extract all `<a href>` links pointing to the same domain
- Follow up to 2 levels deep
- Prioritize pages linked from main navigation
Respect `robots.txt` directives and do not fetch disallowed paths.
Enforce a maximum of 50 pages and a 30-second timeout per fetch.
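
As a rough illustration of the robots.txt and page-cap rules, the following sketch filters a candidate URL list with Python's standard `urllib.robotparser`. It assumes the URLs are already sorted by the priorities listed above and that the robots.txt body has been fetched separately; the function name `select_crawl_set` and the `geo-audit` user agent string are hypothetical.

```python
from urllib.robotparser import RobotFileParser

MAX_PAGES = 50  # page cap stated in the workflow


def select_crawl_set(urls, robots_txt, user_agent="geo-audit"):
    """Drop disallowed and repeated URLs, then cap the list at MAX_PAGES.

    `urls` is assumed to be in priority order already.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    allowed, seen = [], set()
    for url in urls:
        if url in seen or not parser.can_fetch(user_agent, url):
            continue
        seen.add(url)
        allowed.append(url)
    return allowed[:MAX_PAGES]
```

The 30-second timeout is not shown here; it would be applied by whatever fetch call the audit uses.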

Step 3: Collect Page-Level Data

For each page in the crawl set, record:

URL, title, meta description, canonical URL
H1-H6 heading structure
Word count of main content
Schema.org types present
Internal/external link counts
Images with/without alt text
Open Graph and Twitter Card meta tags
Response status code
Whether the page has structured data
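
A few of these fields can be collected with the standard library alone. The sketch below uses `html.parser` to record the title, heading structure, and image alt-text counts; it is a minimal illustration under the assumption that raw HTML is already available, and the class name `PageDataParser` is hypothetical. Fields such as schema types, link counts, and status codes are left out.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageDataParser(HTMLParser):
    """Collects the title, headings, and image alt-text counts from one page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.headings = []          # list of (level, text) tuples
        self.images_with_alt = 0
        self.images_without_alt = 0
        self._current = None        # tag whose text is being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in HEADING_TAGS:
            self._current = tag
            self._buffer = []
        elif tag == "img":
            # An empty or missing alt attribute counts as "without alt".
            if (dict(attrs).get("alt") or "").strip():
                self.images_with_alt += 1
            else:
                self.images_without_alt += 1

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        text = " ".join("".join(self._buffer).split())  # collapse whitespace
        if tag == "title":
            self.title = text
        else:
            self.headings.append((int(tag[1]), text))
        self._current = None
```

Usage is `parser = PageDataParser(); parser.feed(html)`, after which the attributes hold the collected values.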

Phase 2: Parallel Subagent Delegation

Delegate analysis to 5 specialized subagents. Each subagent operates on the collected page data and produces a score (0-100) plus findings for each category it covers.

Subagent 1: AI Citability Analysis (geo-citability)

Analyze content blocks for quotability by AI systems
Score passage self-containment, answer block quality, statistical density
Identify high-value pages that could be reformatted for better AI citation

Subagent 2: Platform & Brand Analysis (geo-brand-mentions)

Check brand presence across YouTube, Reddit, Wikipedia, LinkedIn
Assess third-party mention volume and sentiment
Score brand authority signals that AI models use for entity recognition

Subagent 3: Technical GEO Infrastructure (geo-crawlers + geo-llmstxt)

Analyze robots.txt for AI crawler access
Check for llms.txt presence and quality
Verify meta tags, headers, and technical accessibility for AI systems
Check page speed and rendering (JS-heavy sites are harder for AI crawlers to read)

Subagent 4: Content E-E-A-T Quality (geo-content)

Evaluate Experience, Expertise, Authoritativeness, Trustworthiness signals
Check author bios, credentials, source citations
Assess content freshness, depth, and originality
Verify "About" page quality and team credentials

Subagent 5: Schema & Structured Data (geo-schema)

Validate all schema.org markup
Check for GEO-critical schema types (FAQ, HowTo, Organization, Product, Article)
Assess schema completeness and accuracy
Identify missing schema opportunities
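
The parallel delegation could be sketched as below. This is only an illustration of the fan-out pattern, assuming each subagent can be represented as a plain callable that takes the shared page data; how subagents are actually invoked depends on the host tool, and `run_subagents` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagents(page_data, subagents):
    """Run each subagent callable on the shared page data in parallel.

    `subagents` maps a subagent name (e.g. "geo-citability") to a callable;
    the callables here stand in for the real subagents.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(subagents))) as pool:
        futures = {name: pool.submit(fn, page_data) for name, fn in subagents.items()}
        # result() blocks until the subagent finishes and re-raises its errors.
        return {name: future.result() for name, future in futures.items()}
```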

Phase 3: Score Aggregation and Report Generation

Composite GEO Score Calculation

The overall GEO Score (0-100) is a weighted average of six category scores:

Category	Weight	What It Measures
AI Citability	25%	How quotable/extractable content is for AI systems
Brand Authority	20%	Third-party mentions, entity recognition signals
Content E-E-A-T	20%	Experience, Expertise, Authoritativeness, Trustworthiness
Technical GEO	15%	AI crawler access, llms.txt, rendering, speed
Schema & Structured Data	10%	Schema.org markup quality and completeness
Platform Optimization	10%	Presence on platforms AI models train on and cite

Formula:

GEO_Score = (Citability * 0.25) + (Brand * 0.20) + (EEAT * 0.20) + (Technical * 0.15) + (Schema * 0.10) + (Platform * 0.10)
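
A worked example of the formula, as a minimal sketch: the weights come from the table above, while the dictionary keys and the sample category scores are made up for illustration.

```python
# Weights from the category table; the key names are illustrative.
WEIGHTS = {
    "citability": 0.25,
    "brand": 0.20,
    "eeat": 0.20,
    "technical": 0.15,
    "schema": 0.10,
    "platform": 0.10,
}


def geo_score(scores: dict[str, float]) -> float:
    """Weighted average of the six category scores, rounded to one decimal."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return round(sum(scores[name] * weight for name, weight in WEIGHTS.items()), 1)


# Hypothetical category scores for one site.
example = {"citability": 70, "brand": 50, "eeat": 60,
           "technical": 80, "schema": 40, "platform": 30}
print(geo_score(example))  # 17.5 + 10 + 12 + 12 + 4 + 3 = 58.5
```

Because the weights sum to 1.0, the composite stays on the same 0-100 scale as the category scores.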

Score Interpretation

Score Range	Rating	Interpretation
90-100	Excellent	Top-tier GEO optimization; site is highly likely to be cited by AI
75-89	Good	Strong GEO foundation with room for improvement
60-74	Fair	Moderate GEO presence; significant optimization opportunities exist
40-59	Poor	Weak GEO signals; AI systems may struggle to cite or recommend
0-39	Critical	Minimal GEO optimization; site is largely invisible to AI systems
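
The bands above map to a short lookup. This sketch assumes each band's lower bound is inclusive and that fractional scores (for example 89.5) fall into the band whose lower bound they meet, which the table does not spell out.

```python
# (lower bound, rating) pairs taken from the interpretation table.
BANDS = [(90, "Excellent"), (75, "Good"), (60, "Fair"), (40, "Poor")]


def rating(score: float) -> str:
    """Map a 0-100 GEO score to its rating label."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label in BANDS:
        if score >= lower_bound:
            return label
    return "Critical"
```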

Issue Severity Classification

Every issue found during the audit is classified by severity:

Critical (Fix Immediately)

All AI crawlers blocked in robots.txt
No indexable content (content rendered only by client-side JavaScript, with no server-side rendering)
Domain-level noindex directive
Site returns 5xx errors on key pages
Complete absence of any structured data
Brand not recognized as an entity by any AI system

High (Fix Within 1 Week)

Key AI crawlers (GPTBot, ClaudeBot, PerplexityBot) blocked
No llms.txt file present
Zero question-answering content blocks on key pages
Missing Organization or LocalBusiness schema
No author attribution on content pages
All content behind login/paywall with no preview

Medium (Fix Within 1 Month)

Partial AI crawler blocking (some allowed, some blocked)
llms.txt exists but is incomplete or malformed
Content blocks average a citability score below 50
Missing FAQ schema on pages with FAQ content
Thin author bios without credentials
No Wikipedia or Reddit brand presence

Low (Optimize When Possible)

Minor schema validation errors
Some images missing alt text
Content freshness issues on non-critical pages
Missing Open Graph tags
Suboptimal heading hierarchy on some pages
LinkedIn company page exists but is incomplete
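
The crawler-related items in the severity lists above can be checked mechanically. The sketch below uses `urllib.robotparser` to test each crawler against the site root and maps the outcome to a severity. Several points are assumptions rather than part of the skill: the extra crawler tokens beyond the three named above, reading "key AI crawlers blocked" as all three being blocked, and testing only the root path.

```python
from urllib.robotparser import RobotFileParser

KEY_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")  # named in the High list
OTHER_CRAWLERS = ("Google-Extended", "CCBot")            # assumption: illustrative extras


def crawler_access_severity(robots_txt, site_root="https://example.com/"):
    """Return "Critical", "High", "Medium", or None for a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    all_crawlers = KEY_CRAWLERS + OTHER_CRAWLERS
    blocked = {bot for bot in all_crawlers if not parser.can_fetch(bot, site_root)}
    if len(blocked) == len(all_crawlers):
        return "Critical"   # every AI crawler is blocked
    if blocked.issuperset(KEY_CRAWLERS):
        return "High"       # the key crawlers are blocked
    if blocked:
        return "Medium"     # partial blocking
    return None             # no AI crawler is blocked at the root
```

A fuller check would also test representative content paths, since a site can allow `/` while disallowing the sections that matter.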

Output Format

Generate a file called

GEO-AUDIT-REPORT.md

with the following structure:

GEO Audit Report: [Site Name]

Audit Date: [Date]
URL: [URL]
Business Type: [Detected Type]
Pages Analyzed: [Count]

Executive Summary

Overall GEO Score: [X]/100 ([Rating])

[2-3 sentence summary of the site's GEO health, biggest strengths, and most critical gaps.]

Score Breakdown

Category	Score	Weight	Weighted Score
AI Citability	[X]/100	25%	[X]
Brand Authority	[X]/100	20%	[X]
Content E-E-A-T	[X]/100	20%	[X]
Technical GEO	[X]/100	15%	[X]
Schema & Structured Data	[X]/100	10%	[X]
Platform Optimization	[X]/100	10%	[X]
Overall GEO Score			[X]/100

Critical Issues (Fix Immediately)

[List each critical issue with specific page URLs and recommended fix]

High Priority Issues

[List each high-priority issue with details]

Medium Priority Issues

[List each medium-priority issue]

Low Priority Issues

[List each low-priority issue]

Category Deep Dives

AI Citability ([X]/100)

[Detailed findings, examples of good/bad passages, rewrite suggestions]

Brand Authority ([X]/100)

[Platform presence map, mention volume, sentiment]

Content E-E-A-T ([X]/100)

[Author quality, source citations, freshness, depth]

Technical GEO ([X]/100)

[Crawler access, llms.txt, rendering, headers]

Schema & Structured Data ([X]/100)

[Schema types found, validation results, missing opportunities]

Platform Optimization ([X]/100)

[Presence on YouTube, Reddit, Wikipedia, etc.]

Quick Wins (Implement This Week)

[Specific, actionable quick win with expected impact]
[Another quick win]
[Another quick win]
[Another quick win]
[Another quick win]

30-Day Action Plan

Week 1: [Theme]

Action item 1
Action item 2

Week 2: [Theme]

Action item 1
Action item 2

Week 3: [Theme]

Action item 1
Action item 2

Week 4: [Theme]

Action item 1
Action item 2

Appendix: Pages Analyzed

URL	Title	GEO Issues
[url]	[title]	[issue count]

---

Quality Gates

Page Limit: Never crawl more than 50 pages per audit. Prioritize high-value pages.
Timeout: 30-second maximum per page fetch. Skip pages that exceed this.
Robots.txt: Always check and respect robots.txt before crawling. Note any AI-specific directives.
Rate Limiting: Wait at least 1 second between page fetches to avoid overloading the server.
Error Handling: Log failed fetches but continue the audit. Report fetch failures in the appendix.
Content Type: Only analyze HTML pages. Skip PDFs, images, and other binary content.
Deduplication: Canonicalize URLs before crawling. Skip duplicate content (e.g., HTTP vs HTTPS, www vs non-www, trailing slashes).
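
The deduplication gate could be implemented along these lines. This is a sketch under stated assumptions: it treats HTTP and HTTPS, `www` and bare hosts, and trailing-slash variants as the same page, which holds for most sites but not all, and it drops fragments while keeping query strings. The function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Normalize scheme, host, and trailing slash so duplicates compare equal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep only non-default ports so http://host:8080 stays distinct.
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


def dedupe(urls):
    """Keep the first URL seen for each canonical form, preserving order."""
    seen, unique = set(), []
    for url in urls:
        key = canonicalize(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```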

Business-Type-Specific Audit Adjustments

SaaS Sites

Extra weight on: Feature comparison tables (high citability), integration pages, documentation quality
Check for: API documentation structure, changelog pages, knowledge base organization
Key schema: SoftwareApplication, FAQPage, HowTo

Local Businesses

Extra weight on: NAP consistency, Google Business Profile signals, local schema
Check for: Service area pages, location-specific content, review markup
Key schema: LocalBusiness, GeoCoordinates, OpeningHoursSpecification

E-commerce Sites

Extra weight on: Product descriptions (citability), comparison content, buying guides
Check for: Product schema completeness, review aggregation, FAQ sections on product pages
Key schema: Product, AggregateRating, Offer, BreadcrumbList

Publishers

Extra weight on: Article quality, author credentials, source citation practices
Check for: Article schema, author pages, publication date freshness, original research
Key schema: Article, NewsArticle, Person (author), ClaimReview

Agency/Services

Extra weight on: Case studies (citability), expertise demonstration, thought leadership
Check for: Portfolio schema, team credentials, industry-specific expertise signals
Key schema: Organization, Service, Person (team), Review
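
The "Key schema" lines for each business type lend themselves to a lookup that reports which expected types a site lacks. The sketch below assumes the schema.org types found during the crawl are available as a set of type names; the function name is hypothetical.

```python
# Key schema types per business type, as listed in the sections above.
KEY_SCHEMA = {
    "SaaS": ["SoftwareApplication", "FAQPage", "HowTo"],
    "Local Business": ["LocalBusiness", "GeoCoordinates", "OpeningHoursSpecification"],
    "E-commerce": ["Product", "AggregateRating", "Offer", "BreadcrumbList"],
    "Publisher": ["Article", "NewsArticle", "Person", "ClaimReview"],
    "Agency/Services": ["Organization", "Service", "Person", "Review"],
}


def missing_key_schema(business_type: str, found_types: set[str]) -> list[str]:
    """Return the key schema types for this business type that were not found."""
    return [t for t in KEY_SCHEMA[business_type] if t not in found_types]
```

For example, a SaaS site exposing only `FAQPage` would be reported as missing `SoftwareApplication` and `HowTo`.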
