geo-audit
GEO Audit Orchestration Skill
Purpose
This skill performs a comprehensive Generative Engine Optimization (GEO) audit of any website. GEO is the practice of optimizing web content so that AI systems (ChatGPT, Claude, Perplexity, Gemini, etc.) can discover, understand, cite, and recommend it. This audit measures how well a site performs across all GEO dimensions and produces an actionable improvement plan.
Key Insight
Traditional SEO optimizes for search engine rankings. GEO optimizes for AI citation and recommendation. Sites that score high on GEO metrics see 30-115% more visibility in AI-generated responses (Georgia Tech / Princeton / IIT Delhi 2024 study). The two disciplines overlap but have distinct requirements.
Audit Workflow
Phase 1: Discovery and Reconnaissance
Step 1: Fetch Homepage and Detect Business Type
- Use WebFetch to retrieve the homepage at the provided URL.
- Extract the following signals:
  - Page title, meta description, H1 heading
  - Navigation menu items (reveals site structure)
  - Footer content (reveals business info, location, legal pages)
  - Schema.org markup on homepage (Organization, LocalBusiness, etc.)
  - Pricing page link (SaaS indicator)
  - Product listing patterns (E-commerce indicator)
  - Blog/resource section (Publisher indicator)
  - Service pages (Agency indicator)
  - Address/phone/Google Maps embed (Local business indicator)
- Classify the business type using these patterns:
| Business Type | Detection Signals |
|---|---|
| SaaS | Pricing page, "Sign up" / "Free trial" CTAs, app.domain.com subdomain, feature comparison tables, integration pages |
| Local Business | Physical address on homepage, Google Maps embed, "Near me" content, LocalBusiness schema, service area pages |
| E-commerce | Product listings, shopping cart, product schema, category pages, price displays, "Add to cart" buttons |
| Publisher | Blog-heavy navigation, article schema, author pages, date-based archives, RSS feeds, high content volume |
| Agency/Services | Case studies, portfolio, "Our Work" section, team page, client logos, service descriptions |
| Hybrid | Combination of above signals -- classify by dominant pattern |
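As a rough illustration of the table above, classification could be sketched as counting matched signals per business type. The signal identifiers, the `classify_business_type` helper, and the `"Unknown"` fallback are hypothetical; the skill does not prescribe an implementation.

```python
from collections import Counter

# Hypothetical signal identifiers, loosely mirroring the detection table.
SIGNALS = {
    "SaaS": {"pricing_page", "free_trial_cta", "app_subdomain",
             "feature_comparison", "integration_pages"},
    "Local Business": {"physical_address", "maps_embed", "near_me_content",
                       "localbusiness_schema", "service_area_pages"},
    "E-commerce": {"product_listings", "shopping_cart", "product_schema",
                   "category_pages", "add_to_cart"},
    "Publisher": {"blog_navigation", "article_schema", "author_pages",
                  "date_archives", "rss_feed"},
    "Agency/Services": {"case_studies", "portfolio", "team_page",
                        "client_logos", "service_descriptions"},
}


def classify_business_type(detected: set) -> tuple:
    """Return (dominant type, is_hybrid) for a set of detected signal names."""
    hits = Counter({btype: len(sigs & detected) for btype, sigs in SIGNALS.items()})
    ranked = [(btype, n) for btype, n in hits.most_common() if n > 0]
    if not ranked:
        # Assumption: the source defines no fallback label for "no signals".
        return ("Unknown", False)
    # Hybrid: more than one type matched; the dominant pattern is still reported.
    return (ranked[0][0], len(ranked) > 1)
```

For example, a site with a pricing page, a free-trial CTA, and case studies would be reported as SaaS with the hybrid flag set.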
Step 2: Crawl Sitemap and Internal Links
- Attempt to fetch /sitemap.xml and /sitemap_index.xml.
- If sitemap exists, extract up to 50 unique page URLs prioritized by:
  - Homepage (always include)
  - Top-level navigation pages
  - High-value pages (pricing, about, contact, key service/product pages)
  - Blog posts (sample 5-10 most recent)
  - Category/landing pages
- If no sitemap exists, crawl internal links from the homepage:
  - Extract all <a href> links pointing to the same domain
  - Follow up to 2 levels deep
  - Prioritize pages linked from main navigation
- Respect robots.txt directives -- do not fetch disallowed paths.
- Enforce a maximum of 50 pages and a 30-second timeout per fetch.
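A minimal sketch of the sitemap branch of Step 2, assuming the sitemap XML has already been fetched. The `priority` heuristic (keyword matching on the URL path) is a hypothetical stand-in for the ordering above: it cannot tell which pages sit in the main navigation, and a sitemap index would need its child sitemaps fetched first.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

MAX_PAGES = 50  # page cap stated in Step 2


def priority(url: str) -> int:
    """Hypothetical ranking: homepage, high-value pages, other pages, blog posts."""
    path = urlparse(url).path.strip("/").lower()
    if not path:
        return 0
    if any(key in path for key in ("pricing", "about", "contact")):
        return 1
    if path.startswith("blog"):
        return 3
    return 2


def extract_sitemap_urls(xml_text: str, limit: int = MAX_PAGES) -> list:
    """Return up to `limit` unique <loc> URLs, ordered by the heuristic above."""
    root = ET.fromstring(xml_text)
    # Sitemap tags are namespaced ("{...}loc"), so compare the local name only.
    locs = [
        el.text.strip()
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == "loc" and el.text
    ]
    unique = list(dict.fromkeys(locs))  # de-duplicate, keep document order
    return sorted(unique, key=priority)[:limit]
```

Sorting before truncating means the cap drops the lowest-priority URLs rather than whichever happen to appear last in the file.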
Step 3: Collect Page-Level Data
For each page in the crawl set, record:
- URL, title, meta description, canonical URL
- H1-H6 heading structure
- Word count of main content
- Schema.org types present
- Internal/external link counts
- Images with/without alt text
- Open Graph and Twitter Card meta tags
- Response status code
- Whether the page has structured data
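Some of the fields above can be read straight from the HTML. The sketch below uses only the standard-library `html.parser` and covers the title, meta description, canonical URL, headings, and image alt counts; the `PageSignals` class name and its attributes are illustrative, and word counts, schema types, and link counts are left out for brevity.

```python
from html.parser import HTMLParser

HEADING_TAGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class PageSignals(HTMLParser):
    """Collects a subset of the page-level fields from raw HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = ""
        self.canonical = ""
        self.headings = []           # list of (tag, text) in document order
        self.images_with_alt = 0
        self.images_without_alt = 0
        self._capture = ""           # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "title" or tag in HEADING_TAGS:
            self._capture, self._buffer = tag, []
        elif tag == "meta" and (a.get("name") or "").lower() == "description":
            self.meta_description = a.get("content") or ""
        elif tag == "link" and (a.get("rel") or "").lower() == "canonical":
            self.canonical = a.get("href") or ""
        elif tag == "img":
            if (a.get("alt") or "").strip():
                self.images_with_alt += 1
            else:
                self.images_without_alt += 1

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._capture and tag == self._capture:
            text = " ".join("".join(self._buffer).split())  # collapse whitespace
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._capture = ""
```

Usage is `p = PageSignals(); p.feed(html_text)`, after which the attributes hold the extracted values.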
Phase 2: Parallel Subagent Delegation
Delegate analysis to five specialized subagents. Each subagent operates on the collected page data and returns findings plus a 0-100 score for each category it covers.
Subagent 1: AI Citability Analysis (geo-citability)
- Analyze content blocks for quotability by AI systems
- Score passage self-containment, answer block quality, statistical density
- Identify high-value pages that could be reformatted for better AI citation
Subagent 2: Platform & Brand Analysis (geo-brand-mentions)
- Check brand presence across YouTube, Reddit, Wikipedia, LinkedIn
- Assess third-party mention volume and sentiment
- Score brand authority signals that AI models use for entity recognition
Subagent 3: Technical GEO Infrastructure (geo-crawlers + geo-llmstxt)
- Analyze robots.txt for AI crawler access
- Check for llms.txt presence and quality
- Verify meta tags, headers, and technical accessibility for AI systems
- Check page speed and rendering (JS-heavy sites are harder for AI crawlers)
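The robots.txt check for AI crawler access can be approximated with the standard-library `urllib.robotparser`. This is a sketch under assumptions: the crawler list is limited to the three user agents this document names (GPTBot, ClaudeBot, PerplexityBot), the `ai_crawler_access` helper is hypothetical, and the robots.txt body is assumed to be already fetched.

```python
from urllib.robotparser import RobotFileParser

# User agents named in this document; a real audit would likely check more.
AI_CRAWLERS = ("GPTBot", "ClaudeBot", "PerplexityBot")


def ai_crawler_access(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Map each AI crawler to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}
```

A result where every value is `False` corresponds to the "all AI crawlers blocked" case; a mix of `True` and `False` corresponds to partial blocking.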
Subagent 4: Content E-E-A-T Quality (geo-content)
- Evaluate Experience, Expertise, Authoritativeness, Trustworthiness signals
- Check author bios, credentials, source citations
- Assess content freshness, depth, and originality
- Verify "About" page quality and team credentials
Subagent 5: Schema & Structured Data (geo-schema)
- Validate all schema.org markup
- Check for GEO-critical schema types (FAQ, HowTo, Organization, Product, Article)
- Assess schema completeness and accuracy
- Identify missing schema opportunities
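The fan-out itself might look like the sketch below, which runs each analyzer over the shared page data in a thread pool. The analyzers are plain callables standing in for the subagents, and the choice to record a failed subagent as a finding rather than abort the audit is an assumption, not something the skill specifies.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagents(pages, analyzers):
    """Run each analyzer on the same page data concurrently; collect results by name.

    `analyzers` maps a subagent name to a callable taking the page list and
    returning a dict such as {"score": 80, "findings": [...]} (hypothetical shape).
    """
    results = {}
    with ThreadPoolExecutor(max_workers=len(analyzers)) as pool:
        futures = {name: pool.submit(fn, pages) for name, fn in analyzers.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception as exc:
                # Assumption: one failing subagent is reported, not fatal.
                results[name] = {"score": None, "findings": [f"subagent failed: {exc}"]}
    return results
```

Threads suit this shape because each subagent is expected to spend most of its time waiting on I/O rather than computing.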
Phase 3: Score Aggregation and Report Generation
Composite GEO Score Calculation
The overall GEO Score (0-100) is a weighted average of six category scores:
| Category | Weight | What It Measures |
|---|---|---|
| AI Citability | 25% | How quotable/extractable content is for AI systems |
| Brand Authority | 20% | Third-party mentions, entity recognition signals |
| Content E-E-A-T | 20% | Experience, Expertise, Authoritativeness, Trustworthiness |
| Technical GEO | 15% | AI crawler access, llms.txt, rendering, speed |
| Schema & Structured Data | 10% | Schema.org markup quality and completeness |
| Platform Optimization | 10% | Presence on platforms AI models train on and cite |
Formula:
GEO_Score = (Citability * 0.25) + (Brand * 0.20) + (EEAT * 0.20) + (Technical * 0.15) + (Schema * 0.10) + (Platform * 0.10)
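The formula translates directly into code. In this sketch the dictionary keys and the `geo_score` helper are illustrative names; the weights are the ones from the table, and rounding to one decimal place is an assumption.

```python
# Weights from the category table; they sum to 1.0.
WEIGHTS = {
    "citability": 0.25,
    "brand": 0.20,
    "eeat": 0.20,
    "technical": 0.15,
    "schema": 0.10,
    "platform": 0.10,
}


def geo_score(scores: dict) -> float:
    """Weighted average of the six category scores (each 0-100)."""
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing category scores: {sorted(missing)}")
    return round(sum(scores[name] * weight for name, weight in WEIGHTS.items()), 1)
```

As a worked example, category scores of 80, 60, 70, 90, 50, and 40 (in table order) give 20 + 12 + 14 + 13.5 + 5 + 4 = 68.5.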
Score Interpretation
| Score Range | Rating | Interpretation |
|---|---|---|
| 90-100 | Excellent | Top-tier GEO optimization; site is highly likely to be cited by AI |
| 75-89 | Good | Strong GEO foundation with room for improvement |
| 60-74 | Fair | Moderate GEO presence; significant optimization opportunities exist |
| 40-59 | Poor | Weak GEO signals; AI systems may struggle to cite or recommend |
| 0-39 | Critical | Minimal GEO optimization; site is largely invisible to AI systems |
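Mapping a score to its rating is a threshold lookup. The sketch below assumes that fractional scores fall into the band whose lower bound they meet (so 89.5 is "Good"), since the table lists integer ranges only; `rating_for` is a hypothetical helper name.

```python
# (lower bound, rating) pairs from the interpretation table, highest first.
RATING_BANDS = [(90, "Excellent"), (75, "Good"), (60, "Fair"), (40, "Poor"), (0, "Critical")]


def rating_for(score: float) -> str:
    """Return the rating label for a GEO score in the range 0-100."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for lower_bound, label in RATING_BANDS:
        if score >= lower_bound:
            return label
    return "Critical"  # unreachable for valid input; kept for completeness
```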
Issue Severity Classification
Every issue found during the audit is classified by severity:
Critical (Fix Immediately)
- All AI crawlers blocked in robots.txt
- No indexable content (JavaScript-rendered only with no SSR)
- Domain-level noindex directive
- Site returns 5xx errors on key pages
- Complete absence of any structured data
- Brand not recognized as an entity by any AI system
High (Fix Within 1 Week)
- Key AI crawlers (GPTBot, ClaudeBot, PerplexityBot) blocked
- No llms.txt file present
- Zero question-answering content blocks on key pages
- Missing Organization or LocalBusiness schema
- No author attribution on content pages
- All content behind login/paywall with no preview
Medium (Fix Within 1 Month)
- Partial AI crawler blocking (some allowed, some blocked)
- llms.txt exists but is incomplete or malformed
- Content blocks average under 50 citability score
- Missing FAQ schema on pages with FAQ content
- Thin author bios without credentials
- No Wikipedia or Reddit brand presence
Low (Optimize When Possible)
- Minor schema validation errors
- Some images missing alt text
- Content freshness issues on non-critical pages
- Missing Open Graph tags
- Suboptimal heading hierarchy on some pages
- LinkedIn company page exists but is incomplete
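One way to carry these severity levels through to the report is an ordered enum plus a small issue record. The `Severity`, `Issue`, and `group_by_severity` names are illustrative; only the four levels and their ordering come from the classification above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    CRITICAL = 0  # fix immediately
    HIGH = 1      # fix within 1 week
    MEDIUM = 2    # fix within 1 month
    LOW = 3       # optimize when possible


@dataclass(frozen=True)
class Issue:
    severity: Severity
    description: str
    url: str = ""  # page the issue was found on, if any


def group_by_severity(issues):
    """Bucket issues by severity, most urgent bucket first."""
    grouped = {level: [] for level in Severity}
    for issue in issues:
        grouped[issue.severity].append(issue)
    return grouped
```

Because `Severity` is an `IntEnum`, `sorted(issues, key=lambda i: i.severity)` also yields most-urgent-first order when a flat list is preferred.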
Output Format
Generate a file called GEO-AUDIT-REPORT.md with the following structure:
GEO Audit Report: [Site Name]
Audit Date: [Date]
URL: [URL]
Business Type: [Detected Type]
Pages Analyzed: [Count]
Executive Summary
Overall GEO Score: [X]/100 ([Rating])
[2-3 sentence summary of the site's GEO health, biggest strengths, and most critical gaps.]
Score Breakdown
| Category | Score | Weight | Weighted Score |
|---|---|---|---|
| AI Citability | [X]/100 | 25% | [X] |
| Brand Authority | [X]/100 | 20% | [X] |
| Content E-E-A-T | [X]/100 | 20% | [X] |
| Technical GEO | [X]/100 | 15% | [X] |
| Schema & Structured Data | [X]/100 | 10% | [X] |
| Platform Optimization | [X]/100 | 10% | [X] |
| Overall GEO Score | | | [X]/100 |
Critical Issues (Fix Immediately)
[List each critical issue with specific page URLs and recommended fix]
High Priority Issues
[List each high-priority issue with details]
Medium Priority Issues
[List each medium-priority issue]
Low Priority Issues
[List each low-priority issue]
Category Deep Dives
AI Citability ([X]/100)
[Detailed findings, examples of good/bad passages, rewrite suggestions]
Brand Authority ([X]/100)
[Platform presence map, mention volume, sentiment]
Content E-E-A-T ([X]/100)
[Author quality, source citations, freshness, depth]
Technical GEO ([X]/100)
[Crawler access, llms.txt, rendering, headers]
Schema & Structured Data ([X]/100)
[Schema types found, validation results, missing opportunities]
Platform Optimization ([X]/100)
[Presence on YouTube, Reddit, Wikipedia, etc.]
Quick Wins (Implement This Week)
- [Specific, actionable quick win with expected impact]
- [Another quick win]
- [Another quick win]
- [Another quick win]
- [Another quick win]
30-Day Action Plan
Week 1: [Theme]
- Action item 1
- Action item 2
Week 2: [Theme]
- Action item 1
- Action item 2
Week 3: [Theme]
- Action item 1
- Action item 2
Week 4: [Theme]
- Action item 1
- Action item 2
Appendix: Pages Analyzed
| URL | Title | GEO Issues |
|---|---|---|
| [url] | [title] | [issue count] |
---
Quality Gates
- Page Limit: Never crawl more than 50 pages per audit. Prioritize high-value pages.
- Timeout: 30-second maximum per page fetch. Skip pages that exceed this.
- Robots.txt: Always check and respect robots.txt before crawling. Note any AI-specific directives.
- Rate Limiting: Wait at least 1 second between page fetches to avoid overloading the server.
- Error Handling: Log failed fetches but continue the audit. Report fetch failures in the appendix.
- Content Type: Only analyze HTML pages. Skip PDFs, images, and other binary content.
- Deduplication: Canonicalize URLs before crawling. Skip duplicate content (e.g., HTTP vs HTTPS, www vs non-www, trailing slashes).
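The deduplication gate depends on a canonical form for URLs. The sketch below is one possible normalization, with the specific choices labeled as assumptions: prefer `https`, strip a leading `www.`, drop trailing slashes and fragments, and keep query strings because they can select distinct content. The `canonicalize` and `dedupe` helpers are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def canonicalize(url: str) -> str:
    """Normalize scheme, host, and trailing slash so equivalent URLs compare equal."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep only non-default ports, since :80 and :443 are implied by the scheme.
    if parts.port and parts.port not in (80, 443):
        host = f"{host}:{parts.port}"
    path = parts.path.rstrip("/") or "/"
    # Assumption: treat http and https as the same page and drop the fragment.
    return urlunsplit(("https", host, path, parts.query, ""))


def dedupe(urls):
    """Return canonical URLs with duplicates removed, preserving first-seen order."""
    return list(dict.fromkeys(canonicalize(u) for u in urls))
```

Sites that serve different content on `www` and the bare domain would need a stricter rule than this one.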
Business-Type-Specific Audit Adjustments
SaaS Sites
- Extra weight on: Feature comparison tables (high citability), integration pages, documentation quality
- Check for: API documentation structure, changelog pages, knowledge base organization
- Key schema: SoftwareApplication, FAQPage, HowTo
Local Businesses
- Extra weight on: NAP consistency, Google Business Profile signals, local schema
- Check for: Service area pages, location-specific content, review markup
- Key schema: LocalBusiness, GeoCoordinates, OpeningHoursSpecification
E-commerce Sites
- Extra weight on: Product descriptions (citability), comparison content, buying guides
- Check for: Product schema completeness, review aggregation, FAQ sections on product pages
- Key schema: Product, AggregateRating, Offer, BreadcrumbList
Publishers
- Extra weight on: Article quality, author credentials, source citation practices
- Check for: Article schema, author pages, publication date freshness, original research
- Key schema: Article, NewsArticle, Person (author), ClaimReview
Agency/Services
- Extra weight on: Case studies (citability), expertise demonstration, thought leadership
- Check for: Portfolio schema, team credentials, industry-specific expertise signals
- Key schema: Organization, Service, Person (team), Review
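The "Key schema" lines above lend themselves to a lookup table that flags missing types per business type. The schema names below are copied from this section; the `missing_key_schema` helper and the business-type keys are illustrative.

```python
# Key schema types per business type, as listed in this section.
KEY_SCHEMA = {
    "SaaS": {"SoftwareApplication", "FAQPage", "HowTo"},
    "Local Business": {"LocalBusiness", "GeoCoordinates", "OpeningHoursSpecification"},
    "E-commerce": {"Product", "AggregateRating", "Offer", "BreadcrumbList"},
    "Publisher": {"Article", "NewsArticle", "Person", "ClaimReview"},
    "Agency/Services": {"Organization", "Service", "Person", "Review"},
}


def missing_key_schema(business_type: str, found_types) -> list:
    """Return the key schema types not found on the site, sorted by name."""
    expected = KEY_SCHEMA.get(business_type, set())
    return sorted(expected - set(found_types))
```

For a SaaS site where only `FAQPage` and `Organization` were detected, this reports `HowTo` and `SoftwareApplication` as missing.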