geo-review


GEO Review


Evaluate how well your application and content are optimized for AI-powered search and answer engines — ChatGPT, Perplexity, Google AI Overviews, Claude, and other generative AI systems that cite web sources. Traditional SEO gets you ranked in a link list; GEO gets you cited in AI-generated answers.

When to use


Use `/geo-review` when:
  • Your product is discovered through AI assistants (developer tools, SaaS, APIs)
  • You want to appear in Google AI Overviews
  • Users find your product by asking AI "what's the best X for Y?"
  • You publish documentation, guides, or educational content
  • Your competitors are showing up in AI answers and you're not
  • Building thought leadership content that AI should reference
  • Launching a new product where AI-driven discovery matters

Why GEO Matters Now


  • 40% of Gen Z uses TikTok and AI chatbots instead of Google for search (Adobe 2024)
  • Google AI Overviews now appear for ~30% of search queries, pushing traditional results below the fold
  • Perplexity processes 100M+ queries/month, citing web sources in every answer
  • ChatGPT with browsing and search is becoming a primary research tool
  • AI systems don't rank links — they select and cite sources based on different signals than traditional SEO
  • Being the source an AI quotes is the new "position #1"

Standards & Frameworks Referenced


  • GEO research (Georgia Tech / Princeton / IIT Delhi, 2024) — "GEO: Generative Engine Optimization"
  • Google E-E-A-T — Experience, Expertise, Authoritativeness, Trustworthiness
  • Schema.org — Structured data for entity understanding
  • llms.txt — Emerging standard for AI crawler instructions (similar to robots.txt for LLMs)
  • Retrieval-Augmented Generation (RAG) — How AI systems fetch and cite content

Phase Overview


Phase 1: EDUCATE   → How AI search works differently from traditional search
Phase 2: SCOPE     → Identify content types, target queries, AI visibility goals
Phase 3: ANALYZE   → Content analysis + browser-based AI search validation
Phase 4: REPORT    → Findings with citation gap analysis and confidence scores
Phase 5: REMEDIATE → Fix guidance + YAML regression tests


Phase 1: Educate


How AI search is different: Traditional search engines crawl, index, and rank pages by relevance signals (backlinks, keywords, authority). AI answer engines do something fundamentally different — they retrieve content, understand it semantically, and synthesize answers by selecting the most citation-worthy sources. Your content needs to be clear, specific, authoritative, and directly answerable to be selected.
Key insight: AI systems prefer content that makes specific, verifiable claims with supporting evidence. Vague marketing copy is ignored. Concrete statements with data, comparisons, and clear structure get cited.


Phase 2: Scope


Gather context


  1. Auto-detect from codebase/content:
    • Content pages (docs, blog, landing pages, about, pricing, FAQ)
    • Existing structured data (JSON-LD, Schema.org)
    • Content management approach (static, CMS, MDX, etc.)
    • llms.txt presence
    • Sitemap and content organization
    • Author/expertise signals
    • Publication dates and freshness signals
  2. Ask the user (one at a time):
    • Product type: What does your product/site do? (needed to understand AI query context)
    • Target URL: Where is the content published?
    • Target AI queries: What questions should AI answer with your content? (e.g., "best CI/CD tool for startups", "how to implement OAuth in Node.js")
    • Competitors: Who else shows up when AI answers these queries? (optional but valuable)
    • Content goals: Documentation? Thought leadership? Product discovery? All of the above?
  3. Map content landscape:
    • Key content pages and their purpose
    • Target queries each page should satisfy
    • Current AI citation status (test a few queries in ChatGPT/Perplexity)
    • Content gaps vs competitors


Phase 3: Analyze


Open a browser session with `new_session` using `record_evidence: true`. Run all applicable check categories.

Category A: Content Citation-Worthiness (CITE)


| Check ID | Check | Principle | Method |
|---|---|---|---|
| CITE-01 | Content contains specific, verifiable claims | GEO research | Scan pages for concrete statements with data/numbers |
| CITE-02 | Statistics and original data are present | GEO research | Check for unique numbers, benchmarks, research findings |
| CITE-03 | Content directly answers target queries | RAG retrieval | Match content against target queries — does it contain direct answers? |
| CITE-04 | Claims have supporting evidence or citations | E-E-A-T | Check for source references, links, data attribution |
| CITE-05 | Content is specific (not generic/vague) | GEO research | Analyze content for specificity vs marketing fluff |
| CITE-06 | Comparison content exists (vs alternatives) | AI preference | Check for "X vs Y" or comparison tables that AI can cite |
| CITE-07 | Content has clear, quotable summary sentences | Citation format | Check if key paragraphs start with citable claims |
| CITE-08 | Unique perspective or data (not regurgitated) | E-E-A-T | Assess originality — does this add something AI can't already synthesize? |
| CITE-09 | Content demonstrates first-hand experience | E-E-A-T (Experience) | Check for case studies, personal experience, real examples |
| CITE-10 | Technical accuracy and depth | E-E-A-T (Expertise) | Assess whether content goes beyond surface level |
Browser validation: Navigate to content pages. Extract text content. Analyze for claim density, statistics, quotable statements. Compare against target queries for direct answer matching.
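The scans behind CITE-01 and CITE-02 can be approximated with a small heuristic. A minimal sketch in plain JavaScript — the `claimDensity` function and its regex are our own illustration, not part of the command:

```javascript
// Heuristic claim-density score: what fraction of sentences carry
// concrete, citable data (percentages, dollar figures, counts)?
function claimDensity(text) {
  const sentences = text.split(/(?<=[.!?])\s+/).filter(s => s.trim().length > 0);
  if (sentences.length === 0) return 0;
  const citable = sentences.filter(s =>
    /\d+(\.\d+)?\s*%|\$[\d,.]+|\b\d{1,3}(,\d{3})+\b|\b\d+\s*(users|customers|companies|seconds|ms)\b/i.test(s)
  );
  return citable.length / sentences.length;
}

const vague = 'We are the fastest platform for modern teams. Loved by everyone.';
const specific = 'Deployments complete in 47 seconds on average. Used by 2,300 companies.';
console.log(claimDensity(vague));    // 0: no citable claims
console.log(claimDensity(specific)); // 1: every sentence carries data
```

A page scoring near zero is the kind of copy that AI answer engines skip.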

Category B: Content Structure for AI Retrieval (STRUCT)


| Check ID | Check | Principle | Method |
|---|---|---|---|
| STRUCT-01 | Clear heading hierarchy maps to questions | RAG chunking | Check if H2/H3 headings are question-shaped or topic-clear |
| STRUCT-02 | FAQ sections with direct Q&A format | AI preference | Check for FAQ sections, question-answer pairs |
| STRUCT-03 | Definition/explanation paragraphs lead with the answer | Retrieval | Check if paragraphs front-load the key claim (inverted pyramid) |
| STRUCT-04 | Tables and structured comparisons present | AI preference | Check for HTML tables with clear headers |
| STRUCT-05 | Content is chunked into digestible sections (300-500 words) | RAG chunking | Measure section lengths between headings |
| STRUCT-06 | Lists used for multi-point information | AI preference | Check for ordered/unordered lists for multi-step or multi-item content |
| STRUCT-07 | Code examples are complete and runnable (for technical content) | Developer experience | Check code blocks for completeness and language tags |
| STRUCT-08 | TL;DR or summary at top of long content | Retrieval | Check for executive summary or key takeaways section |
Browser validation: Extract heading structure, count FAQ patterns, measure section lengths, check for tables and lists via DOM inspection.
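STRUCT-01 and STRUCT-05 reduce to one pass over the extracted headings and section word counts. A minimal sketch, assuming headings and counts were already pulled from the DOM in document order — `auditSections` and its question-word list are our own illustration:

```javascript
// Flag RAG-friendly structure: question-shaped headings and
// section lengths inside the 300-500 word chunk target.
function auditSections(sections) {
  // sections: [{ heading, wordCount }]
  return sections.map(({ heading, wordCount }) => ({
    heading,
    questionShaped:
      /^(how|what|why|when|which|where|who|can|should|is|does)\b/i.test(heading) ||
      heading.trim().endsWith('?'),
    sizeOk: wordCount >= 300 && wordCount <= 500,
  }));
}

const report = auditSections([
  { heading: 'How does GEO differ from SEO?', wordCount: 420 },
  { heading: 'Miscellaneous', wordCount: 1800 },
]);
console.log(report);
// First section passes both checks; the second is neither
// question-shaped nor within the 300-500 word chunk range.
```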

Category C: Authority & Trust Signals (AUTH)


| Check ID | Check | Principle | Method |
|---|---|---|---|
| AUTH-01 | Author information present (name, bio, credentials) | E-E-A-T | Check for author bylines, about sections |
| AUTH-02 | Organization/brand identity clear | Entity recognition | Check for About page, consistent branding |
| AUTH-03 | Publication and update dates visible | Freshness | Check for date metadata on content pages |
| AUTH-04 | Sources and references cited | E-E-A-T | Check for outbound links to authoritative sources |
| AUTH-05 | Testimonials/social proof present | Trust | Check for customer quotes, logos, case studies |
| AUTH-06 | Professional contact information available | Trust | Check for contact page, physical address, support channels |
| AUTH-07 | Content recency (updated within last 12 months) | Freshness | Check publish/update dates |
| AUTH-08 | Domain authority indicators (established site) | E-E-A-T | Check site age, about page depth, team page |
Browser validation: Navigate to content pages, about page, author pages. Extract dates, author info, citation links.
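When Article JSON-LD is present, AUTH-01, AUTH-03, and AUTH-07 can be read straight from it. A sketch assuming the script tag's text was already extracted — `authSignals` is our own illustration; the 12-month window matches AUTH-07:

```javascript
// Pull author and freshness signals out of an Article JSON-LD blob.
function authSignals(jsonLd, now = new Date()) {
  const data = JSON.parse(jsonLd);
  const modified = new Date(data.dateModified || data.datePublished);
  const ageMonths = (now - modified) / (1000 * 60 * 60 * 24 * 30);
  return {
    hasAuthor: Boolean(data.author && data.author.name),
    hasDate: !Number.isNaN(modified.getTime()),
    fresh: ageMonths <= 12, // AUTH-07: updated within last 12 months
  };
}

const article = JSON.stringify({
  '@type': 'Article',
  author: { '@type': 'Person', name: 'Jane Doe' },
  datePublished: '2025-06-01',
});
console.log(authSignals(article, new Date('2026-02-01')));
// → { hasAuthor: true, hasDate: true, fresh: true }
```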

Category D: Technical AI Discoverability (TECH)


| Check ID | Check | Principle | Method |
|---|---|---|---|
| TECH-01 | llms.txt present at site root | AI crawler standard | Fetch /llms.txt, check format and content |
| TECH-02 | llms-full.txt with detailed content (if applicable) | AI crawler standard | Fetch /llms-full.txt |
| TECH-03 | JSON-LD structured data with rich entity info | Schema.org | Check for Organization, Product, Article, FAQ schema |
| TECH-04 | Content accessible without JavaScript | RAG crawling | Disable JS, check if content renders |
| TECH-05 | Clean, semantic HTML (not framework soup) | Crawlability | Check for meaningful tags vs div-heavy DOM |
| TECH-06 | robots.txt allows AI crawlers | Discoverability | Check for GPTBot, ClaudeBot, PerplexityBot, Bingbot rules |
| TECH-07 | Sitemap includes content pages with lastmod | Discoverability | Check sitemap for content pages and dates |
| TECH-08 | Open Graph tags help AI understand content | Social + AI | Check OG tags for accurate content description |
| TECH-09 | API documentation is machine-readable (if applicable) | Developer GEO | Check for OpenAPI spec, API reference format |
| TECH-10 | Content is not behind authentication walls | RAG access | Verify key content is publicly accessible |
Browser validation: Fetch llms.txt, check robots.txt for AI bot rules, verify SSR content, inspect structured data.
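For TECH-03, a minimal JSON-LD sketch combining Organization and FAQPage schema in one `@graph` — the name, URL, and wording are placeholders, not values the command prescribes:

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "name": "Your Product",
      "url": "https://example.com",
      "description": "One-sentence description of what the product does."
    },
    {
      "@type": "FAQPage",
      "mainEntity": [
        {
          "@type": "Question",
          "name": "What is Your Product?",
          "acceptedAnswer": {
            "@type": "Answer",
            "text": "Your Product is a [category] that [specific, verifiable claim]."
          }
        }
      ]
    }
  ]
}
```

Embed it in a `<script type="application/ld+json">` tag on the relevant pages.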
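TECH-06 checks that robots.txt does not block AI crawlers. A sketch that explicitly allows the user-agent tokens the check looks for — whether to allow them is a policy decision, not a requirement:

```
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Bingbot
Allow: /
```

With no matching `Disallow: /` rules, these crawlers can retrieve and cite your content.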

Category E: Entity & Brand Clarity (ENTITY)


| Check ID | Check | Principle | Method |
|---|---|---|---|
| ENTITY-01 | Product/brand name is consistently used | Entity recognition | Check name consistency across pages |
| ENTITY-02 | Clear product category declaration | AI classification | Check if content states "X is a [category]" explicitly |
| ENTITY-03 | Key features/differentiators stated clearly | AI comparison | Check for feature lists, unique value propositions |
| ENTITY-04 | Use case descriptions are specific | AI recommendation | Check for "best for [specific use case]" patterns |
| ENTITY-05 | Pricing/tier information is structured | AI recommendation | Check pricing page for clear, structured plans |
| ENTITY-06 | Integration/compatibility information present | AI recommendation | Check for "works with X" / integration pages |
| ENTITY-07 | Competitor differentiation is factual | AI comparison | Check comparison content for factual (not just marketing) claims |
| ENTITY-08 | Industry/vertical targeting is explicit | AI classification | Check if content targets specific industries/roles |
Browser validation: Navigate key pages and extract product positioning, feature lists, use cases, pricing structure. Check for entity-clear statements.
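ENTITY-02's "X is a [category]" test is amenable to a simple pattern scan. A rough heuristic of our own — it assumes the product name contains no regex metacharacters:

```javascript
// Does the copy explicitly declare a product category,
// e.g. "Acme is a continuous-deployment platform"?
function declaresCategory(text, productName) {
  const pattern = new RegExp(`\\b${productName}\\s+is\\s+(a|an|the)\\s+[\\w-]+`, 'i');
  return pattern.test(text);
}

console.log(declaresCategory('Acme is a continuous-deployment platform for startups.', 'Acme')); // true
console.log(declaresCategory('Acme. Ship faster. Dream bigger.', 'Acme')); // false
```

The second example is the pattern to avoid: taglines that never tell an AI classifier what the product actually is.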

Category F: AI Citation Testing (TEST)


This category is unique to GEO — it tests actual AI visibility.
| Check ID | Check | Method |
|---|---|---|
| TEST-01 | Test target queries in Perplexity | Navigate to perplexity.ai, search target queries, check if your site is cited |
| TEST-02 | Test target queries in ChatGPT (if browsing available) | Search via ChatGPT, check citations |
| TEST-03 | Test target queries in Google (check AI Overview) | Google search, check if AI Overview cites your content |
| TEST-04 | Compare citation frequency vs competitors | Count citations for you vs top competitors across queries |
| TEST-05 | Analyze what content IS being cited (from competitors) | Study cited content format, structure, claims |

Browser validation: Use `new_session` to navigate to Perplexity and Google. Search target queries. Screenshot results. Check for citations to the user's domain. This provides real-world evidence of current AI visibility.
Important: TEST category results are the ground truth — they show whether your content is actually being cited, regardless of what the other categories suggest.


Phase 4: Report


Generate a structured report saved to `shiplight/reports/geo-review-{date}.md`:

GEO Review Report


Date: {date}
URL: {url}
Product type: {description}
Target AI queries tested: {list}

Overall GEO Score: {X}/10 | Confidence: {X}%


Score Breakdown


| Category | Score | Findings |
|---|---|---|
| Citation-Worthiness (CITE) | 5/10 | 2 high, 2 medium |
| Content Structure (STRUCT) | 6/10 | 1 high, 2 medium |
| Authority Signals (AUTH) | 7/10 | 1 medium |
| Technical Discoverability (TECH) | 4/10 | 1 critical, 2 high |
| Entity Clarity (ENTITY) | 5/10 | 2 high |
| AI Citation Testing (TEST) | 3/10 | Not cited in 4/5 target queries |

AI Citation Status


| Target Query | Perplexity | Google AI Overview | Competitor Cited? |
|---|---|---|---|
| "best X for Y" | Not cited | Not in overview | CompetitorA: ✅ |
| "how to do Z" | Cited (#3 source) | Cited | CompetitorB: ✅ |
| ... | | | |

Citation Gap Analysis


What competitors' cited content has that yours doesn't:
  • Specific performance benchmarks (CompetitorA cites "40% faster than...")
  • Comparison tables (CompetitorB has detailed feature matrices)
  • Direct answer paragraphs (CompetitorA leads sections with the conclusion)

Findings


(structured findings with evidence and priority)

Confidence Scoring


  • 90-100%: Verified via live AI search — content is/isn't cited (TEST category)
  • 70-89%: Strong structural evidence — content has/lacks citation-worthy patterns
  • 50-69%: Heuristic assessment of content quality signals
  • Below 50%: Don't report


Phase 5: Remediate


1. Fix guidance (example)



CITE-01: Landing page lacks specific, verifiable claims


Impact: AI systems skip vague marketing copy — your landing page is invisible to AI answers
Current: "We're the fastest platform for modern teams"
Fix: Add specific, citable claims:
  • "Deployments complete in 47 seconds on average (based on 10,000 deployments in Q4 2025)"
  • "Used by 2,300 companies including [notable names]"
  • "Reduces CI/CD pipeline time by 62% compared to Jenkins (internal benchmark, Jan 2026)"
Principle: AI cites facts, not adjectives. Every claim should be verifiable.


TECH-01: No llms.txt present


Impact: AI crawlers have no guidance on how to understand your site
Fix: Create /llms.txt at site root:

```markdown
# [Your Product Name]

One-sentence description of what your product does.

## Docs

## Key Pages

- [Pricing](/pricing): Plans and pricing
- [Changelog](/changelog): Recent updates and releases
- [About](/about): Company and team info
```

Also create /llms-full.txt with expanded content for deeper AI understanding.

2. YAML regression tests


```yaml
- name: tech-01-llms-txt-present
  description: Verify llms.txt exists and is properly formatted
  severity: high
  standard: llms-txt-standard
  steps:
    - URL: /llms.txt
    - VERIFY: The page loads successfully and contains structured information about the site
    - CODE: |
        const content = await page.textContent('body');
        if (!content || content.trim().length < 50) {
          throw new Error('llms.txt is missing or too short');
        }
        if (!content.includes('#')) {
          throw new Error('llms.txt should use markdown heading structure');
        }
        console.log(`llms.txt found (${content.length} chars)`);

- name: tech-06-ai-crawlers-allowed
  description: Verify robots.txt allows AI search crawlers
  severity: high
  standard: AI-Discoverability
  steps:
    - URL: /robots.txt
    - CODE: |
        const content = await page.textContent('body');
        const blockedBots = ['GPTBot', 'ClaudeBot', 'PerplexityBot', 'Google-Extended'];
        const blocked = blockedBots.filter(bot => {
          const pattern = new RegExp(`User-agent:\\s*${bot}[\\s\\S]*?Disallow:\\s*/`, 'i');
          return pattern.test(content);
        });
        if (blocked.length > 0) {
          throw new Error(`AI crawlers blocked in robots.txt: ${blocked.join(', ')}`);
        }
        console.log('All major AI crawlers are allowed');
    - VERIFY: robots.txt does not block major AI search engine crawlers

- name: cite-01-specific-claims-present
  description: Verify key pages contain specific, citable claims with data
  severity: high
  standard: GEO-Citation-Worthiness
  steps:
    - URL: /
    - CODE: |
        const text = await page.textContent('main') || await page.textContent('body');
        // Check for specific numbers/statistics
        const hasNumbers = /\d+[%xX]|\$[\d,.]+|\d{1,3}(,\d{3})+|\d+\s*(users|customers|companies|teams|downloads)/i.test(text);
        if (!hasNumbers) {
          throw new Error('Landing page lacks specific statistics or data points that AI can cite');
        }
        console.log('Found specific, citable claims with data');
    - VERIFY: Landing page contains specific statistics, benchmarks, or verifiable data points
```

Save all YAML tests to `shiplight/tests/geo-review.test.yaml`.


Depth Levels


  • `--quick`: llms.txt check + robots.txt AI crawler check + landing page claim analysis. ~2 minutes.
  • default: All content categories + 3 target query tests in Perplexity. ~10-15 minutes.
  • `--thorough`: All categories + full AI citation testing across multiple engines + competitor citation analysis + content gap recommendations. ~25-40 minutes.

Tips


  • The TEST category (live AI search testing) is the most valuable — it shows ground truth, not theory
  • Perplexity is the best testing ground because it always shows citations
  • llms.txt is emerging but increasingly adopted — it's low effort, high signal
  • AI systems update their knowledge at different speeds — changes may take weeks to reflect in citations
  • Focus on content that answers specific questions, not brand awareness content
  • The #1 GEO principle: AI cites facts, not adjectives — replace every vague claim with a specific one
  • Close the session with `close_session` and use `generate_html_report` for evidence