Diagnose SEO
Structured diagnostic framework for crawl issues, canonicalization errors,
indexation problems, and rendering failures.
Diagnostic Approach
Technical SEO problems fall into four categories. Diagnose in this order — each
layer depends on the previous one working correctly:
- Crawlability — Can search engines find and access the pages?
- Indexability — Are the pages allowed to be indexed?
- Renderability — Can search engines see the full content?
- Signals — Are the right signals (titles, structured data, links) in place?
Layer 1: Crawlability
Check these in order:
robots.txt
- Fetch `[domain]/robots.txt` and review the rules
- Look for overly broad `Disallow` rules blocking important paths
- Verify the `Sitemap:` directive points to the correct sitemap URL
- Check for different rules per user-agent (Googlebot vs others)
Common mistakes:
- `Disallow: /` blocking the entire site (often left from staging)
- Blocking CSS/JS files that Googlebot needs for rendering
- Blocking API or AJAX endpoints that load dynamic content
- Staging robots.txt accidentally deployed to production
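The robots.txt checks above can be scripted with the standard library's `urllib.robotparser`. A minimal sketch, assuming a hypothetical set of important paths; in a live audit you would fetch `[domain]/robots.txt` instead of the inline rules shown here:

```python
# Sketch: check whether robots.txt rules block paths you care about.
# The rules and paths below are hypothetical examples, not a real site.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/
Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Any important path that Googlebot cannot fetch is a crawlability bug.
for path in ["/", "/blog/post-1", "/products/widget"]:
    allowed = rp.can_fetch("Googlebot", f"https://example.com{path}")
    print(f"{path}: {'OK' if allowed else 'BLOCKED'}")
```

Here `/blog/post-1` comes back BLOCKED, the classic "staging rule left in production" symptom.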
XML Sitemap
- Fetch the sitemap URL(s) and check:
- Does it return 200? Is it valid XML?
- Does it list all important pages?
- Does it exclude pages that shouldn't be indexed (404s, redirects, noindex pages)?
- Are `<lastmod>` dates accurate and recent?
- For large sites: is there a sitemap index?
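Several of these sitemap checks can run offline once the XML is in hand. A sketch using the standard library, with the sitemap inlined for illustration; a real audit would fetch the sitemap URL first:

```python
# Sketch: validate a sitemap's structure with the standard library.
# The inline XML stands in for a fetched [domain]/sitemap.xml.
import xml.etree.ElementTree as ET

SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(SITEMAP)  # raises ParseError if the XML is invalid

urls = [u.findtext("sm:loc", namespaces=NS) for u in root.findall("sm:url", NS)]
missing_lastmod = [
    u.findtext("sm:loc", namespaces=NS)
    for u in root.findall("sm:url", NS)
    if u.find("sm:lastmod", NS) is None
]
print(urls)             # every <loc> listed in the sitemap
print(missing_lastmod)  # entries with no <lastmod> date to verify
```

Cross-reference `urls` against your crawl to spot important pages the sitemap omits, and pages it lists that redirect or 404.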
Site Architecture
- Pages should be reachable within 3 clicks from the homepage
- Check for orphan pages (no internal links pointing to them)
- Check for redirect chains (page A → B → C — should be A → C)
- Check for redirect loops
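Chain and loop detection is just a walk over hops. A minimal sketch where `redirect_map` is a hypothetical stand-in for live responses; for a real audit, replace the lookup with HEAD requests that do not auto-follow redirects:

```python
# Sketch: walk redirects manually to expose chains (A -> B -> C) and loops.
# redirect_map maps a URL to its redirect target; absence means "returns 200".
def redirect_chain(url, redirect_map, limit=10):
    """Return the full hop list starting at url; a repeated entry means a loop."""
    chain = [url]
    while url in redirect_map and len(chain) <= limit:
        url = redirect_map[url]
        if url in chain:       # loop detected: stop and surface it
            chain.append(url)
            break
        chain.append(url)
    return chain

hops = {"/old": "/interim", "/interim": "/new"}
print(redirect_chain("/old", hops))  # a 3-hop chain: collapse to /old -> /new
```

Any chain longer than two entries wastes crawl budget and dilutes link equity; repoint the first URL straight at the final destination.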
Server Response
- Do all important pages return HTTP 200?
- Check for unexpected 301/302 redirects
- Check for soft 404s (page returns 200 but shows "not found" content)
- Verify HTTPS is enforced (HTTP should 301 to HTTPS)
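Soft 404s are the check that needs more than a status code. A heuristic sketch; the marker phrases are an assumption to adapt per site, not an exhaustive detector:

```python
# Sketch: flag soft 404s -- pages that return 200 but serve error content.
# SOFT_404_MARKERS is a heuristic assumption; tune it to your site's templates.
SOFT_404_MARKERS = ("page not found", "404", "no longer available")

def is_soft_404(status_code, body):
    if status_code != 200:
        return False  # a real 404/410 is correct behavior, not a soft 404
    text = body.lower()
    return any(marker in text for marker in SOFT_404_MARKERS)

print(is_soft_404(200, "<h1>Page Not Found</h1>"))  # reports success, shows an error
print(is_soft_404(404, "<h1>Page Not Found</h1>"))  # correctly returns 404
```

Pair this with the status-code sweep: every important URL should be a clean 200, and every error should be an honest 404/410.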
Layer 2: Indexability
Meta Robots / X-Robots-Tag
- Check for `<meta name="robots" content="noindex">` on pages that should be indexed
- Check HTTP headers for `X-Robots-Tag: noindex`
- Common cause: CMS accidentally applying noindex to pagination, tag pages, or new pages
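Because a noindex can hide in either the HTML or the response headers, check both in one pass. A stdlib-only sketch:

```python
# Sketch: detect noindex in the meta robots tag and the X-Robots-Tag header.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.noindex = False
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            if "noindex" in a.get("content", "").lower():
                self.noindex = True

def page_is_noindexed(html, headers):
    parser = RobotsMetaParser()
    parser.feed(html)
    header = headers.get("X-Robots-Tag", "").lower()
    return parser.noindex or "noindex" in header

print(page_is_noindexed('<meta name="robots" content="noindex">', {}))  # meta tag
print(page_is_noindexed("<p>ok</p>", {"X-Robots-Tag": "noindex"}))      # header
```

Run it over the pages you expect to rank; any `True` on an important page is a critical finding.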
Canonical Tags
- Every page should have a `<link rel="canonical">` pointing to itself (self-referencing canonical)
- Check for canonical tags pointing to wrong pages (common in paginated content, filtered URLs)
- Check for conflicting signals: canonical says page A, but noindex is set, or the page redirects
Canonical diagnosis checklist:
- Does the canonical URL match the actual URL?
- Is the canonical URL accessible (returns 200)?
- Does the canonical URL have the same content?
- Is there only one canonical tag on the page?
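The static items on this checklist can be automated. A rough sketch using a regex that assumes `rel` appears before `href` in the tag, which is fragile HTML parsing; the 200-accessibility check needs a live request and is omitted:

```python
# Sketch: run the canonical checks that work on static HTML.
# The regex assumes attribute order rel-then-href -- a simplifying assumption.
import re

def canonical_report(html, page_url):
    canonicals = re.findall(
        r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', html)
    return {
        "count_ok": len(canonicals) == 1,              # exactly one canonical tag
        "self_referencing": canonicals == [page_url],  # points to itself
        "canonicals": canonicals,
    }

html = '<link rel="canonical" href="https://example.com/page">'
print(canonical_report(html, "https://example.com/page"))
```

A production audit would swap the regex for a real HTML parser, then fetch each canonical target to confirm it returns 200 and carries the same content.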
Duplicate Content
- Check for the same content accessible at multiple URLs:
- With and without trailing slash (`/page` vs `/page/`)
- With and without `www` (`example.com` vs `www.example.com`)
- HTTP vs HTTPS
- URL parameters creating duplicate pages (`?page=1`, `?sort=price`)
- Each duplicate set needs one canonical URL; all others should redirect or use canonical tags
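One way to find duplicate sets is to normalize every crawled URL to a single canonical form and group by the result. A sketch whose normalization rules (force https, strip `www`, drop the trailing slash and query string) are policy assumptions to match to your own site:

```python
# Sketch: collapse URL variants to one canonical form so duplicates group.
# The rules here (https, no www, no trailing slash, no query) are assumptions.
from urllib.parse import urlsplit

def normalize(url):
    parts = urlsplit(url)
    host = parts.netloc.removeprefix("www.")
    path = parts.path.rstrip("/") or "/"
    return f"https://{host}{path}"   # query string deliberately dropped

variants = [
    "http://example.com/page/",
    "https://www.example.com/page",
    "https://example.com/page?page=1",
]
print({normalize(u) for u in variants})  # all three collapse to one URL
```

Any group with more than one live member is a duplicate set: pick one member as canonical and redirect or canonical-tag the rest.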
Layer 3: Renderability
JavaScript Rendering
- Does the page content appear in the raw HTML source? Or is it loaded via JavaScript?
- If JS-rendered: does Googlebot see the full content? (Use URL Inspection tool in Search Console)
- Check for content hidden behind click events, tabs, or accordions
- Check for lazy-loaded content that only appears on scroll
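A quick first test for JS-rendering dependence: fetch the raw HTML without a browser and check whether the content you care about is in it at all. A minimal sketch; `raw_html` and the key phrases are hypothetical stand-ins for a real fetch:

```python
# Sketch: if the main content only exists after JavaScript runs, it will be
# absent from the raw server response. raw_html stands in for a plain fetch.
def content_in_raw_html(raw_html, key_phrases):
    """Return the phrases missing from the unrendered HTML source."""
    lower = raw_html.lower()
    return [p for p in key_phrases if p.lower() not in lower]

raw_html = '<div id="app"></div><script src="/bundle.js"></script>'
missing = content_in_raw_html(raw_html, ["Acme Widget", "$19.99"])
print(missing)  # both phrases missing -> the page depends on JS rendering
```

Missing phrases do not prove Googlebot cannot see the content (it does render JS), but they tell you rendering is load-bearing, so confirm with the URL Inspection tool.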
Core Content Visibility
- Is the main content in the initial HTML? Or loaded async after page load?
- Are important elements (titles, headings, product details) in the DOM on first render?
- Check for content that requires login or cookies to view
Layer 4: Signals
Title Tags
- Every page has a unique `<title>`
- Title includes the primary keyword
- Under 60 characters (to avoid truncation in SERPs)
- Descriptive and click-worthy
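The uniqueness and length checks are mechanical once you have a URL-to-title map from a crawl. A sketch; note the 60-character rule of thumb from the list above is an approximation, since Google truncates by pixel width, not character count:

```python
# Sketch: flag title tags that are duplicated or likely to truncate in SERPs.
# titles is a hypothetical crawl result mapping URL -> <title> text.
def title_issues(titles):
    seen, issues = set(), []
    for url, title in titles.items():
        if len(title) > 60:
            issues.append((url, "over 60 chars"))
        if title in seen:
            issues.append((url, "duplicate title"))
        seen.add(title)
    return issues

titles = {"/a": "Short title", "/b": "Short title"}
print(title_issues(titles))  # /b reuses /a's title
```

The same pattern works for meta descriptions with the 150-160 character budget swapped in.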
Meta Descriptions
- Every important page has a meta description
- 150-160 characters
- Includes target keyword and a value proposition
- Unique per page
Heading Structure
- One H1 per page containing the primary keyword
- Logical heading hierarchy (H1 → H2 → H3, no skips)
- Headings describe section content (not decorative)
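Both structural rules (one H1, no skipped levels) can be verified with the standard library's HTML parser. A minimal sketch:

```python
# Sketch: verify one H1 and no skipped heading levels (e.g. H2 -> H4).
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.levels = []
    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self.levels.append(int(tag[1]))  # record h1-h6 in document order

def heading_issues(html):
    audit = HeadingAudit()
    audit.feed(html)
    issues = []
    if audit.levels.count(1) != 1:
        issues.append("expected exactly one H1")
    for prev, cur in zip(audit.levels, audit.levels[1:]):
        if cur > prev + 1:
            issues.append(f"skip from H{prev} to H{cur}")
    return issues

print(heading_issues("<h1>T</h1><h2>A</h2><h4>B</h4>"))  # flags the H2 -> H4 skip
```

Whether a heading is descriptive rather than decorative still needs a human read; this only catches the structural failures.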
Structured Data
- Check for JSON-LD structured data appropriate to the page type
- Validate with Google's Rich Results Test
- Common types: Article, Product, FAQ, HowTo, BreadcrumbList, Organization
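A first-pass syntax check can run locally before you reach for the Rich Results Test. A sketch that assumes single-object JSON-LD (not `@graph` arrays) and uses a deliberately simple regex to locate the script blocks:

```python
# Sketch: extract JSON-LD blocks and confirm they parse and declare an @type.
# This checks syntax only -- rich-result eligibility needs Google's tooling.
import json, re

def jsonld_types(html):
    blocks = re.findall(
        r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
        html, re.DOTALL)
    types = []
    for block in blocks:
        data = json.loads(block)  # raises ValueError on malformed JSON-LD
        types.append(data.get("@type"))
    return types

html = '''<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Hello"}
</script>'''
print(jsonld_types(html))
```

Compare the returned types against what the page type should carry (Product pages with no Product markup are a common gap).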
Hreflang (multilingual sites)
- Check for correct `hreflang` tags linking language variants
- Verify reciprocal tags (page A points to B, B points back to A)
- Check for an `x-default` tag
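The reciprocity rule is the one hreflang check worth scripting, since a single missing return tag invalidates the pair. A sketch where `pages` is a hypothetical crawl result mapping each URL to its declared alternates:

```python
# Sketch: verify hreflang reciprocity from a crawl of declared alternates.
# pages maps URL -> {hreflang value: alternate URL}; structure is an assumption.
def missing_return_tags(pages):
    """Pairs (a, b) where a declares b as an alternate but b never points back."""
    problems = []
    for url, alts in pages.items():
        for alt_url in alts.values():
            if alt_url != url and url not in pages.get(alt_url, {}).values():
                problems.append((url, alt_url))
    return problems

pages = {
    "/en/": {"en": "/en/", "zh": "/zh/"},
    "/zh/": {"zh": "/zh/"},  # missing the return tag pointing back to /en/
}
print(missing_return_tags(pages))  # the one-way /en/ -> /zh/ link is flagged
```

Hreflang annotations in sitemaps and HTTP headers count too, so feed all three sources into the map before checking.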
Output Format
Technical SEO Diagnosis: [domain]
Summary
- Critical issues: [count]
- Warnings: [count]
- Passed checks: [count]
Findings by Layer
For each issue found:
| Layer | Issue | Severity | Affected Pages | Fix |
|---|---|---|---|---|
| Crawlability | robots.txt blocks /blog/ | Critical | All blog pages | Remove the blocking rule from robots.txt |
| Indexability | Missing canonical tags | Warning | 15 pages | Add self-referencing canonicals |
| ... | ... | ... | ... | ... |
Priority Fix List
Ordered by impact:
- [Critical fix] — affects [n] pages, blocks [crawling/indexing/ranking]
- [Warning fix] — affects [n] pages, reduces [signal quality]
- ...
Pro Tip: Run the free SEO Audit for a quick technical check, the Broken Link Checker to find dead links, and the Robots.txt Generator to fix crawl directives. SEOJuice MCP users can run `/seojuice:site-health [domain]` for a full technical report and `/seojuice:page-audit [url]` to drill into specific pages.