Diagnose SEO
Structured diagnostic framework for crawl issues, canonicalization errors,
indexation problems, and rendering failures.
Diagnostic Approach
Technical SEO problems fall into four categories. Diagnose in this order — each
layer depends on the previous one working correctly:
- Crawlability — Can search engines find and access the pages?
- Indexability — Are the pages allowed to be indexed?
- Renderability — Can search engines see the full content?
- Signals — Are the right signals (titles, structured data, links) in place?
Layer 1: Crawlability
Check these in order:
robots.txt
- Fetch `[domain]/robots.txt` and review the rules
- Look for overly broad `Disallow` rules blocking important paths
- Verify the `Sitemap:` directive points to the correct sitemap URL
- Check for different rules per user-agent (Googlebot vs others)
Common mistakes:
- `Disallow: /` blocking the entire site (often left from staging)
- Blocking CSS/JS files that Googlebot needs for rendering
- Blocking API or AJAX endpoints that load dynamic content
- Staging robots.txt accidentally deployed to production
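The robots.txt checks above can be scripted with the standard library's `urllib.robotparser`. A minimal sketch, assuming a hypothetical set of important paths; in a live audit you would fetch `[domain]/robots.txt` instead of the inline rules shown here:

```python
# Sketch: check whether robots.txt rules block paths you care about.
# The rules and paths below are hypothetical examples, not a real site.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /blog/
Sitemap: https://example.com/sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

# Any important path that Googlebot cannot fetch is a crawlability bug.
for path in ["/", "/blog/post-1", "/products/widget"]:
    allowed = rp.can_fetch("Googlebot", f"https://example.com{path}")
    print(f"{path}: {'OK' if allowed else 'BLOCKED'}")
```

Here `/blog/post-1` comes back BLOCKED, the classic "staging rule left in production" symptom.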
XML Sitemap
- Fetch the sitemap URL(s) and check:
- Does it return 200? Is it valid XML?
- Does it list all important pages?
- Does it exclude pages that shouldn't be indexed (404s, redirects, noindex pages)?
- Are `<lastmod>` dates accurate and recent?
- For large sites: is there a sitemap index?
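Several of these sitemap checks can run offline once the XML is in hand. A sketch using the standard library, with the sitemap inlined for illustration; a real audit would fetch the sitemap URL first:

```python
# Sketch: validate a sitemap's structure with the standard library.
# The inline XML stands in for a fetched [domain]/sitemap.xml.
import xml.etree.ElementTree as ET

SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-01</lastmod></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
root = ET.fromstring(SITEMAP)  # raises ParseError if the XML is invalid

urls = [u.findtext("sm:loc", namespaces=NS) for u in root.findall("sm:url", NS)]
missing_lastmod = [
    u.findtext("sm:loc", namespaces=NS)
    for u in root.findall("sm:url", NS)
    if u.find("sm:lastmod", NS) is None
]
print(urls)             # every <loc> listed in the sitemap
print(missing_lastmod)  # entries with no <lastmod> date to verify
```

Cross-reference `urls` against your crawl to spot important pages the sitemap omits, and pages it lists that redirect or 404.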
Site Architecture
- Pages should be reachable within 3 clicks from the homepage
- Check for orphan pages (no internal links pointing to them)
- Check for redirect chains (page A → B → C — should be A → C)
- Check for redirect loops
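Chain and loop detection is just a walk over hops. A minimal sketch where `redirect_map` is a hypothetical stand-in for live responses; for a real audit, replace the lookup with HEAD requests that do not auto-follow redirects:

```python
# Sketch: walk redirects manually to expose chains (A -> B -> C) and loops.
# redirect_map maps a URL to its redirect target; absence means "returns 200".
def redirect_chain(url, redirect_map, limit=10):
    """Return the full hop list starting at url; a repeated entry means a loop."""
    chain = [url]
    while url in redirect_map and len(chain) <= limit:
        url = redirect_map[url]
        if url in chain:       # loop detected: stop and surface it
            chain.append(url)
            break
        chain.append(url)
    return chain

hops = {"/old": "/interim", "/interim": "/new"}
print(redirect_chain("/old", hops))  # a 3-hop chain: collapse to /old -> /new
```

Any chain longer than two entries wastes crawl budget and dilutes link equity; repoint the first URL straight at the final destination.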
Server Response
- Do all important pages return HTTP 200?
- Check for unexpected 301/302 redirects
- Check for soft 404s (page returns 200 but shows "not found" content)
- Verify HTTPS is enforced (HTTP should 301 to HTTPS)
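Soft 404s are the check that needs more than a status code. A heuristic sketch; the marker phrases are an assumption to adapt per site, not an exhaustive detector:

```python
# Sketch: flag soft 404s -- pages that return 200 but serve error content.
# SOFT_404_MARKERS is a heuristic assumption; tune it to your site's templates.
SOFT_404_MARKERS = ("page not found", "404", "no longer available")

def is_soft_404(status_code, body):
    if status_code != 200:
        return False  # a real 404/410 is correct behavior, not a soft 404
    text = body.lower()
    return any(marker in text for marker in SOFT_404_MARKERS)

print(is_soft_404(200, "<h1>Page Not Found</h1>"))  # reports success, shows an error
print(is_soft_404(404, "<h1>Page Not Found</h1>"))  # correctly returns 404
```

Pair this with the status-code sweep: every important URL should be a clean 200, and every error should be an honest 404/410.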
Layer 2: Indexability
Meta Robots / X-Robots-Tag
- Check for `<meta name="robots" content="noindex">` on pages that should be indexed
- Check HTTP headers for `X-Robots-Tag: noindex`
- Common cause: CMS accidentally applying noindex to pagination, tag pages, or new pages
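Because a noindex can hide in either the HTML or the response headers, check both in one pass. A stdlib-only sketch:

```python
# Sketch: detect noindex in the meta robots tag and the X-Robots-Tag header.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    def __init__(self):
        super().__init__()
        self.noindex = False
    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() == "robots":
            if "noindex" in a.get("content", "").lower():
                self.noindex = True

def page_is_noindexed(html, headers):
    parser = RobotsMetaParser()
    parser.feed(html)
    header = headers.get("X-Robots-Tag", "").lower()
    return parser.noindex or "noindex" in header

print(page_is_noindexed('<meta name="robots" content="noindex">', {}))  # meta tag
print(page_is_noindexed("<p>ok</p>", {"X-Robots-Tag": "noindex"}))      # header
```

Run it over the pages you expect to rank; any `True` on an important page is a critical finding.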
Canonical Tags
- Every page should have a `<link rel="canonical">` pointing to itself (self-referencing canonical)
- Check for canonical tags pointing to wrong pages (common in paginated content, filtered URLs)
- Check for conflicting signals: canonical says page A, but noindex is set, or the page redirects
Canonical diagnosis checklist:
- Does the canonical URL match the actual URL?
- Is the canonical URL accessible (returns 200)?
- Does the canonical URL have the same content?
- Is there only one canonical tag on the page?
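The static items on this checklist can be automated. A rough sketch using a regex that assumes `rel` appears before `href` in the tag, which is fragile HTML parsing; the 200-accessibility check needs a live request and is omitted:

```python
# Sketch: run the canonical checks that work on static HTML.
# The regex assumes attribute order rel-then-href -- a simplifying assumption.
import re

def canonical_report(html, page_url):
    canonicals = re.findall(
        r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)["\']', html)
    return {
        "count_ok": len(canonicals) == 1,              # exactly one canonical tag
        "self_referencing": canonicals == [page_url],  # points to itself
        "canonicals": canonicals,
    }

html = '<link rel="canonical" href="https://example.com/page">'
print(canonical_report(html, "https://example.com/page"))
```

A production audit would swap the regex for a real HTML parser, then fetch each canonical target to confirm it returns 200 and carries the same content.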
Duplicate Content
- Check for the same content accessible at multiple URLs:
- With and without trailing slash (`/page` vs `/page/`)
- With and without `www` (`example.com` vs `www.example.com`)
- HTTP vs HTTPS
- URL parameters creating duplicate pages (`?page=1`, `?sort=price`)
- Each duplicate set needs one canonical URL; all others should redirect or use canonical tags
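One way to find duplicate sets is to normalize every crawled URL to a single canonical form and group by the result. A sketch whose normalization rules (force https, strip `www`, drop the trailing slash and query string) are policy assumptions to match to your own site:

```python
# Sketch: collapse URL variants to one canonical form so duplicates group.
# The rules here (https, no www, no trailing slash, no query) are assumptions.
from urllib.parse import urlsplit

def normalize(url):
    parts = urlsplit(url)
    host = parts.netloc.removeprefix("www.")
    path = parts.path.rstrip("/") or "/"
    return f"https://{host}{path}"   # query string deliberately dropped

variants = [
    "http://example.com/page/",
    "https://www.example.com/page",
    "https://example.com/page?page=1",
]
print({normalize(u) for u in variants})  # all three collapse to one URL
```

Any group with more than one live member is a duplicate set: pick one member as canonical and redirect or canonical-tag the rest.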
Layer 3: Renderability
JavaScript Rendering
- Does the page content appear in the raw HTML source? Or is it loaded via JavaScript?
- If JS-rendered: does Googlebot see the full content? (Use URL Inspection tool in Search Console)
- Check for content hidden behind click events, tabs, or accordions
- Check for lazy-loaded content that only appears on scroll
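A quick first test for JS-rendering dependence: fetch the raw HTML without a browser and check whether the content you care about is in it at all. A minimal sketch; `raw_html` and the key phrases are hypothetical stand-ins for a real fetch:

```python
# Sketch: if the main content only exists after JavaScript runs, it will be
# absent from the raw server response. raw_html stands in for a plain fetch.
def content_in_raw_html(raw_html, key_phrases):
    """Return the phrases missing from the unrendered HTML source."""
    lower = raw_html.lower()
    return [p for p in key_phrases if p.lower() not in lower]

raw_html = '<div id="app"></div><script src="/bundle.js"></script>'
missing = content_in_raw_html(raw_html, ["Acme Widget", "$19.99"])
print(missing)  # both phrases missing -> the page depends on JS rendering
```

Missing phrases do not prove Googlebot cannot see the content (it does render JS), but they tell you rendering is load-bearing, so confirm with the URL Inspection tool.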
Core Content Visibility
- Is the main content in the initial HTML? Or loaded async after page load?
- Are important elements (titles, headings, product details) in the DOM on first render?
- Check for content that requires login or cookies to view
Layer 4: Signals
Title Tags
- Every page has a unique `<title>`
- Title includes the primary keyword
- Under 60 characters (to avoid truncation in SERPs)
- Descriptive and click-worthy
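The uniqueness and length checks are mechanical once you have a URL-to-title map from a crawl. A sketch; note the 60-character rule of thumb from the list above is an approximation, since Google truncates by pixel width, not character count:

```python
# Sketch: flag title tags that are duplicated or likely to truncate in SERPs.
# titles is a hypothetical crawl result mapping URL -> <title> text.
def title_issues(titles):
    seen, issues = set(), []
    for url, title in titles.items():
        if len(title) > 60:
            issues.append((url, "over 60 chars"))
        if title in seen:
            issues.append((url, "duplicate title"))
        seen.add(title)
    return issues

titles = {"/a": "Short title", "/b": "Short title"}
print(title_issues(titles))  # /b reuses /a's title
```

The same pattern works for meta descriptions with the 150-160 character budget swapped in.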
Meta Descriptions
- Every important page has a meta description
- 150-160 characters
- Includes target keyword and a value proposition
- Unique per page
Heading Structure
- One H1 per page containing the primary keyword
- Logical heading hierarchy (H1 → H2 → H3, no skips)
- Headings describe section content (not decorative)
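Both structural rules (one H1, no skipped levels) can be verified with the standard library's HTML parser. A minimal sketch:

```python
# Sketch: verify one H1 and no skipped heading levels (e.g. H2 -> H4).
from html.parser import HTMLParser

class HeadingAudit(HTMLParser):
    def __init__(self):
        super().__init__()
        self.levels = []
    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            self.levels.append(int(tag[1]))  # record h1-h6 in document order

def heading_issues(html):
    audit = HeadingAudit()
    audit.feed(html)
    issues = []
    if audit.levels.count(1) != 1:
        issues.append("expected exactly one H1")
    for prev, cur in zip(audit.levels, audit.levels[1:]):
        if cur > prev + 1:
            issues.append(f"skip from H{prev} to H{cur}")
    return issues

print(heading_issues("<h1>T</h1><h2>A</h2><h4>B</h4>"))  # flags the H2 -> H4 skip
```

Whether a heading is descriptive rather than decorative still needs a human read; this only catches the structural failures.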
Structured Data
- Check for JSON-LD structured data appropriate to the page type
- Validate with Google's Rich Results Test
- Common types: Article, Product, FAQ, HowTo, BreadcrumbList, Organization
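A first-pass syntax check can run locally before you reach for the Rich Results Test. A sketch that assumes single-object JSON-LD (not `@graph` arrays) and uses a deliberately simple regex to locate the script blocks:

```python
# Sketch: extract JSON-LD blocks and confirm they parse and declare an @type.
# This checks syntax only -- rich-result eligibility needs Google's tooling.
import json, re

def jsonld_types(html):
    blocks = re.findall(
        r'<script[^>]+type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
        html, re.DOTALL)
    types = []
    for block in blocks:
        data = json.loads(block)  # raises ValueError on malformed JSON-LD
        types.append(data.get("@type"))
    return types

html = '''<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Hello"}
</script>'''
print(jsonld_types(html))
```

Compare the returned types against what the page type should carry (Product pages with no Product markup are a common gap).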
Hreflang (multilingual sites)
- Check for correct `hreflang` tags linking language variants
- Verify reciprocal tags (page A points to B, B points back to A)
- Check for an `x-default` tag
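The reciprocity rule is the one hreflang check worth scripting, since a single missing return tag invalidates the pair. A sketch where `pages` is a hypothetical crawl result mapping each URL to its declared alternates:

```python
# Sketch: verify hreflang reciprocity from a crawl of declared alternates.
# pages maps URL -> {hreflang value: alternate URL}; structure is an assumption.
def missing_return_tags(pages):
    """Pairs (a, b) where a declares b as an alternate but b never points back."""
    problems = []
    for url, alts in pages.items():
        for alt_url in alts.values():
            if alt_url != url and url not in pages.get(alt_url, {}).values():
                problems.append((url, alt_url))
    return problems

pages = {
    "/en/": {"en": "/en/", "zh": "/zh/"},
    "/zh/": {"zh": "/zh/"},  # missing the return tag pointing back to /en/
}
print(missing_return_tags(pages))  # the one-way /en/ -> /zh/ link is flagged
```

Hreflang annotations in sitemaps and HTTP headers count too, so feed all three sources into the map before checking.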
Output Format
Technical SEO Diagnosis: [domain]
Summary
- Critical issues: [count]
- Warnings: [count]
- Passed checks: [count]
Findings by Layer
For each issue found:
| Layer | Issue | Severity | Affected Pages | Fix |
|---|---|---|---|---|
| Crawlability | robots.txt blocks /blog/ | Critical | All blog pages | Remove the blocking rule from robots.txt |
| Indexability | Missing canonical tags | Warning | 15 pages | Add self-referencing canonicals |
| ... | ... | ... | ... | ... |
Priority Fix List
Ordered by impact:
- [Critical fix] — affects [n] pages, blocks [crawling/indexing/ranking]
- [Warning fix] — affects [n] pages, reduces [signal quality]
- ...
Pro Tip: Run the free SEO Audit for a quick technical check, the Broken Link Checker to find dead links, and the Robots.txt Generator to fix crawl directives. SEOJuice MCP users can run `/seojuice:site-health [domain]` for a full technical report and `/seojuice:page-audit [url]` to drill into specific pages.