seo-technical

Technical SEO

Audit and fix the layer beneath the content: how search engines crawl, render, index, and trust a site. Stack-agnostic.


When to use


  • Site-wide audit before or after a migration
  • Investigating indexing or ranking drops
  • Setting up SEO foundations on a new site
  • Auditing Core Web Vitals or page experience signals
  • Fixing crawl waste, redirect chains, or canonical issues
  • Setting up multilingual or multi-regional sites

When NOT to use

  • Single-page on-page optimization (use seo-onpage)
  • Keyword strategy or content planning (use seo-keyword)
  • Competitor backlink or SERP analysis (use seo-competitor)
  • Pure performance optimization without SEO context (use performance-optimization)

Required inputs

  • The site URL or staging URL
  • Access to (at minimum) view the rendered HTML, robots.txt, and sitemap
  • Ideally: search console access, server logs, and a crawler
If the site is large (10K+ URLs), confirm whether the audit is a full crawl or a sample.

The framework: 6 layers

Technical SEO has six layers, stacked. A failure in a lower layer breaks everything above it.

1. Crawlability

Can search engines access the URLs?
  • robots.txt does not block important paths
  • No accidental noindex on indexable pages
  • No accidental disallow patterns blocking CSS or JS (rendering breaks)
  • Sitemap is present, returns 200, and lists canonical URLs only
  • Sitemap is referenced in robots.txt
  • No infinite spaces (faceted nav generating endless URLs)
  • Crawl budget is not wasted on low-value URLs
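The robots.txt checks above can be smoke-tested with the standard library. A minimal sketch, assuming a hypothetical robots.txt that accidentally blocks an asset directory; `urllib.robotparser` answers per-path allow/deny questions for a given user agent:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: /assets/ is disallowed, which also blocks the
# CSS and JS inside it and therefore breaks rendering.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /assets/
Sitemap: https://example.com/sitemap.xml
"""

def allowed_paths(robots_txt: str, paths, agent: str = "Googlebot") -> dict:
    """Return {path: allowed?} for the given user agent."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {p: rp.can_fetch(agent, p) for p in paths}

checks = allowed_paths(
    ROBOTS_TXT, ["/products/widget", "/assets/site.css", "/admin/login"]
)
print(checks)  # /assets/site.css comes back disallowed: a rendering risk
```

`urllib.robotparser` only approximates Google's matcher (no wildcard path handling, for example), so treat this as a smoke test, not a verdict.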

2. Indexability

Of crawlable URLs, which should be indexed?
  • One canonical URL per piece of content (no duplicates)
  • Canonical tags self-reference on canonical pages
  • noindex on staging, search results, filter pages, thank-you pages, internal admin
  • No mixed signals (canonical pointing one way, sitemap another, internal links a third)
  • Pagination handled correctly (rel=next/prev is deprecated, but consistent canonicals matter)
  • Parameter handling deliberate (UTM, session IDs, sort orders)
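The "mixed signals" bullet can be screened for per URL. A sketch using only `html.parser`; the markup and URLs are invented, and a real audit would also compare against internal-link targets:

```python
from html.parser import HTMLParser

class HeadSignals(HTMLParser):
    """Collect the rel=canonical href and robots-meta noindex flag from a page."""
    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and a.get("rel") == "canonical":
            self.canonical = a.get("href")
        elif tag == "meta" and (a.get("name") or "").lower() == "robots":
            self.noindex = "noindex" in (a.get("content") or "").lower()

def indexability_issues(url: str, html: str, in_sitemap: bool) -> list:
    """Flag two common canonical/noindex/sitemap conflicts for one URL."""
    p = HeadSignals()
    p.feed(html)
    issues = []
    if p.canonical and p.canonical != url:
        issues.append(f"canonical points elsewhere: {p.canonical}")
    if p.noindex and in_sitemap:
        issues.append("noindex page listed in sitemap")
    return issues

HTML = (
    '<head><link rel="canonical" href="https://example.com/a?sort=price">'
    '<meta name="robots" content="noindex,follow"></head>'
)
print(indexability_issues("https://example.com/a", HTML, in_sitemap=True))
```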

3. Rendering

Does the rendered HTML match what crawlers see?
  • Critical content visible without JavaScript (or properly server-rendered)
  • For SPAs: confirm Googlebot sees the rendered content (test with the URL Inspection tool)
  • No cloaking (showing different content to bots vs users)
  • Lazy-loaded content has proper loading attributes
  • Hydration errors do not strip content from the rendered DOM
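A crude way to test the "critical content visible without JavaScript" bullet: strip tags from the server-delivered HTML and check for key phrases. This is a sketch only; the authoritative check is comparing against the rendered HTML in the URL Inspection tool.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Accumulate text nodes, ignoring script and style bodies."""
    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def missing_without_js(raw_html: str, critical_phrases) -> list:
    """Phrases absent from the no-JS HTML (likely injected client-side)."""
    p = VisibleText()
    p.feed(raw_html)
    text = " ".join("".join(p.chunks).split())
    return [ph for ph in critical_phrases if ph not in text]

# Hypothetical SPA shell: the product copy only exists inside a script.
SHELL = '<body><div id="app"></div><script>mount("Buy the widget")</script></body>'
print(missing_without_js(SHELL, ["Buy the widget"]))  # the phrase is missing
```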

4. Site architecture

Is the site structured for both users and crawlers?
  • Clear URL hierarchy that mirrors site structure
  • Important pages reachable in 3 clicks or fewer from the homepage
  • Internal linking distributes authority logically
  • Breadcrumb navigation present and marked up with schema
  • No orphan pages (pages with no internal links)
  • No redirect chains (one redirect max)
  • No 4xx errors on internally-linked URLs
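Click depth and orphan pages both fall out of one breadth-first search over the internal-link graph a crawler produces. A sketch on a toy graph (all URLs hypothetical):

```python
from collections import deque

# Toy internal-link graph from a crawl: page -> pages it links to.
LINKS = {
    "/": ["/products", "/about"],
    "/products": ["/products/widget"],
    "/products/widget": [],
    "/about": [],
    "/old-landing": [],  # crawled via sitemap but never linked internally
}

def click_depths(links: dict, home: str = "/") -> dict:
    """BFS from the homepage; any known page absent from the result is an orphan."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for nxt in links.get(page, []):
            if nxt not in depths:
                depths[nxt] = depths[page] + 1
                queue.append(nxt)
    return depths

depths = click_depths(LINKS)
orphans = sorted(p for p in LINKS if p not in depths)
too_deep = sorted(p for p, d in depths.items() if d > 3)
print(depths, orphans, too_deep)
```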

5. Structured data and signals

Does the site speak crawler language?
  • Schema.org markup on appropriate page types
  • JSON-LD format (preferred over microdata)
  • Validates in the Rich Results Test
  • Organization or LocalBusiness schema on the homepage or about page
  • BreadcrumbList schema on nested pages
  • Author and publisher schema linked correctly on content pages
  • llms.txt present at the root (for AI crawlers, see seo-aeo-geo)
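Generating JSON-LD rather than hand-editing it avoids validation drift. A sketch that emits a minimal BreadcrumbList (field names follow schema.org; the URLs are placeholders):

```python
import json

def breadcrumb_jsonld(trail) -> str:
    """Emit a BreadcrumbList JSON-LD block from ordered (name, url) pairs."""
    items = [
        {"@type": "ListItem", "position": i, "name": name, "item": url}
        for i, (name, url) in enumerate(trail, start=1)
    ]
    payload = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }
    return json.dumps(payload, indent=2)

print(breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Products", "https://example.com/products/"),
]))
```

Embed the output in a script tag of type application/ld+json, then confirm it in the Rich Results Test.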

6. Page experience and security

Does the site meet the page experience baseline?
  • HTTPS on all pages, no mixed content
  • HSTS header set
  • Core Web Vitals pass (LCP, INP, CLS within thresholds)
  • Mobile-friendly (responsive, no horizontal scroll, tap targets sized correctly)
  • No intrusive interstitials on mobile
  • Stable URL structure (no random URL changes between deploys)
  • 404 pages return 404, not 200 with "page not found" content (soft 404)

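The soft-404 bullet is easy to screen for from crawl output: flag URLs that return 200 but whose body reads like an error page. A heuristic sketch (marker strings and URLs are invented):

```python
ERROR_MARKERS = ("page not found", "nothing here", "no longer exists")

def soft_404s(responses: dict) -> list:
    """URLs answering 200 with error-page copy: candidates for a real 404/410."""
    flagged = []
    for url, (status, body) in responses.items():
        if status == 200 and any(m in body.lower() for m in ERROR_MARKERS):
            flagged.append(url)
    return flagged

crawl = {
    "/widgets": (200, "<h1>Widgets</h1>"),
    "/retired-widget": (200, "<h1>Page not found</h1>"),
    "/truly-gone": (404, "<h1>Page not found</h1>"),  # correct behavior
}
print(soft_404s(crawl))  # only /retired-widget is a soft 404
```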

Workflow

  1. Define scope. Whole site, a subfolder, a migration check, or a specific issue.
  2. Confirm access. What can you actually see (HTML, robots, sitemap, search console, server logs, staging)?
  3. Crawl. Use a crawler to enumerate URLs and statuses. Sample if the site is huge.
  4. Run the 6-layer framework. Score each, note specific issues with example URLs.
  5. Cross-reference. Search console for what's actually indexed. Compare to sitemap and crawl output.
  6. Prioritize. Critical (blocks indexing or causes traffic loss), Important (suboptimal), Nice-to-have (polish).
  7. Write the report. Use the template in references/audit-template.md.

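The crawl in step 3 also makes redirect chains mechanical to find: follow each 3xx Location until it stops. A sketch over a hypothetical redirect table keyed by path:

```python
# Hypothetical crawl result: URL -> Location header (3xx responses only).
REDIRECTS = {
    "/old": "/older",
    "/older": "/oldest",
    "/oldest": "/final",
    "/moved": "/landed",
}

def redirect_chains(redirects: dict, max_hops: int = 10) -> dict:
    """Return {start: full hop list} for every chain longer than one redirect."""
    chains = {}
    for start in redirects:
        chain = [start]
        cur = start
        while cur in redirects and len(chain) <= max_hops:
            cur = redirects[cur]
            chain.append(cur)
        if len(chain) > 2:  # start -> final is fine; anything longer is a chain
            chains[start] = chain
    return chains

print(redirect_chains(REDIRECTS))  # /moved -> /landed is a single hop, not flagged
```

The fix is always the same: point every chain start directly at its final URL.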

Failure patterns

  • Optimizing rankings on a page that is noindex. Always check indexability before content work.
  • Adding sitemaps without fixing canonical issues. A sitemap of duplicate URLs is worse than no sitemap.
  • Blocking crawlers from CSS or JS. Breaks Google's rendering. Common in over-aggressive robots.txt files.
  • Over-relying on canonical tags. Canonicals are hints, not directives. Use redirects when content actually moved.
  • Migrating without a redirect map. Single biggest cause of post-migration traffic loss.
  • Treating Core Web Vitals as the only ranking signal. Page experience matters but does not override relevance.


Output format

Default output is a markdown audit at seo-technical-audit.md. Structure:
  1. Scope and methodology
  2. Executive summary (3 to 5 critical findings)
  3. 6-layer score
  4. Critical issues (with example URLs)
  5. Important issues
  6. Nice-to-have polish
  7. Implementation roadmap (sequenced)
For migrations, include a redirect map as a CSV alongside the report.

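For the migration case, the redirect map CSV can be generated from old/new URL pairs. A sketch; the column names are a suggestion, not a standard:

```python
import csv
import io

def redirect_map_csv(pairs) -> str:
    """Serialize (old_url, new_url) pairs as a redirect-map CSV with 301s."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["old_url", "new_url", "status"])
    for old, new in pairs:
        writer.writerow([old, new, 301])
    return buf.getvalue()

print(redirect_map_csv([
    ("https://old.example.com/pricing", "https://example.com/pricing/"),
]))
```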

Reference files

  • references/audit-template.md - Fillable technical SEO audit template.
  • references/migration-checklist.md - Pre- and post-migration checklist (covers the highest-risk scenario).