geofeed-tuner

Geofeed Tuner – Create Better IP Geolocation Feeds

This skill helps you create and improve IP geolocation feeds in CSV format by:

- Ensuring your CSV is well-formed and consistent
- Checking alignment with RFC 8805 (the industry standard)
- Applying opinionated best practices learned from real-world deployments
- Suggesting improvements for accuracy, completeness, and privacy

When to Use This Skill

Use this skill when a user asks for help creating, improving, or publishing an IP geolocation feed file in CSV format.
Use it to tune and troubleshoot CSV geolocation feeds — catching errors, suggesting improvements, and ensuring real-world usability beyond RFC compliance.
Intended audience:
- Network operators, administrators, and engineers responsible for publicly routable IP address space
- Organizations such as ISPs, mobile carriers, cloud providers, hosting and colocation companies, Internet Exchange operators, and satellite internet providers
Do not use this skill for private or internal IP address management; it applies only to publicly routable IP addresses.

Prerequisites

Python 3 is required.

Directory Structure and File Management

This skill uses a clear separation between distribution files (read-only) and working files (generated at runtime).

Read-Only Directories (Do Not Modify)

The following directories contain static distribution assets. Do not create, modify, or delete files in these directories:

| Directory | Purpose |
| --- | --- |
| `assets/` | Static data files (ISO codes, examples) |
| `references/` | RFC specifications and code snippets for reference |
| `scripts/` | Executable code and HTML template files for reports |

Working Directories (Generated Content)

All generated, temporary, and output files go in these directories:

| Directory | Purpose |
| --- | --- |
| `run/` | Working directory for all agent-generated content |
| `run/data/` | Downloaded CSV files from remote URLs |
| `run/report/` | Generated HTML tuning reports |

File Management Rules

- Never write to `assets/`, `references/`, or `scripts/`. These are part of the skill distribution and must remain unchanged.
- All downloaded input files (from remote URLs) must be saved to `./run/data/`.
- All generated HTML reports must be saved to `./run/report/`.
- All generated Python scripts must be saved to `./run/`.
- The `run/` directory may be cleared between sessions; do not store permanent data there.
- Working directory for execution: All generated scripts in `./run/` must be executed with the skill root directory (the directory containing `SKILL.md`) as the current working directory, so that relative paths like `assets/iso3166-1.json` and `./run/data/report-data.json` resolve correctly. Do not `cd` into `./run/` before running scripts.
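
The working-directory rule can be checked at the top of a generated script. The following is a minimal sketch, not part of the skill distribution: the helper name `ensure_skill_root` is hypothetical, and it assumes that the presence of `SKILL.md` is a sufficient marker for the skill root.

```python
from pathlib import Path
from typing import Optional


def ensure_skill_root(cwd: Optional[Path] = None) -> Path:
    """Verify the working directory is the skill root and prepare run/ folders."""
    root = Path.cwd() if cwd is None else Path(cwd)
    # SKILL.md marks the skill root; relative paths such as assets/... depend on it.
    if not (root / "SKILL.md").is_file():
        raise RuntimeError(f"{root} does not contain SKILL.md; run from the skill root")
    # Only working directories are created; assets/, references/ and scripts/ stay untouched.
    for sub in ("run/data", "run/report"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

Calling `ensure_skill_root()` before any file access makes a wrong working directory fail early instead of producing confusing missing-file errors later.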

Processing Pipeline: Sequential Phase Execution

All phases must be executed in order, from Phase 1 through Phase 6. Each phase depends on the successful completion of the previous phase. For example, structure checks must complete before quality analysis can run.

The phases are summarized below. The agent must follow the detailed steps outlined further in each phase section.

| Phase | Name | Description |
| --- | --- | --- |
| 1 | Understand the Standard | Review the key requirements of RFC 8805 for self-published IP geolocation feeds |
| 2 | Gather Input | Collect IP subnet data from local files or remote URLs |
| 3 | Checks & Suggestions | Validate CSV structure, analyze IP prefixes, and check data quality |
| 4 | Tuning Data Lookup | Use Fastah's MCP tool to retrieve tuning data for improving geolocation accuracy |
| 5 | Generate Tuning Report | Create an HTML report summarizing the analysis and suggestions |
| 6 | Final Review | Verify consistency and completeness of the report data |

Do not skip phases. Each phase provides critical checks or data transformations required by subsequent stages.

Execution Plan Rules

Before executing each phase, the agent MUST generate a visible TODO checklist.

The plan MUST:

- Appear at the very start of the phase
- List every step in order
- Use a checkbox format
- Be updated live as steps complete

Phase 1: Understand the Standard

The key requirements from RFC 8805 that this skill enforces are summarized below. Use this summary as your working reference. Only consult the full RFC 8805 text for edge cases, ambiguous situations, or when the user asks a standards question not covered here.

RFC 8805 Key Facts

Purpose: A self-published IP geolocation feed lets network operators publish authoritative location data for their IP address space in a simple CSV format, allowing geolocation providers to incorporate operator-supplied corrections.

CSV Column Order (Sections 2.1.1.1–2.1.1.5):


| Column | Field | Required | Notes |
| --- | --- | --- | --- |
| 1 | `ip_prefix` | Yes | CIDR notation; IPv4 or IPv6; must be a network address |
| 2 | `alpha2code` | No | ISO 3166-1 alpha-2 country code; empty or "ZZ" = do-not-geolocate |
| 3 | `region` | No | ISO 3166-2 subdivision code (e.g., `US-CA`) |
| 4 | `city` | No | Free-text city name; no authoritative validation set |
| 5 | `postal_code` | No | Deprecated; must be left empty or absent |

Structural rules:

- Files may contain comment lines beginning with `#` (including the header, if present).
- A header row is optional; if present, it is treated as a comment if it starts with `#`.
- Files must be encoded in UTF-8.
- Subnet host bits must not be set (e.g., `192.168.1.1/24` is invalid; use `192.168.1.0/24`).
- Applies only to globally routable unicast addresses, not private, loopback, link-local, or multicast space.
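
The host-bits rule maps directly onto strict parsing in Python's standard `ipaddress` module. This is a minimal sketch under that assumption; the function name `parse_prefix` is hypothetical and the snippets shipped in `references/` may differ.

```python
import ipaddress


def parse_prefix(text):
    """Return a network object, or None if the text is invalid or has host bits set."""
    try:
        # strict=True rejects prefixes such as 192.0.2.1/24 whose host bits are set.
        return ipaddress.ip_network(text.strip(), strict=True)
    except ValueError:
        return None
```

A bare address without a mask parses as a single-host network, so `parse_prefix("192.0.2.1")` yields a `/32`.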

Do-not-geolocate: An entry with an empty `alpha2code` or a case-insensitive `ZZ` (irrespective of the region and city values) is an explicit signal that the operator does not want geolocation applied to that prefix.

Postal codes deprecated (Section 2.1.1.5): The fifth column must not contain postal or ZIP codes. They are too fine-grained for IP-range mapping and raise privacy concerns.

Phase 2: Gather Input

If the user has not already provided a list of IP subnets or ranges (sometimes referred to as `inetnum` or `inet6num`), prompt them to supply it. Accepted input formats:
- Text pasted into the chat
- A local CSV file
- A remote URL pointing to a CSV file

If the input is a remote URL:
- Attempt to download the CSV file to `./run/data/` before processing.
- On HTTP error (4xx, 5xx, timeout, or redirect loop), stop immediately and report to the user: `Feed URL is not reachable: HTTP {status_code}. Please verify the URL is publicly accessible.`
- Do not proceed to Phase 3 with an incomplete or empty download.

If the input is a local file, process it directly without downloading.

Encoding detection and normalization:
1. Attempt to read the file as UTF-8 first.
2. If a `UnicodeDecodeError` is raised, try `utf-8-sig` (UTF-8 with BOM), then `latin-1`.
3. Once successfully decoded, re-encode and write the working copy as UTF-8.
4. If no encoding succeeds, stop and report: `Unable to decode input file. Please save it as UTF-8 and try again.`
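
The encoding fallback can be sketched as a loop over the candidate encodings in the order listed. The function name `decode_feed` is hypothetical, and stripping a leading byte order mark is an added assumption: plain `utf-8` decoding accepts a BOM and leaves it in the text as `\ufeff`. Because `latin-1` maps every byte value to a character, the final error branch is unlikely to be reached in practice.

```python
def decode_feed(raw: bytes) -> str:
    """Decode feed bytes, trying utf-8, then utf-8-sig, then latin-1."""
    for encoding in ("utf-8", "utf-8-sig", "latin-1"):
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError:
            continue
        # Assumption: drop a leading BOM so the first prefix parses cleanly.
        return text.lstrip("\ufeff")
    raise ValueError("Unable to decode input file. Please save it as UTF-8 and try again.")
```

The decoded text can then be written back with `encoding="utf-8"` to produce the normalized working copy.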

Phase 3: Checks & Suggestions

Execution Rules

- Generate a script for this phase.
- Do NOT combine this phase with others.
- Do NOT precompute future-phase data.
- Store the output as a JSON file at: `./run/data/report-data.json`

Schema Definition

The JSON structure below is IMMUTABLE during Phase 3. Phase 4 will later add a `TunedEntry` object to each object in `Entries`. This is the only permitted schema extension, and it happens in a separate phase.

JSON keys map directly to template placeholders like `{{.CountryCode}}`, `{{.HasError}}`, etc.

```json
{
  "InputFile": "",
  "Timestamp": 0,

  "TotalEntries": 0,
  "IpV4Entries": 0,
  "IpV6Entries": 0,
  "InvalidEntries": 0,

  "Errors": 0,
  "Warnings": 0,
  "OK": 0,
  "Suggestions": 0,

  "CityLevelAccuracy": 0,
  "RegionLevelAccuracy": 0,
  "CountryLevelAccuracy": 0,
  "DoNotGeolocate": 0,

  "Entries": [
    {
      "Line": 0,
      "IPPrefix": "",
      "CountryCode": "",
      "RegionCode": "",
      "City": "",

      "Status": "",
      "IPVersion": "",

      "Messages": [
        {
          "ID": "",
          "Type": "",
          "Text": "",
          "Checked": false
        }
      ],

      "HasError": false,
      "HasWarning": false,
      "HasSuggestion": false,
      "DoNotGeolocate": false,
      "GeocodingHint": "",
      "Tunable": false
    }
  ]
}
```

Field definitions:

Top-level metadata:

- `InputFile`: The original input source, either a local filename or a remote URL.
- `Timestamp`: Milliseconds since Unix epoch when the tuning was performed.
- `TotalEntries`: Total number of data rows processed (excluding comment and blank lines).
- `IpV4Entries`: Count of entries that are IPv4 subnets.
- `IpV6Entries`: Count of entries that are IPv6 subnets.
- `InvalidEntries`: Count of entries that failed IP prefix parsing or CSV parsing.
- `Errors`: Total entries whose `Status` is `ERROR`.
- `Warnings`: Total entries whose `Status` is `WARNING`.
- `OK`: Total entries whose `Status` is `OK`.
- `Suggestions`: Total entries whose `Status` is `SUGGESTION`.
- `CityLevelAccuracy`: Count of valid entries where `City` is non-empty.
- `RegionLevelAccuracy`: Count of valid entries where `RegionCode` is non-empty and `City` is empty.
- `CountryLevelAccuracy`: Count of valid entries where `CountryCode` is non-empty, `RegionCode` is empty, and `City` is empty.
- `DoNotGeolocate` (metadata): Count of valid entries where `CountryCode`, `RegionCode`, and `City` are all empty.

Entry fields:

- `Entries`: Array of objects, one per data row, with the following per-entry fields:
  - `Line`: 1-based line number in the original CSV (counting all lines including comments and blanks).
  - `IPPrefix`: The normalized IP prefix in CIDR slash notation.
  - `CountryCode`: The ISO 3166-1 alpha-2 country code, or empty string.
  - `RegionCode`: The ISO 3166-2 region code (e.g., `US-CA`), or empty string.
  - `City`: The city name, or empty string.
  - `Status`: Highest severity assigned: `ERROR` > `WARNING` > `SUGGESTION` > `OK`.
  - `IPVersion`: `"IPv4"` or `"IPv6"` based on the parsed IP prefix.
  - `Messages`: Array of message objects, each with:
    - `ID`: String identifier from the Validation Rules Reference table below (e.g., `"1101"`, `"3301"`).
    - `Type`: The severity type: `"ERROR"`, `"WARNING"`, or `"SUGGESTION"`.
    - `Text`: The human-readable validation message string.
    - `Checked`: `true` if the validation rule is auto-tunable (`Checked` is `true` in the reference table), `false` otherwise. Controls whether the checkbox in the report is `checked` or `disabled`.
  - `HasError`: `true` if any message has `Type` equal to `"ERROR"`.
  - `HasWarning`: `true` if any message has `Type` equal to `"WARNING"`.
  - `HasSuggestion`: `true` if any message has `Type` equal to `"SUGGESTION"`.
  - `DoNotGeolocate` (entry): `true` if `CountryCode` is empty or `"ZZ"`; the entry is an explicit do-not-geolocate signal.
  - `GeocodingHint`: Always empty string `""` in Phase 3. Reserved for future use.
  - `Tunable`: `true` if any message in the entry has `Checked: true`. Computed as logical OR across all messages' `Checked` values. This flag drives the "Tune" button visibility in the report.

Validation Rules Reference

When adding messages to an entry, use the `ID`, `Type`, `Text`, and `Checked` values from this table.

| ID | Type | Text | Checked | Condition Reference |
| --- | --- | --- | --- | --- |
| `1101` | `ERROR` | IP prefix is empty | `false` | IP Prefix Analysis: empty |
| `1102` | `ERROR` | Invalid IP prefix: unable to parse as IPv4 or IPv6 network | `false` | IP Prefix Analysis: invalid syntax |
| `1103` | `ERROR` | Non-public IP range is not allowed in an RFC 8805 feed | `false` | IP Prefix Analysis: non-public |
| `3101` | `SUGGESTION` | IPv4 prefix is unusually large and may indicate a typo | `false` | IP Prefix Analysis: IPv4 < /22 |
| `3102` | `SUGGESTION` | IPv6 prefix is unusually large and may indicate a typo | `false` | IP Prefix Analysis: IPv6 < /64 |
| `1201` | `ERROR` | Invalid country code: not a valid ISO 3166-1 alpha-2 value | `true` | Country Code Analysis: invalid |
| `1301` | `ERROR` | Invalid region format; expected COUNTRY-SUBDIVISION (e.g., US-CA) | `true` | Region Code Analysis: bad format |
| `1302` | `ERROR` | Invalid region code: not a valid ISO 3166-2 subdivision | `true` | Region Code Analysis: unknown code |
| `1303` | `ERROR` | Region code does not match the specified country code | `true` | Region Code Analysis: mismatch |
| `1401` | `ERROR` | Invalid city name: placeholder value is not allowed | `false` | City Name Analysis: placeholder |
| `1402` | `ERROR` | Invalid city name: abbreviated or code-based value detected | `true` | City Name Analysis: abbreviation |
| `2401` | `WARNING` | City name formatting is inconsistent; consider normalizing the value | `true` | City Name Analysis: formatting |
| `1501` | `ERROR` | Postal codes are deprecated by RFC 8805 and must be removed for privacy reasons | `true` | Postal Code Check |
| `3301` | `SUGGESTION` | Region is usually unnecessary for small territories; consider removing the region value | `true` | Tuning: small territory region |
| `3402` | `SUGGESTION` | City-level granularity is usually unnecessary for small territories; consider removing the city value | `true` | Tuning: small territory city |
| `3303` | `SUGGESTION` | Region code is recommended when a city is specified; choose a region from the dropdown | `true` | Tuning: missing region with city |
| `3104` | `SUGGESTION` | Confirm whether this subnet is intentionally marked as do-not-geolocate or missing location data | `true` | Tuning: unspecified geolocation |

Populating Messages

When a validation check matches, add a message to the entry's `Messages` array using the values from the reference table:

```python
entry["Messages"].append({
    "ID": "1201",      # From the table
    "Type": "ERROR",   # From the table
    "Text": "Invalid country code: not a valid ISO 3166-1 alpha-2 value",  # From the table
    "Checked": True    # From the table (True = tunable)
})
```

After populating all messages for an entry, derive the entry-level flags:

```python
entry["HasError"] = any(m["Type"] == "ERROR" for m in entry["Messages"])
entry["HasWarning"] = any(m["Type"] == "WARNING" for m in entry["Messages"])
entry["HasSuggestion"] = any(m["Type"] == "SUGGESTION" for m in entry["Messages"])
entry["Tunable"] = any(m["Checked"] for m in entry["Messages"])
```
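
The `Status` field can be derived in the same pass, following the severity order `ERROR` > `WARNING` > `SUGGESTION` > `OK` from the field definitions. A minimal sketch; the names `SEVERITY_ORDER` and `derive_status` are hypothetical.

```python
# Highest severity first; an entry with no messages is OK.
SEVERITY_ORDER = ("ERROR", "WARNING", "SUGGESTION")


def derive_status(messages):
    """Return the highest severity present among an entry's messages."""
    present = {m["Type"] for m in messages}
    for level in SEVERITY_ORDER:
        if level in present:
            return level
    return "OK"
```

Usage would be `entry["Status"] = derive_status(entry["Messages"])` after all checks for the entry have run.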

Accuracy Level Counting Rules

Accuracy levels are mutually exclusive. Assign each valid (non-ERROR, non-invalid) entry to exactly one bucket based on the most granular non-empty geo field:

| Condition | Bucket |
| --- | --- |
| `City` is non-empty | `CityLevelAccuracy` |
| `RegionCode` non-empty AND `City` is empty | `RegionLevelAccuracy` |
| `CountryCode` non-empty, `RegionCode` and `City` empty | `CountryLevelAccuracy` |
| `DoNotGeolocate` (entry) is `true` | `DoNotGeolocate` (metadata) |

Do not count entries with `HasError: true` or entries in `InvalidEntries` in any accuracy bucket.

The agent MUST NOT:

- Rename fields
- Add or remove fields
- Change data types
- Reorder keys
- Alter nesting
- Wrap the object
- Split into multiple files

If a value is unknown, leave it empty; never invent data.
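
The bucket assignment can be sketched as a single function that returns at most one bucket per entry. Two points here are assumptions rather than rules stated above: the do-not-geolocate flag is tested first so that an entry with `ZZ` and a leftover city is not also counted at city level, and invalid entries are assumed to carry `HasError: true`. The function names are hypothetical.

```python
from collections import Counter

BUCKETS = ("CityLevelAccuracy", "RegionLevelAccuracy",
           "CountryLevelAccuracy", "DoNotGeolocate")


def accuracy_bucket(entry):
    """Return the single accuracy bucket for an entry, or None if it is not counted."""
    if entry["HasError"]:
        return None  # entries with errors are excluded from every bucket
    if entry["DoNotGeolocate"]:
        return "DoNotGeolocate"  # assumption: checked first to keep buckets exclusive
    if entry["City"]:
        return "CityLevelAccuracy"
    if entry["RegionCode"]:
        return "RegionLevelAccuracy"
    if entry["CountryCode"]:
        return "CountryLevelAccuracy"
    return None


def count_buckets(entries):
    """Count entries per bucket, keeping zero counts for unused buckets."""
    counts = Counter({name: 0 for name in BUCKETS})
    for entry in entries:
        bucket = accuracy_bucket(entry)
        if bucket is not None:
            counts[bucket] += 1
    return dict(counts)
```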

Structure & Format Check

This phase verifies that your feed is well-formed and parseable. Critical structural errors must be resolved before the tuner can analyze geolocation quality.

CSV Structure

This subsection defines rules for CSV-formatted input files used for IP geolocation feeds. The goal is to ensure the file can be parsed reliably and normalized into a consistent internal representation.

CSV Structure Checks
- If `pandas` is available, use it for CSV parsing.
- Otherwise, fall back to Python's built-in `csv` module.
- Ensure the CSV contains exactly 4 or 5 logical columns.
- Comment lines are allowed.
- A header row may or may not be present.
- If no header row exists, assume the implicit column order: `ip_prefix, alpha2code, region, city, postal code (deprecated)`
- Refer to the example input file: `assets/example/01-user-input-rfc8805-feed.csv`

CSV Cleansing and Normalization
- Clean and normalize the CSV using Python logic equivalent to the following operations:
  - Select only the first five columns, dropping any columns beyond the fifth.
  - Write the output file with a UTF-8 BOM.
- Comments
  - Remove comment rows where the first column begins with `#`.
  - This also removes a header row if it begins with `#`.
  - Create a map of comments using the 1-based line number as the key and the full original line as the value. Also store blank lines.
  - Store this map in a JSON file at: `./run/data/comments.json`
  - Example: `{ "4": "# It's OK for small city states to leave state ISO2 code unspecified" }`

Notes
- Both implementation paths (`pandas` and built-in `csv`) must write output using the `utf-8-sig` encoding to ensure a UTF-8 BOM is present.
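
The comment handling and column trimming can be sketched with the built-in `csv` module alone. This covers only the fallback path, not the `pandas` one; the function name `split_comments` is hypothetical, and stripping whitespace around each field is an added assumption.

```python
import csv


def split_comments(text):
    """Separate comment/blank lines from data rows, keeping 1-based line numbers."""
    comments = {}  # line number (as a string, for JSON keys) -> original line
    rows = []      # (line number, first five fields)
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            comments[str(lineno)] = line
            continue
        fields = next(csv.reader([line]))
        # Keep at most the first five columns; extra columns are dropped.
        rows.append((lineno, [field.strip() for field in fields[:5]]))
    return comments, rows
```

The `comments` dictionary can be passed to `json.dump` to produce `./run/data/comments.json`, and the cleaned rows written back with `open(path, "w", encoding="utf-8-sig", newline="")` to include the BOM.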

IP Prefix Analysis

- Check that the `IPPrefix` field is present and non-empty for each entry. An empty value is reported with Message ID `1101`.
- Check for duplicate `IPPrefix` values across entries.
- If duplicates are found, stop the skill and report to the user with the message: `Duplicate IP prefix detected: {ip_prefix_value} appears on lines {line_numbers}`
- If no duplicates are found, continue with the analysis.
- Checks
  - Each subnet must parse cleanly as either an IPv4 or IPv6 network using the code snippets in the `references/` folder.
  - Subnets must be normalized and displayed in CIDR slash notation.
    - Single-host IPv4 subnets must be represented as `/32`.
    - Single-host IPv6 subnets must be represented as `/128`.
- ERROR
  - Report the following conditions as ERROR:
  - Invalid subnet syntax
    - Message ID: `1102`
  - Non-public address space
    - Applies to subnets that are private, loopback, link-local, multicast, or otherwise non-public
      - In Python, detect non-public ranges using `is_private` and related address properties as shown in `./references`.
    - Message ID: `1103`
- SUGGESTION
  - Report the following conditions as SUGGESTION:
  - Overly large IPv6 subnets
    - Prefixes shorter than `/64`
    - Message ID: `3102`
  - Overly large IPv4 subnets
    - Prefixes shorter than `/22`
    - Message ID: `3101`
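
The prefix checks can be sketched with the standard `ipaddress` module. This is an illustration under stated assumptions, not the snippet shipped in `references/`: the function name `prefix_message_ids` is hypothetical, and the exact set of "related address properties" used here (`is_reserved`, `is_unspecified`) is a guess at what "otherwise non-public" covers.

```python
import ipaddress


def prefix_message_ids(text):
    """Return the message IDs from the reference table that apply to one prefix."""
    value = text.strip()
    if not value:
        return ["1101"]  # empty prefix
    try:
        net = ipaddress.ip_network(value, strict=True)
    except ValueError:
        return ["1102"]  # unparsable, or host bits set
    if (net.is_private or net.is_loopback or net.is_link_local
            or net.is_multicast or net.is_reserved or net.is_unspecified):
        return ["1103"]  # non-public address space
    if net.version == 4 and net.prefixlen < 22:
        return ["3101"]  # unusually large IPv4 prefix
    if net.version == 6 and net.prefixlen < 64:
        return ["3102"]  # unusually large IPv6 prefix
    return []
```

The normalized display form is `str(net)`, which always uses CIDR slash notation.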

Geolocation Quality Check

Analyze the accuracy and consistency of geolocation data:

- Country codes
- Region codes
- City names
- Deprecated fields

This phase runs after structural checks pass.

Country Code Analysis

Use the locally available data table `ISO3166-1` for checking.
- JSON array of countries and territories with ISO codes
- Each object includes:
  - `alpha_2`: two-letter country code
  - `name`: short country name
  - `flag`: flag emoji
- This file represents the superset of valid `CountryCode` values for an RFC 8805 CSV.
- Check the entry's `CountryCode` (RFC 8805 Section 2.1.1.2, column `alpha2code`) against the `alpha_2` attribute.
- Sample code is available in the `references/` directory.
- If a country is found in `assets/small-territories.json`, mark the entry internally as a small territory. This flag is used in later checks and suggestions but is not stored in the output JSON (it is transient validation state).
- Note: `small-territories.json` contains some historic/disputed codes (`AN`, `CS`, `XK`) that are not present in `iso3166-1.json`. An entry using one of these as its `CountryCode` will fail the country code validation (ERROR) even though it matches as a small territory. The country code ERROR takes precedence: do not suppress it based on the small-territory flag.
- ERROR
  - Report the following conditions as ERROR:
  - Invalid country code
    - Condition: `CountryCode` is present but not found in the `alpha_2` set
    - Message ID: `1201`
- SUGGESTION
  - Report the following conditions as SUGGESTION:
  - Unspecified geolocation for subnet
    - Condition: All geographical fields (`CountryCode`, `RegionCode`, `City`) are empty for a subnet.
    - Action:
      - Set `DoNotGeolocate = true` for the entry.
      - Set `CountryCode` to `ZZ` for the entry.
    - Message ID: `3104`
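
The country code rules can be sketched as follows. The function names are hypothetical, the JSON is passed in as text so the example does not depend on the asset file, and two behaviors are assumptions: codes are compared case-insensitively, and `ZZ` is exempt from the `1201` check because it is the do-not-geolocate marker.

```python
import json


def load_alpha2(json_text):
    """Build the set of valid codes from a JSON array of objects with an alpha_2 key."""
    return {country["alpha_2"].upper() for country in json.loads(json_text)}


def check_country(entry, valid_codes):
    """Return message IDs for the country code checks, updating the entry for 3104."""
    code = entry["CountryCode"].strip().upper()
    if not code and not entry["RegionCode"] and not entry["City"]:
        # All geo fields empty: mark as do-not-geolocate and ask the user to confirm.
        entry["DoNotGeolocate"] = True
        entry["CountryCode"] = "ZZ"
        return ["3104"]
    if code and code != "ZZ" and code not in valid_codes:
        return ["1201"]
    return []
```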

Region Code Analysis

地区代码分析

Use the locally available data table
```
ISO3166-2
```
for checking.
- JSON array of country subdivisions with ISO-assigned codes
- Each object includes:
  - ```
  code
```
  : subdivision code prefixed with country code (e.g.,
```
  US-CA
```
  )
- ```
name
```
    : short subdivision name
- This file represents the superset of valid
  RegionCode
  values for an RFC 8805 CSV.
- If a
```
RegionCode
```
  value is provided (RFC 8805 Section 2.1.1.3):
  - Check that the format matches
```
{COUNTRY}-{SUBDIVISION}
```
    (e.g.,
```
US-CA
```
    ,
```
AU-NSW
```
    ).
  - Check the value against the
```
code
```
    attribute (already prefixed with the country code).
- Small-territory exception: If the entry is a small territory and the
```
RegionCode
```
  value equals the entry's
```
CountryCode
```
  (e.g.,
```
SG
```
  as both country and region for Singapore), treat the region as acceptable — skip all region validation checks for this entry. Small territories are effectively city-states with no meaningful ISO 3166-2 administrative subdivisions.
- ERROR
  - Report the following conditions as ERROR:
  - Invalid region format
    - Condition:
      RegionCode
      does not match
      {COUNTRY}-{SUBDIVISION}
      and the small-territory exception does not apply
    - Message ID:
      1301
  - Unknown region code
    - Condition:
      RegionCode
      value is not found in the
      code
      set and the small-territory exception does not apply
    - Message ID:
      1302
  - Country–region mismatch
    - Condition: Country portion of
      RegionCode
      does not match
      CountryCode
    - Message ID:
      1303
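
A rough sketch of the region checks, assuming the `code` values from the ISO3166-2 table have been loaded into a set. The function name `check_region`, the regular expression (two uppercase letters, a hyphen, then one to three alphanumerics), and the choice to stop after a format error are assumptions made for illustration.

```python
import re

# Assumed shape of {COUNTRY}-{SUBDIVISION}; adjust if the data table differs.
REGION_RE = re.compile(r"^[A-Z]{2}-[A-Z0-9]{1,3}$")


def check_region(country_code, region_code, known_codes, is_small_territory=False):
    """Return the ERROR message IDs raised by one entry's RegionCode."""
    if not region_code:
        return []                      # the field is optional
    if is_small_territory and region_code == country_code:
        return []                      # small-territory exception: skip all region checks
    if not REGION_RE.match(region_code):
        return ["1301"]                # later checks need a well-formed prefix
    errors = []
    if region_code not in known_codes:
        errors.append("1302")
    if region_code.split("-", 1)[0] != country_code:
        errors.append("1303")
    return errors


known = {"US-CA", "AU-NSW"}            # stand-in for the code values in the data table
print(check_region("CA", "US-CA", known))
```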

使用本地可用的数据表
```
ISO3166-2
```
进行检查。
- 包含国家细分ISO分配代码的JSON数组
- 每个对象包含：
  - ```
  code
```
  ：带国家代码前缀的细分代码（例如
```
  US-CA
```
  ）
- ```
name
```
    ：细分地区简称
- 此文件代表RFC 8805 CSV中
```
RegionCode
```
  值的有效全集。
- 如果提供了
```
RegionCode
```
  值（RFC 8805第2.1.1.3节）：
  - 检查格式是否符合
```
{COUNTRY}-{SUBDIVISION}
```
    （例如
```
US-CA
```
    、
```
AU-NSW
```
    ）。
  - 将值与
```
code
```
    属性（已带有国家代码前缀）进行比对。
- 小型地区例外：如果条目属于小型地区且
```
RegionCode
```
  值等于条目中的
```
CountryCode
```
  （例如新加坡的
```
SG
```
  同时作为国家代码和地区代码），则认为该地区代码是可接受的——跳过该条目的所有地区验证检查。小型地区本质上是城市国家，没有有意义的ISO 3166-2行政细分。
- 错误（ERROR）
  - 以下情况报告为ERROR：
  - 无效地区格式
    - 条件：
      RegionCode
      不符合
      {COUNTRY}-{SUBDIVISION}
      格式且不适用小型地区例外
    - 消息ID：
      1301
  - 未知地区代码
    - 条件：
      RegionCode
      值未在
      code
      集合中找到且不适用小型地区例外
    - 消息ID：
      1302
  - 国家-地区代码不匹配
    - 条件：
      RegionCode
      中的国家部分与
      CountryCode
      不匹配
    - 消息ID：
      1303

City Name Analysis

城市名称分析

City names are validated using heuristic checks only.
- There is currently no authoritative dataset available for validating city names.
- ERROR
  - Report the following conditions as ERROR:
  - Placeholder or non-meaningful values
    - Condition: Placeholder or non-meaningful values including but not limited to:
      - undefined
      - Please select
      - null
      - N/A
      - TBD
      - unknown
    - Message ID:
      1401
  - Truncated names, abbreviations, or airport codes
    - Condition: Truncated names, abbreviations, or airport codes that do not represent valid city names:
      - LA
      - Frft
      - sin01
      - LHR
      - SIN
      - MAA
    - Message ID:
      1402
- WARNING
  - Report the following conditions as WARNING:
  - Inconsistent casing or formatting
    - Condition: City names with inconsistent casing, spacing, or formatting that may reduce data quality, for example:
      - HongKong
        vs
        Hong Kong
      - Mixed casing or unexpected script usage
    - Message ID:
      2401
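
Because these checks are heuristic, any implementation is a judgment call. The sketch below shows one possible shape; the function name `check_city` and the specific patterns are assumptions. It is deliberately rough: it will miss truncations such as `Frft` and may flag legitimate names that contain an interior capital.

```python
import re

PLACEHOLDERS = {"undefined", "please select", "null", "n/a", "tbd", "unknown"}


def check_city(city):
    """Return message IDs (1401, 1402, 2401) raised by heuristic city-name checks."""
    name = city.strip()
    if not name:
        return []
    if name.lower() in PLACEHOLDERS:
        return ["1401"]
    # Short all-caps tokens (LA, LHR) or site codes such as sin01 look like abbreviations.
    if re.fullmatch(r"[A-Z]{2,3}", name) or re.fullmatch(r"[A-Za-z]{2,4}\d+", name):
        return ["1402"]
    # An interior capital with no separator (HongKong), doubled spaces, or stray
    # leading/trailing whitespace suggests inconsistent formatting.
    if re.search(r"[a-z][A-Z]", name) or "  " in name or name != city:
        return ["2401"]
    return []


for sample in ["N/A", "LHR", "sin01", "HongKong", "Hong Kong"]:
    print(sample, check_city(sample))
```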

城市名称仅通过启发式检查进行验证。
- 目前没有权威数据集可用于验证城市名称。
- 错误（ERROR）
  - 以下情况报告为ERROR：
  - 占位符或无意义值
    - 条件：包含占位符或无意义值，包括但不限于：
      - undefined
      - Please select
      - null
      - N/A
      - TBD
      - unknown
    - 消息ID：
      1401
  - 截断名称、缩写或机场代码
    - 条件：检测到截断名称、缩写或机场代码，不代表有效城市名称：
      - LA
      - Frft
      - sin01
      - LHR
      - SIN
      - MAA
    - 消息ID：
      1402
- 警告（WARNING）
  - 以下情况报告为WARNING：
  - 格式不一致
    - 条件：城市名称大小写、空格或格式不一致，可能降低数据质量，例如：
      - HongKong
        vs
        Hong Kong
      - 大小写混合或使用意外的脚本
    - 消息ID：
      2401

Postal Code Check

邮政编码检查

RFC 8805 Section 2.1.1.5 explicitly deprecates postal or ZIP codes.
- Postal codes can represent very small populations and are not considered privacy-safe for mapping IP address ranges, which are statistical in nature.
- ERROR
  - Report the following conditions as ERROR:
  - Postal code present
    - Condition: A non-empty value is present in the postal/ZIP code field.
    - Message ID:
      1501

RFC 8805第2.1.1.5节明确弃用邮政编码或ZIP代码。
- 邮政编码代表的人口范围非常小，对于统计性质的IP地址范围映射来说不符合隐私安全要求。
- 错误（ERROR）
  - 以下情况报告为ERROR：
  - 存在邮政编码
    - 条件：邮政编码/ZIP代码字段存在非空值。
    - 消息ID：
      1501

Tuning & Recommendations

调优与建议

This phase applies opinionated recommendations beyond RFC 8805, learned from real-world geofeed deployments, that improve accuracy and usability.

SUGGESTION
- Report the following conditions as SUGGESTION:
- Region or city specified for small territory
  - Condition:
    - Entry is a small territory
    - RegionCode
      is non-empty OR
    - City
      is non-empty.
  - Message IDs:
```
3301
```
    (for region),
```
3402
```
    (for city)
- Missing region code when city is specified
  - Condition:
    - City
      is non-empty
    - RegionCode
      is empty
    - Entry is not a small territory
  - Message ID:
```
3303
```
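
The two suggestion rules above might be expressed as a single function. This is a sketch under the assumption that the small-territory flag was computed earlier; the name `tuning_suggestions` is hypothetical.

```python
def tuning_suggestions(entry, is_small_territory):
    """Return SUGGESTION message IDs for region/city granularity."""
    region = (entry.get("RegionCode") or "").strip()
    city = (entry.get("City") or "").strip()
    ids = []
    if is_small_territory:
        if region:
            ids.append("3301")   # region specified for a small territory
        if city:
            ids.append("3402")   # city specified for a small territory
    elif city and not region:
        ids.append("3303")       # city given without a region code
    return ids


print(tuning_suggestions({"CountryCode": "SG", "RegionCode": "SG-01", "City": "Singapore"}, True))
```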

此阶段应用超出RFC 8805要求的经验性建议，这些建议来自实际geofeed部署经验，可提升准确性和可用性。

建议（SUGGESTION）
- 以下情况报告为SUGGESTION：
- 小型地区指定了地区或城市
  - 条件：
    - 条目属于小型地区
    - RegionCode
      非空或
    - City
      非空。
  - 消息ID：
```
3301
```
    （针对地区）、
```
3402
```
    （针对城市）
- 指定城市但缺少地区代码
  - 条件：
    - City
      非空
    - RegionCode
      为空
    - 条目不属于小型地区
  - 消息ID：
```
3303
```

Phase 4: Tuning Data Lookup

阶段4：调优数据查询

Objective

目标

Look up all `Entries` using Fastah's `rfc8805-row-place-search` tool.

使用Fastah的

rfc8805-row-place-search

工具查询所有

Entries

。

Execution Rules

执行规则

Generate a new script only for payload generation (read the dataset and write one or more payload JSON files; do not call MCP from this script).
Server only accepts 1000 entries per request, so if there are more than 1000 entries, split into multiple requests.
The agent must read the generated payload files, construct the requests from them, and send those requests to the MCP server in batches of at most 1000 entries each.
On MCP failure: If the MCP server is unreachable, returns an error, or returns no results for any batch, log a warning and continue to Phase 5. Set
```
TunedEntry: {}
```
for all affected entries. Do not block report generation. Notify the user clearly:
```
Tuning data lookup unavailable; the report will show validation results only.
```
Suggestions are advisory only — never auto-populate them.

仅为生成请求负载创建一个新脚本（读取数据集并写入一个或多个负载JSON文件；请勿在此脚本中调用MCP）。
服务器每个请求最多接受1000个条目，因此如果条目超过1000个，拆分到多个请求中。
Agent必须读取生成的负载文件，从中构造请求，并以最多1000个条目为一批发送到MCP服务器。
**MCP失败处理：**如果MCP服务器无法访问、返回错误或任何批次未返回结果，记录警告并继续到阶段5。将受影响条目的
```
TunedEntry: {}
```
设置为空对象。请勿阻止报告生成。向用户明确通知：
```
调优数据查询不可用；报告将仅显示验证结果。
```
建议仅作为参考——切勿自动填充。

Step 1: Build Lookup Payload with Deduplication

步骤1：构建去重后的查询负载

Load the dataset from: ./run/data/report-data.json

Read the
```
Entries
```
array. Each entry will be used to build the MCP lookup payload.

Reduce server requests by deduplicating identical entries:

For each entry in
```
Entries
```
, compute a content hash (hash of
```
CountryCode
```
+
```
RegionCode
```
+
```
City
```
).
Create a deduplication map:
```
{ contentHash -> { rowKey, payload, entryIndices: [] } }
```
. rowKey is a UUID that will be sent to the MCP server for matching responses.
If an entry's hash already exists, append its 0-based array index in
```
Entries
```
to that deduplication entry's
```
entryIndices
```
array.
If the hash is new, generate a UUID (rowKey) and create a new deduplication entry whose `entryIndices` array starts with the current index.

Build request batches:

Extract unique deduplicated entries from the map, keeping them in deduplication order.
Build request batches of up to 1000 items each.
For each batch, keep an in-memory structure like
```
[{ rowKey, payload, entryIndices }, ...]
```
to match responses back by rowKey.
When writing the MCP payload file, include the
```
rowKey
```
field with each payload object:

json

[
    {"rowKey": "550e8400-e29b-41d4-a716-446655440000", "countryCode":"CA","regionCode":"CA-ON","cityName":"Toronto"},
    {"rowKey": "6ba7b810-9dad-11d1-80b4-00c04fd430c8", "countryCode":"IN","regionCode":"IN-KA","cityName":"Bangalore"},
    {"rowKey": "6ba7b811-9dad-11d1-80b4-00c04fd430c8", "countryCode":"IN","regionCode":"IN-KA"}
]

When reading responses, match each response
```
rowKey
```
field to the corresponding deduplication entry to retrieve all associated
```
entryIndices
```
.

Rules:

Write payload to: ./run/data/mcp-server-payload.json
Exit the script after writing the payload.
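
The deduplication and batching steps above could look roughly like this. The helper names are hypothetical, SHA-256 with a separator byte is one possible choice of content hash, and omitting empty `regionCode` and `cityName` fields is inferred from the example payload rather than from the tool schema.

```python
import hashlib
import json
import uuid

BATCH_SIZE = 1000  # server limit per request


def content_hash(entry):
    # A separator keeps ("A", "BC") and ("AB", "C") from hashing to the same value.
    parts = [(entry.get(k) or "") for k in ("CountryCode", "RegionCode", "City")]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()


def build_dedup_map(entries):
    dedup = {}  # dicts preserve insertion order, which gives deduplication order
    for index, entry in enumerate(entries):
        key = content_hash(entry)
        if key not in dedup:
            payload = {"rowKey": str(uuid.uuid4()), "countryCode": entry.get("CountryCode") or ""}
            if entry.get("RegionCode"):
                payload["regionCode"] = entry["RegionCode"]
            if entry.get("City"):
                payload["cityName"] = entry["City"]
            dedup[key] = {"rowKey": payload["rowKey"], "payload": payload, "entryIndices": []}
        dedup[key]["entryIndices"].append(index)
    return dedup


def make_batches(dedup):
    items = list(dedup.values())
    return [items[i:i + BATCH_SIZE] for i in range(0, len(items), BATCH_SIZE)]


def write_payload(dedup, path="./run/data/mcp-server-payload.json"):
    with open(path, "w", encoding="utf-8") as fh:
        json.dump([item["payload"] for item in dedup.values()], fh, ensure_ascii=False, indent=2)
```

Keeping `entryIndices` out of the written payload means only `rowKey` and the lookup fields are sent to the server; the index lists stay local for matching responses back.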

从./run/data/report-data.json加载数据集。

读取
```
Entries
```
数组。每个条目将用于构建MCP查询负载。

通过去重相同条目减少服务器请求：

对于
```
Entries
```
中的每个条目，计算内容哈希（
```
CountryCode
```
+
```
RegionCode
```
+
```
City
```
的哈希值）。
创建去重映射：
```
{ contentHash -> { rowKey, payload, entryIndices: [] } }
```
。rowKey是将发送到MCP服务器用于匹配响应的UUID。
如果条目的哈希已存在，将其在
```
Entries
```
中的0-based数组索引追加到该去重条目的
```
entryIndices
```
数组中。
如果哈希是新的，生成一个**UUID（rowKey）**并创建一个新的去重条目。

构建请求批次：

从映射中提取唯一的去重条目，保持去重顺序。
构建最多包含1000个条目的请求批次。
对于每个批次，保留内存结构如
```
[{ rowKey, payload, entryIndices: [] }, ...]
```
，以便通过rowKey匹配响应。
写入MCP负载文件时，每个负载对象包含
```
rowKey
```
字段：

json

[
    {"rowKey": "550e8400-e29b-41d4-a716-446655440000", "countryCode":"CA","regionCode":"CA-ON","cityName":"Toronto"},
    {"rowKey": "6ba7b810-9dad-11d1-80b4-00c04fd430c8", "countryCode":"IN","regionCode":"IN-KA","cityName":"Bangalore"},
    {"rowKey": "6ba7b811-9dad-11d1-80b4-00c04fd430c8", "countryCode":"IN","regionCode":"IN-KA"}
]

读取响应时，将每个响应中的
```
rowKey
```
字段与去重映射中的对应条目进行匹配，以获取所有关联的
```
entryIndices
```
。

规则：

将负载写入：./run/data/mcp-server-payload.json
写入负载后退出脚本。

Step 2: Invoke Fastah MCP Tool

步骤2：调用Fastah MCP工具

An example `mcp.json`-style configuration for the Fastah MCP server is as follows:

json

    "fastah-ip-geofeed": {
      "type": "http",
      "url": "https://mcp.fastah.ai/mcp"
    }

Server:
```
https://mcp.fastah.ai/mcp
```
Tool and its Schema: before the first
```
tools/call
```
, the agent MUST send a
```
tools/list
```
request to read the input and output schema for rfc8805-row-place-search
. Use the discovered schema as the authoritative source for field names, types, and constraints.

The following is an illustrative example only; always defer to the schema returned by `tools/list`:

json

[
    {"rowKey": "550e8400-...", "countryCode":"CA", ...},
    {"rowKey": "690e9301-...", "countryCode":"ZZ", ...}
]

Open ./run/data/mcp-server-payload.json and send all deduplicated entries with their rowKeys.
If more than 1000 entries remain after deduplication, split them into multiple requests of at most 1000 entries each.
The server will respond with the same
```
rowKey
```
field in each response for mapping back.
Do NOT use local data.

Fastah MCP服务器的
```
mcp.json
```
风格配置示例如下：

json

    "fastah-ip-geofeed": {
      "type": "http",
      "url": "https://mcp.fastah.ai/mcp"
    }

服务器地址：
```
https://mcp.fastah.ai/mcp
```
工具及其Schema：在第一次
```
tools/call
```
之前，Agent必须发送
```
tools/list
```
请求，读取**
```
rfc8805-row-place-search
```
**的输入和输出Schema。使用返回的Schema作为字段名、类型和约束的权威来源。

以下仅为示例说明；请始终以

tools/list

返回的Schema为准：

json

[
    {"rowKey": "550e8400-...", "countryCode":"CA", ...},
    {"rowKey": "690e9301-...", "countryCode":"ZZ", ...}
]

打开./run/data/mcp-server-payload.json，发送所有带rowKey的去重条目。
如果去重后条目超过1000个，拆分为多个请求，每个请求最多1000个条目。
服务器将在每个响应中返回相同的
```
rowKey
```
字段，用于映射回原始条目。
请勿使用本地数据。

Step 3: Attach Tuned Data to Entries

步骤3：将调优数据附加到条目

Generate a new script for attaching tuned data.
Load both ./run/data/report-data.json and the deduplication map (held in memory from Step 1, or re-derived from the payload file).
For each response from the MCP server:
- Extract the
```
rowKey
```
  from the response.
- Look up the
```
entryIndices
```
  array associated with that
```
rowKey
```
  from the deduplication map.
- For each index in
```
entryIndices
```
  , attach the best match to
```
Entries[index]
```
  .
Use the first (best) match from the response when available.

Create the `TunedEntry` field on each affected entry if it does not exist. Remap the MCP API response keys to the Go struct field names used in the report data:

json

"TunedEntry": {
  "Name": "",
  "CountryCode": "",
  "RegionCode": "",
  "PlaceType": "",
  "H3Cells": [],
  "BoundingBox": []
}

The `TunedEntry` field is a single object (not an array). It holds the best match from the MCP server.

MCP response key → JSON key mapping:

MCP API response key	JSON key
`placeName`	`Name`
`countryCode`	`CountryCode`
`stateCode`	`RegionCode`
`placeType`	`PlaceType`
`h3Cells`	`H3Cells`
`boundingBox`	`BoundingBox`

Entries with no UUID match (i.e. the MCP server returned no response for their UUID) must receive an empty

TunedEntry: {}

object — never leave the field absent.

Write the dataset back to: ./run/data/report-data.json
Rules:
- Maintain all existing validation flags.
- Do NOT create additional intermediate files.
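
A sketch of the attachment step, using the key mapping from the table above. The response shape is an assumption: each response is taken to be a dictionary with a `rowKey` and a list under a hypothetical `matches` key, ordered best first. The real field names must come from the schema returned by `tools/list`.

```python
KEY_MAP = {
    "placeName": "Name",
    "countryCode": "CountryCode",
    "stateCode": "RegionCode",
    "placeType": "PlaceType",
    "h3Cells": "H3Cells",
    "boundingBox": "BoundingBox",
}


def attach_tuned(entries, dedup, responses):
    """Attach the best match per rowKey; `matches` is a hypothetical response key."""
    by_row_key = {item["rowKey"]: item["entryIndices"] for item in dedup.values()}
    for entry in entries:
        entry.setdefault("TunedEntry", {})   # never leave the field absent
    for resp in responses:
        matches = resp.get("matches") or []
        if not matches:
            continue
        best = matches[0]                    # the first match is treated as the best
        tuned = {dst: best[src] for src, dst in KEY_MAP.items() if src in best}
        for index in by_row_key.get(resp.get("rowKey"), []):
            entries[index]["TunedEntry"] = dict(tuned)
    return entries
```

Only the `TunedEntry` key is touched, so existing validation flags on each entry are preserved.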

生成一个新脚本用于附加调优数据。
加载./run/data/report-data.json和去重映射（从步骤1的内存中获取，或从负载文件重新推导）。
对于MCP服务器返回的每个响应：
- 从响应中提取
```
rowKey
```
  。
- 从去重映射中查找与该
```
rowKey
```
  关联的
```
entryIndices
```
  数组。
- 对于数组中的每个索引，将最佳匹配结果附加到
```
Entries[index]
```
  。
如果有可用结果，使用第一个（最佳）匹配。

如果条目不存在该字段，则创建该字段。将MCP API响应键映射为Go结构体字段名：

json

"TunedEntry": {
  "Name": "",
  "CountryCode": "",
  "RegionCode": "",
  "PlaceType": "",
  "H3Cells": [],
  "BoundingBox": []
}

TunedEntry

字段是一个单个对象（不是数组）。它保存来自MCP服务器的最佳匹配结果。

MCP响应键 → JSON键映射:

MCP API响应键	JSON键
`placeName`	`Name`
`countryCode`	`CountryCode`
`stateCode`	`RegionCode`
`placeType`	`PlaceType`
`h3Cells`	`H3Cells`
`boundingBox`	`BoundingBox`

对于没有UUID匹配的条目（即MCP服务器未返回其UUID的响应），必须设置

TunedEntry: {}

——切勿省略该字段。

将数据集写回：./run/data/report-data.json
规则：
- 保留所有现有验证标志。
- 请勿创建额外的中间文件。

Phase 5: Generate Tuning Report

阶段5：生成调优报告

Generate a self-contained HTML report by rendering the template at `./scripts/templates/index.html` with data from `./run/data/report-data.json` and `./run/data/comments.json`.

Write the completed report to `./run/report/geofeed-report.html`. After generating, attempt to open it in the system's default browser (e.g., `webbrowser.open()`). If running in a headless environment, CI pipeline, or remote container where no browser is available, skip the browser step and instead present the file path to the user so they can open or download it.
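
One way to handle the browser step with a fallback is sketched below. The function name `present_report` and the injectable `opener` parameter are assumptions added so the behaviour can be exercised without launching a browser; `webbrowser.open` returns a boolean indicating whether a browser could be started.

```python
import webbrowser
from pathlib import Path


def present_report(path, opener=webbrowser.open):
    """Try to open the report; fall back to printing its path."""
    report = Path(path).resolve()
    try:
        opened = bool(opener(report.as_uri()))
    except webbrowser.Error:
        opened = False
    if not opened:
        print(f"Report written to {report}; open or download it manually.")
    return opened
```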

The template uses Go `html/template` syntax (`{{.Field}}`, `{{range}}`, `{{if eq}}`, etc.). Write a Python script that reads the template, builds a rendering context from the JSON data files, and processes the template placeholders to produce final HTML. Do not modify the template file itself; all processing happens in the Python script at render time.

通过渲染

./scripts/templates/index.html

模板，结合

./run/data/report-data.json

和

./run/data/comments.json

中的数据，生成独立HTML报告。

将完成的报告写入

./run/report/geofeed-report.html

。生成后，尝试在系统默认浏览器中打开（例如使用

webbrowser.open()

）。如果在无头环境、CI流水线或远程容器中运行，且没有可用浏览器，跳过打开浏览器步骤，而是向用户提供文件路径，以便他们打开或下载。

模板使用Go
html/template
语法（

{{.Field}}

、

{{range}}

、

{{if eq}}

等）。编写Python脚本读取模板，从JSON数据文件构建渲染上下文，并处理模板占位符以生成最终HTML。请勿修改模板文件本身——所有处理都在Python脚本渲染时完成。

Step 1: Replace Metadata Placeholders

步骤1：替换元数据占位符

Replace each

{{.Metadata.X}}

placeholder in the template with the corresponding value from

report-data.json

. Since JSON keys match the template placeholder, the mapping is direct —

{{.Metadata.InputFile}}

maps to the

InputFile

JSON key, etc.

Template placeholder	JSON key ( `report-data.json` )
`{{.Metadata.InputFile}}`	`InputFile`
`{{.Metadata.Timestamp}}`	`Timestamp`
`{{.Metadata.TotalEntries}}`	`TotalEntries`
`{{.Metadata.IpV4Entries}}`	`IpV4Entries`
`{{.Metadata.IpV6Entries}}`	`IpV6Entries`
`{{.Metadata.InvalidEntries}}`	`InvalidEntries`
`{{.Metadata.Errors}}`	`Errors`
`{{.Metadata.Warnings}}`	`Warnings`
`{{.Metadata.Suggestions}}`	`Suggestions`
`{{.Metadata.OK}}`	`OK`
`{{.Metadata.CityLevelAccuracy}}`	`CityLevelAccuracy`
`{{.Metadata.RegionLevelAccuracy}}`	`RegionLevelAccuracy`
`{{.Metadata.CountryLevelAccuracy}}`	`CountryLevelAccuracy`
`{{.Metadata.DoNotGeolocate}}`	`DoNotGeolocate` (metadata)

Note on
{{.Metadata.Timestamp}}
: This placeholder appears inside a JavaScript

new Date(...)

call. Replace it with the raw integer value (no HTML escaping needed for a numeric literal inside

<script>

). All other metadata values should be HTML-escaped since they appear inside HTML element text.
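
The metadata substitution can be sketched as a simple loop. The function name `fill_metadata` is hypothetical, and the sketch assumes the metadata values are available as a flat dictionary whose keys match the placeholder names in the table above.

```python
import html


def fill_metadata(template, metadata):
    """Replace every {{.Metadata.X}} placeholder with the value of key X."""
    for key, value in metadata.items():
        placeholder = "{{.Metadata." + key + "}}"
        if key == "Timestamp":
            rendered = str(int(value))            # raw numeric literal inside <script>
        else:
            rendered = html.escape(str(value))    # appears in HTML element text
        template = template.replace(placeholder, rendered)
    return template


sample = "<b>{{.Metadata.InputFile}}</b><script>new Date({{.Metadata.Timestamp}})</script>"
print(fill_metadata(sample, {"InputFile": "a<b>.csv", "Timestamp": 1700000000000}))
```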

将模板中的每个

{{.Metadata.X}}

占位符替换为

report-data.json

中的对应值。由于JSON键与模板占位符匹配，映射是直接的——

{{.Metadata.InputFile}}

映射到JSON键

InputFile

，以此类推。

模板占位符	JSON键（ `report-data.json` ）
`{{.Metadata.InputFile}}`	`InputFile`
`{{.Metadata.Timestamp}}`	`Timestamp`
`{{.Metadata.TotalEntries}}`	`TotalEntries`
`{{.Metadata.IpV4Entries}}`	`IpV4Entries`
`{{.Metadata.IpV6Entries}}`	`IpV6Entries`
`{{.Metadata.InvalidEntries}}`	`InvalidEntries`
`{{.Metadata.Errors}}`	`Errors`
`{{.Metadata.Warnings}}`	`Warnings`
`{{.Metadata.Suggestions}}`	`Suggestions`
`{{.Metadata.OK}}`	`OK`
`{{.Metadata.CityLevelAccuracy}}`	`CityLevelAccuracy`
`{{.Metadata.RegionLevelAccuracy}}`	`RegionLevelAccuracy`
`{{.Metadata.CountryLevelAccuracy}}`	`CountryLevelAccuracy`
`{{.Metadata.DoNotGeolocate}}`	`DoNotGeolocate` （元数据）

**关于

{{.Metadata.Timestamp}}

的注意事项：**该占位符出现在JavaScript

new Date(...)

调用中。直接替换为原始整数值（

<script>

中的数值字面量无需HTML转义）。所有其他元数据值应进行HTML转义，因为它们出现在HTML元素文本中。

Step 2: Replace the Comment Map Placeholder

步骤2：替换注释映射占位符

Locate this pattern in the template:

javascript

const commentMap = {{.Comments}};

Replace

{{.Comments}}

with the serialized JSON object from

./run/data/comments.json

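. The JSON is embedded directly as a JavaScript object literal (not inside a string), so no JavaScript string escaping is needed. Because the literal sits inside a `<script>` element, escape any `</` sequence so that comment text cannot close the element early:

python

# "<\/" is a valid JSON/JS escape for "</" and keeps "</script>" in the data from ending the block.
comments_json = json.dumps(comments).replace("</", "<\\/")
template = template.replace("{{.Comments}}", comments_json)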

在模板中找到以下模式：

javascript

const commentMap = {{.Comments}};

将

{{.Comments}}

替换为

./run/data/comments.json

中的序列化JSON对象。JSON直接作为JavaScript对象字面量嵌入（不在字符串内），因此无需额外转义：

python

comments_json = json.dumps(comments)
template = template.replace("{{.Comments}}", comments_json)

Step 3: Expand the Entries Range Block

步骤3：展开条目循环块

The template contains a

{{range .Entries}}...{{end}}

block inside

<tbody id="entriesTableBody">

. Process it as follows:

Extract the range block body using regex. Critical: The block contains nested `{{end}}` tags (from `{{if eq .Status ...}}`, `{{if .Checked}}`, and `{{range .Messages}}`). A naive non-greedy match like `\{\{range \.Entries\}\}(.*?)\{\{end\}\}` will match the first inner `{{end}}`, truncating the block. Instead, anchor the outer `{{end}}` to the `</tbody>` that follows it:

python

import re

m = re.search(
    r'\{\{range \.Entries\}\}(.*?)\{\{end\}\}\s*</tbody>',
    template,
    re.DOTALL,
)
if m is None:
    raise ValueError("{{range .Entries}} block not found before </tbody>")
entry_body = m.group(1)  # template text for one entry iteration

This ensures you capture the full block body, including all three `<tr>` rows and the nested `{{range .Messages}}...{{end}}`.

Iterate over each entry in
```
report-data.json
```
's
```
Entries
```
array.
Expand the block body for each entry using the processing order below.
Replace the entire match (from
```
{{range .Entries}}
```
through
```
</tbody>
```
) with the concatenated expanded HTML followed by
```
</tbody>
```
.

Processing order for each entry (innermost constructs first to avoid

{{end}}

confusion):

Evaluate
```
{{if eq .Status ...}}...{{end}}
```
conditionals (status badge class and icon).
Evaluate
```
{{if .Checked}}...{{end}}
```
conditional (message checkbox).
Expand
```
{{range .Messages}}...{{end}}
```
inner range.
Replace simple
```
{{.Field}}
```
placeholders.

模板在

<tbody id="entriesTableBody">

内部包含一个

{{range .Entries}}...{{end}}

块。按以下方式处理：

提取循环块主体（使用正则表达式）。关键：块中包含嵌套的
{{end}}
标签（来自
{{if eq .Status ...}}
、
{{if .Checked}}
和
{{range .Messages}}
）。简单的非贪婪匹配如
\{\{range \.Entries\}\}(.*?)\{\{end\}\}
会匹配第一个内部

{{end}}

，导致块被截断。相反，将外部

{{end}}

锚定到其后的

</tbody>

：

python

m = re.search(
    r'\{\{range \.Entries\}\}(.*?)\{\{end\}\}\s*</tbody>',
    template,
    re.DOTALL,
)
entry_body = m.group(1)  # 单个条目的模板文本

这样可确保捕获完整的块主体，包括所有三个

<tr>

行和嵌套的

{{range .Messages}}...{{end}}

。

遍历
```
report-data.json
```
中
```
Entries
```
数组的每个条目。
按以下处理顺序为每个条目展开块主体。
替换整个匹配内容（从
```
{{range .Entries}}
```
到
```
</tbody>
```
）为拼接后的展开HTML，再加上
```
</tbody>
```
。

每个条目的处理顺序（先处理最内层结构，避免

{{end}}

混淆）：

计算
```
{{if eq .Status ...}}...{{end}}
```
条件（状态徽章类和图标）。
计算
```
{{if .Checked}}...{{end}}
```
条件（消息复选框）。
展开
```
{{range .Messages}}...{{end}}
```
内部循环。
替换简单的
```
{{.Field}}
```
占位符。

Entry Field Mapping

条目字段映射

Within the range block body, replace these placeholders for each entry. Since JSON keys match the template placeholders, the template placeholder `{{.X}}` maps directly to JSON key `X`:

Template placeholder	JSON key ( `Entries[]` )	Notes
`{{.Line}}`	`Line`	Direct integer value
`{{.IPPrefix}}`	`IPPrefix`	HTML-escaped
`{{.CountryCode}}`	`CountryCode`	HTML-escaped
`{{.RegionCode}}`	`RegionCode`	HTML-escaped
`{{.City}}`	`City`	HTML-escaped
`{{.Status}}`	`Status`	HTML-escaped
`{{.HasError}}`	`HasError`	Lowercase string: `"true"` or `"false"`
`{{.HasWarning}}`	`HasWarning`	Lowercase string: `"true"` or `"false"`
`{{.HasSuggestion}}`	`HasSuggestion`	Lowercase string: `"true"` or `"false"`
`{{.GeocodingHint}}`	`GeocodingHint`	Empty string `""`
`{{.DoNotGeolocate}}`	`DoNotGeolocate`	`"true"` or `"false"`
`{{.Tunable}}`	`Tunable`	`"true"` or `"false"`
`{{.TunedEntry.CountryCode}}`	`TunedEntry.CountryCode`	`""` if `TunedEntry` is empty `{}`
`{{.TunedEntry.RegionCode}}`	`TunedEntry.RegionCode`	`""` if `TunedEntry` is empty `{}`
`{{.TunedEntry.Name}}`	`TunedEntry.Name`	`""` if `TunedEntry` is empty `{}`
`{{.TunedEntry.H3Cells}}`	`TunedEntry.H3Cells`	Bracket-wrapped space-separated; `"[]"` if empty (see format below)
`{{.TunedEntry.BoundingBox}}`	`TunedEntry.BoundingBox`	Bracket-wrapped space-separated; `"[]"` if empty (see format below)

data-h3-cells
and
data-bounding-box
format: These are NOT JSON arrays. They are bracket-wrapped, space-separated values. Do not use JSON serialization (no quotes around string elements, no commas between numbers). Examples:

```
[836752fffffffff 836755fffffffff]
```
— correct
```
["836752fffffffff","836755fffffffff"]
```
— WRONG, quotes will break parsing
```
[-71.70 10.73 -71.52 10.55]
```
— correct
```
[]
```
— correct for empty
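
A small helper can produce the bracket-wrapped, space-separated format. The names `bracket_list` and `tuned_placeholders` are hypothetical; the sketch assumes `H3Cells` holds strings and `BoundingBox` holds numbers.

```python
import html


def bracket_list(values):
    """Format a list as [a b c]; an empty or missing list becomes []."""
    return "[" + " ".join(str(v) for v in (values or [])) + "]"


def tuned_placeholders(entry):
    """Map the TunedEntry placeholders to their rendered values for one entry."""
    tuned = entry.get("TunedEntry") or {}
    return {
        "{{.TunedEntry.CountryCode}}": html.escape(tuned.get("CountryCode", "")),
        "{{.TunedEntry.RegionCode}}": html.escape(tuned.get("RegionCode", "")),
        "{{.TunedEntry.Name}}": html.escape(tuned.get("Name", "")),
        "{{.TunedEntry.H3Cells}}": html.escape(bracket_list(tuned.get("H3Cells"))),
        "{{.TunedEntry.BoundingBox}}": html.escape(bracket_list(tuned.get("BoundingBox"))),
    }


print(bracket_list(["836752fffffffff", "836755fffffffff"]))
```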

在循环块主体中，为每个条目替换以下占位符。由于JSON键与模板占位符匹配，模板占位符

{{.X}}

直接映射到JSON键

：

模板占位符	JSON键（ `Entries[]` ）	说明
`{{.Line}}`	`Line`	直接整数值
`{{.IPPrefix}}`	`IPPrefix`	HTML转义后的值
`{{.CountryCode}}`	`CountryCode`	HTML转义后的值
`{{.RegionCode}}`	`RegionCode`	HTML转义后的值
`{{.City}}`	`City`	HTML转义后的值
`{{.Status}}`	`Status`	HTML转义后的值
`{{.HasError}}`	`HasError`	小写字符串： `"true"` 或 `"false"`
`{{.HasWarning}}`	`HasWarning`	小写字符串： `"true"` 或 `"false"`
`{{.HasSuggestion}}`	`HasSuggestion`	小写字符串： `"true"` 或 `"false"`
`{{.GeocodingHint}}`	`GeocodingHint`	Empty string `""`
`{{.DoNotGeolocate}}`	`DoNotGeolocate`	`"true"` 或 `"false"`
`{{.Tunable}}`	`Tunable`	`"true"` 或 `"false"`
`{{.TunedEntry.CountryCode}}`	`TunedEntry.CountryCode`	如果 `TunedEntry` 为空 `{}` 则为 `""`
`{{.TunedEntry.RegionCode}}`	`TunedEntry.RegionCode`	如果 `TunedEntry` 为空 `{}` 则为 `""`
`{{.TunedEntry.Name}}`	`TunedEntry.Name`	如果 `TunedEntry` 为空 `{}` 则为 `""`
`{{.TunedEntry.H3Cells}}`	`TunedEntry.H3Cells`	括号包裹的空格分隔值；空值时为 `"[]"` （格式见下文）
`{{.TunedEntry.BoundingBox}}`	`TunedEntry.BoundingBox`	括号包裹的空格分隔值；空值时为 `"[]"` （格式见下文）

data-h3-cells
和
data-bounding-box
格式：这些不是JSON数组。它们是括号包裹、空格分隔的值。请勿使用JSON序列化（字符串元素无需加引号，数字之间无需逗号）。示例：

```
[836752fffffffff 836755fffffffff]
```
— 正确
```
["836752fffffffff","836755fffffffff"]
```
— 错误，引号会导致解析失败
```
[-71.70 10.73 -71.52 10.55]
```
— 正确
```
[]
```
— 空值时正确

Evaluating Status Conditionals

计算状态条件

Process these BEFORE replacing simple
{{.Field}}
placeholders — otherwise the

{{end}}

markers get consumed and the regex won't match.

The template uses

{{if eq .Status "..."}}

conditionals for the status badge CSS class and icon. Evaluate these by checking the entry's

Status

value and keeping only the matching branch text.

The status badge line contains two

{{if eq .Status ...}}...{{end}}

blocks on a single line — one for the CSS class, one for the icon. Use

re.sub

with a callback to resolve all occurrences:

python

STATUS_CSS = {"ERROR": "error", "WARNING": "warning", "SUGGESTION": "suggestion", "OK": "ok"}
STATUS_ICON = {
    "ERROR": "bi-x-circle-fill",
    "WARNING": "bi-exclamation-triangle-fill",
    "SUGGESTION": "bi-lightbulb-fill",
    "OK": "bi-check-circle-fill",
}

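import re

# One whole {{if eq .Status "X"}}...{{end}} chain; the branches contain no nested {{end}}.
STATUS_IF_RE = re.compile(r'\{\{if eq \.Status "[^"]*"\}\}.*?\{\{end\}\}', re.DOTALL)
# Each `if` / `else if` branch: captures the compared status and the branch text.
BRANCH_RE = re.compile(
    r'\{\{(?:else )?if eq \.Status "([^"]*)"\}\}(.*?)(?=\{\{else|\{\{end\}\})', re.DOTALL
)
ELSE_RE = re.compile(r'\{\{else\}\}(.*?)\{\{end\}\}', re.DOTALL)

def resolve_status_if(match_obj, status):
    """Pick the branch matching `status` from a {{if eq .Status ...}}...{{end}} block."""
    block = match_obj.group(0)
    for branch_status, text in BRANCH_RE.findall(block):
        if branch_status == status:
            return text
    fallback = ELSE_RE.search(block)  # the {{else}} branch covers OK and anything unknown
    return fallback.group(1) if fallback else ""

# Inside the per-entry loop:
# body = STATUS_IF_RE.sub(lambda m: resolve_status_if(m, status), body)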

A simpler approach: since there are exactly two known patterns, replace them as literal strings:

python

css_class = STATUS_CSS.get(status, "ok")
icon_class = STATUS_ICON.get(status, "bi-check-circle-fill")
body = body.replace(
    '{{if eq .Status "ERROR"}}error{{else if eq .Status "WARNING"}}warning{{else if eq .Status "SUGGESTION"}}suggestion{{else}}ok{{end}}',
    css_class,
)
body = body.replace(
    '{{if eq .Status "ERROR"}}bi-x-circle-fill{{else if eq .Status "WARNING"}}bi-exclamation-triangle-fill{{else if eq .Status "SUGGESTION"}}bi-lightbulb-fill{{else}}bi-check-circle-fill{{end}}',
    icon_class,
)

This avoids regex entirely and is safe because these exact strings appear verbatim in the template.

在替换简单
{{.Field}}
占位符之前处理这些条件——否则

{{end}}

标记会被消耗，导致正则表达式无法匹配。

模板使用

{{if eq .Status "..."}}

条件来设置状态徽章的CSS类和图标。通过检查条目的

status

值，仅保留匹配分支的文本。

状态徽章行包含两个

{{if eq .Status ...}}...{{end}}

块——一个用于CSS类，一个用于图标。使用

re.sub

和回调函数解析所有匹配项：

python

STATUS_CSS = {"ERROR": "error", "WARNING": "warning", "SUGGESTION": "suggestion", "OK": "ok"}
STATUS_ICON = {
    "ERROR": "bi-x-circle-fill",
    "WARNING": "bi-exclamation-triangle-fill",
    "SUGGESTION": "bi-lightbulb-fill",
    "OK": "bi-check-circle-fill",
}

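import re

# One whole {{if eq .Status "X"}}...{{end}} chain; the branches contain no nested {{end}}.
STATUS_IF_RE = re.compile(r'\{\{if eq \.Status "[^"]*"\}\}.*?\{\{end\}\}', re.DOTALL)
BRANCH_RE = re.compile(
    r'\{\{(?:else )?if eq \.Status "([^"]*)"\}\}(.*?)(?=\{\{else|\{\{end\}\})', re.DOTALL
)
ELSE_RE = re.compile(r'\{\{else\}\}(.*?)\{\{end\}\}', re.DOTALL)

def resolve_status_if(match_obj, status):
    """从{{if eq .Status ...}}...{{end}}块中选择与`status`匹配的分支。"""
    block = match_obj.group(0)
    for branch_status, text in BRANCH_RE.findall(block):
        if branch_status == status:
            return text
    fallback = ELSE_RE.search(block)  # the {{else}} branch covers OK and anything unknown
    return fallback.group(1) if fallback else ""

# body = STATUS_IF_RE.sub(lambda m: resolve_status_if(m, status), body)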

更简单的方法：由于只有两个已知模式，直接替换为字面字符串：

python

css_class = STATUS_CSS.get(status, "ok")
icon_class = STATUS_ICON.get(status, "bi-check-circle-fill")
body = body.replace(
    '{{if eq .Status "ERROR"}}error{{else if eq .Status "WARNING"}}warning{{else if eq .Status "SUGGESTION"}}suggestion{{else}}ok{{end}}',
    css_class,
)
body = body.replace(
    '{{if eq .Status "ERROR"}}bi-x-circle-fill{{else if eq .Status "WARNING"}}bi-exclamation-triangle-fill{{else if eq .Status "SUGGESTION"}}bi-lightbulb-fill{{else}}bi-check-circle-fill{{end}}',
    icon_class,
)

这样可避免使用正则表达式，且安全可靠，因为这些字符串在模板中是固定的。

Step 4: Expand the Nested Messages Range

步骤4：展开嵌套消息循环

The

{{range .Messages}}...{{end}}

block contains a nested

{{if .Checked}} checked{{else}} disabled{{end}}

conditional, so its inner

{{end}}

would cause a simple non-greedy regex to match too early. Anchor the regex to

</td>

(the tag immediately after the messages range closing

{{end}}

) to capture the full block body:

python

msg_match = re.search(
    r'\{\{range \.Messages\}\}(.*?)\{\{end\}\}\s*(?=</td>)',
    body, re.DOTALL
)

The lookahead `(?=</td>)` ensures the regex skips past the checkbox conditional's `{{end}}` (which is not followed by `</td>`) and matches only the range-closing `{{end}}` (which is followed by whitespace then `</td>`).

For each message in the entry's

Messages

array, clone the captured block body and expand it:

Resolve the checkbox conditional per message (must happen before simple placeholder replacement to remove the nested `{{end}}`):

python

if msg.get("Checked"):
    msg_body = msg_body.replace(
        '{{if .Checked}} checked{{else}} disabled{{end}}', ' checked'
    )
else:
    msg_body = msg_body.replace(
        '{{if .Checked}} checked{{else}} disabled{{end}}', ' disabled'
    )

Replace message field placeholders:

Template placeholder	Source	Notes
`{{.ID}}`	`Messages[i].ID`	Direct string value from JSON
`{{.Text}}`	`Messages[i].Text`	HTML-escaped

Concatenate all expanded message blocks and replace the original

{{range .Messages}}...{{end}}

match (

msg_match.group(0)

) with the result:

python

body = body[:msg_match.start()] + "".join(expanded_msgs) + body[msg_match.end():]

If `Messages` is empty, replace the entire matched region with an empty string (no message divs; only the issues header remains).
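
Putting the pieces of this step together, a sketch of the whole messages expansion might look like this. The function name `expand_messages` is hypothetical, and the sample markup in the usage line is invented for illustration; the real template markup will differ.

```python
import html
import re

MSG_RANGE_RE = re.compile(
    r'\{\{range \.Messages\}\}(.*?)\{\{end\}\}\s*(?=</td>)', re.DOTALL
)
CHECKBOX_IF = '{{if .Checked}} checked{{else}} disabled{{end}}'


def expand_messages(body, messages):
    """Expand the {{range .Messages}} block in one entry body."""
    msg_match = MSG_RANGE_RE.search(body)
    if msg_match is None:
        return body                      # nothing to expand
    expanded = []
    for msg in messages:
        msg_body = msg_match.group(1)
        # Resolve the conditional first so its {{end}} is gone before other replacements.
        msg_body = msg_body.replace(CHECKBOX_IF, ' checked' if msg.get("Checked") else ' disabled')
        msg_body = msg_body.replace('{{.ID}}', str(msg.get("ID", "")))
        msg_body = msg_body.replace('{{.Text}}', html.escape(str(msg.get("Text", ""))))
        expanded.append(msg_body)
    # An empty message list leaves an empty string in place of the block.
    return body[:msg_match.start()] + "".join(expanded) + body[msg_match.end():]


# Invented markup, only to exercise the function.
demo = ('<td>{{range .Messages}}<div><input{{if .Checked}} checked{{else}} disabled{{end}}>'
        '{{.ID}}: {{.Text}}</div>{{end}}\n</td>')
print(expand_messages(demo, [{"ID": "1201", "Text": "Invalid country code", "Checked": True}]))
```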

{{range .Messages}}...{{end}}

块包含一个嵌套的

{{if .Checked}} checked{{else}} disabled{{end}}

条件，因此其内部的

{{end}}

会导致简单的非贪婪正则表达式过早匹配。将正则表达式锚定到

</td>

（紧跟在消息循环结束

{{end}}

之后的标签），以捕获完整的块主体：

python

msg_match = re.search(
    r'\{\{range \.Messages\}\}(.*?)\{\{end\}\}\s*(?=</td>)',
    body, re.DOTALL
)

前瞻断言

(?=</td>)

确保正则表达式跳过复选框条件的

{{end}}

（其后是

，而非

</td>

），仅匹配循环结束的

{{end}}

（其后是空格和

</td>

）。

对于条目中

Messages

数组的每个消息，克隆捕获的块主体并展开：

解析复选框条件（每条消息）：必须在替换简单占位符之前处理，以避免嵌套

{{end}}

的混淆：

python

if msg.get("Checked"):
    msg_body = msg_body.replace(
        '{{if .Checked}} checked{{else}} disabled{{end}}', ' checked'
    )
else:
    msg_body = msg_body.replace(
        '{{if .Checked}} checked{{else}} disabled{{end}}', ' disabled'
    )

替换消息字段占位符:

模板占位符	来源	说明
`{{.ID}}`	`Messages[i].ID`	直接使用JSON中的字符串值
`{{.Text}}`	`Messages[i].Text`	HTML转义后的值

拼接所有展开的消息块，并将原始

{{range .Messages}}...{{end}}

匹配内容（

msg_match.group(0)

）替换为结果：

python

body = body[:msg_match.start()] + "".join(expanded_msgs) + body[msg_match.end():]

如果

Messages

为空，将整个匹配区域替换为空字符串（无消息div——仅保留问题标题）。

Output Guarantees

输出保证

The report must be readable in any modern browser without extra network dependencies beyond the CDN links already in the template (
```
leaflet
```
,
```
h3-js
```
,
```
bootstrap-icons
```
, Raleway font).
All values embedded in HTML must be HTML-escaped (
```
<
```
,
```
>
```
,
```
&
```
,
```
"
```
) to prevent rendering issues.
```
commentMap
```
is embedded as a direct JavaScript object literal (not inside a string), so no JS string escaping is needed; emit valid JSON, escaping any `</` sequence as `<\/` so the data cannot close the surrounding `<script>` element.
All values must be derived only from analysis output, not recomputed heuristically.

报告必须可在任何现代浏览器中读取，无需模板中已有的CDN链接（
```
leaflet
```
、
```
h3-js
```
、
```
bootstrap-icons
```
、Raleway字体）之外的额外网络依赖。
嵌入HTML的所有值必须进行HTML转义（
```
<
```
、
```
>
```
、
```
&
```
、
```
"
```
），以避免渲染问题。
```
commentMap
```
直接作为JavaScript对象字面量嵌入（不在字符串内），因此无需JS字符串转义——只需输出有效的JSON。
所有值必须仅来自分析输出，而非通过启发式重新计算。

Phase 6: Final Review

阶段6：最终审核

Perform a final verification pass using concrete, checkable assertions before presenting results to the user.

Check 1 — Entry count integrity

Count non-comment, non-blank data rows in the original input CSV.

Assert:

len(Entries) in report-data.json == data_row_count

On failure:

Row count mismatch: input has {N} data rows but report contains {M} entries.

Check 2 — Summary counter integrity

These counters are mutually exclusive: each entry is counted exactly once, under the highest-severity boolean flag it carries. An entry with both `HasError: true` and `HasWarning: true` is counted only in `Errors`, never in `Warnings`. This is equivalent to counting by the entry's `Status` field, which holds the highest severity.

Assert all of the following; correct any that fail before generating the report:

Errors == sum(1 for e in Entries if e['HasError'])

Warnings == sum(1 for e in Entries if e['HasWarning'] and not e['HasError'])

Suggestions == sum(1 for e in Entries if e['HasSuggestion'] and not e['HasError'] and not e['HasWarning'])

OK == sum(1 for e in Entries if not e['HasError'] and not e['HasWarning'] and not e['HasSuggestion'])

Errors + Warnings + Suggestions + OK == TotalEntries - InvalidEntries
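The assertions above amount to a single pass over the entries with an if/elif chain. The sketch below assumes the counters are available in a flat dict (called `summary` here, a hypothetical name); `expected_counters` and `check_summary` are likewise illustrative helpers.

```python
def expected_counters(entries):
    """Recount the summary with mutual exclusion: highest severity wins."""
    counts = {"Errors": 0, "Warnings": 0, "Suggestions": 0, "OK": 0}
    for e in entries:
        if e["HasError"]:
            counts["Errors"] += 1
        elif e["HasWarning"]:
            counts["Warnings"] += 1
        elif e["HasSuggestion"]:
            counts["Suggestions"] += 1
        else:
            counts["OK"] += 1
    return counts


def check_summary(summary, entries):
    """Return the names of failed assertions; an empty list means Check 2 passes."""
    expected = expected_counters(entries)
    failures = [k for k, v in expected.items() if summary.get(k) != v]
    total = sum(summary.get(k, 0) for k in expected)
    if total != summary["TotalEntries"] - summary["InvalidEntries"]:
        failures.append("Total")
    return failures
```

Because each entry falls into exactly one branch, the four counters always sum to the number of entries examined.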

Check 3 — Accuracy bucket integrity

Assert:

CityLevelAccuracy + RegionLevelAccuracy + CountryLevelAccuracy + DoNotGeolocate == TotalEntries - InvalidEntries

Note: The accuracy buckets defined in Phase 3 say "Do not count entries with `HasError: true`", but the Check 3 formula above uses `TotalEntries - InvalidEntries` (which still includes ERROR entries). This means ERROR entries (those that parsed as valid IPs but failed validation) are counted in accuracy buckets by their geo-field presence. Only `InvalidEntries` (unparsable IP prefixes) are excluded. Follow the Check 3 formula as the authoritative rule.
On failure, trace and fix the bucketing logic before proceeding.
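The Check 3 formula can be verified with a small helper. This sketch assumes the four bucket counters and the two totals sit in one flat dict (`summary`, a hypothetical name); it does not reproduce the Phase 3 bucketing rules themselves.

```python
ACCURACY_KEYS = (
    "CityLevelAccuracy",
    "RegionLevelAccuracy",
    "CountryLevelAccuracy",
    "DoNotGeolocate",
)


def check_accuracy_buckets(summary):
    """True when the four accuracy buckets add up to TotalEntries - InvalidEntries."""
    bucket_total = sum(summary[k] for k in ACCURACY_KEYS)
    return bucket_total == summary["TotalEntries"] - summary["InvalidEntries"]
```

For example, with 10 total entries of which 1 is invalid, buckets of 5, 2, 1, and 1 pass (5 + 2 + 1 + 1 = 9 = 10 - 1), while buckets that sum to 8 would indicate that an ERROR entry was wrongly left out.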

Check 4 — No duplicate line numbers

Assert: all `Line` values in `Entries` are unique.
On failure, report the duplicated line numbers to the user.
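One way to find the duplicated line numbers, sketched with `collections.Counter`. The helper name `duplicate_lines` is hypothetical.

```python
from collections import Counter


def duplicate_lines(entries):
    """Return the sorted Line values that appear more than once in Entries."""
    counts = Counter(e["Line"] for e in entries)
    return sorted(line for line, n in counts.items() if n > 1)
```

An empty result means Check 4 passes; otherwise the returned list is what gets reported to the user.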

Check 5 — TunedEntry completeness

Assert: every object in `Entries` has a `TunedEntry` key (even if its value is `{}`).
On failure, add `"TunedEntry": {}` to any entry missing the key, then re-save `report-data.json`.
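The repair step for Check 5 might look like the sketch below. The function names are hypothetical, and the location of `report-data.json` is passed in by the caller rather than assumed.

```python
import json
from pathlib import Path


def ensure_tuned_entry(report):
    """Add an empty TunedEntry to entries that lack the key; return how many were fixed."""
    fixed = 0
    for entry in report["Entries"]:
        if "TunedEntry" not in entry:
            entry["TunedEntry"] = {}
            fixed += 1
    return fixed


def repair_report_file(path):
    """Load the report JSON, repair it, and re-save only if something changed."""
    p = Path(path)
    report = json.loads(p.read_text(encoding="utf-8"))
    fixed = ensure_tuned_entry(report)
    if fixed:
        p.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return fixed
```

The key test is deliberately `"TunedEntry" not in entry` rather than a truthiness check, so an existing empty `{}` value is left alone.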

Check 6 — Report file is present and non-empty

Confirm `./run/report/geofeed-report.html` was written and has a file size greater than zero bytes.
On failure, regenerate the report before presenting to the user.
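Check 6 reduces to a file-existence and size test. A minimal sketch with `pathlib`; the helper name `report_is_present` is hypothetical, and the default path is the one named above.

```python
from pathlib import Path


def report_is_present(path="./run/report/geofeed-report.html"):
    """True when the report exists as a regular file and is larger than zero bytes."""
    p = Path(path)
    return p.is_file() and p.stat().st_size > 0
```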
