cel-programs

When to use

Use this skill when tasks include:

- creating or editing `cel.yml.hbs` agent stream templates
- configuring data stream manifests for the `cel` input type
- writing CEL programs with pagination, cursor management, or authentication
- testing or debugging a CEL program locally with mito
- setting up system tests with mock APIs for CEL-based data streams
- prototyping a new CEL-based data stream's collection logic
- any CEL or mito question, regardless of context

When not to use

Do not use this skill as the primary guide for:

- ingest pipeline processor design (`ingest-pipelines`)
- ECS field mapping (`ecs-field-mappings`)
- package scaffolding (`create-integration`)
- system test execution with the Elastic stack (`integration-testing` → `references/system-testing.md`)

Mandatory workflow — mock → mito → template

This is not a suggestion. Every CEL program MUST be developed in this order. The subagent must not write `cel.yml.hbs` until the CEL program has been validated with mito against a running mock. Skipping or reordering steps causes failures that are hard to debug.

Do NOT write more than ~10–15 new lines of CEL before running mito. Build the program incrementally in phases (skeleton → error handling → event mapping → pagination → cursor guard), validating with mito after each phase. Writing a large program in one shot leads to cascading compilation errors that are extremely hard to debug. Follow the phased approach in `references/cel-incremental-build.md`.

Step	Action	Output
1. Create the system test mock	Write the `elastic/stream` config at `_dev/deploy/docker/files/config-<stream>.yml` with rules matching all API endpoints. Write `test-default-config.yml` .	Mock config file, docker-compose service, test config
2. Start the mock locally	`stream http-server --addr=:8090 --config=...`	Running mock at `http://localhost:8090`
3. Create a plain `.cel` file and `state.json`	Write the CEL program as a standalone `.cel` file. Create `state.json` with the same keys the future `state:` block will contain, but with literal test values instead of Handlebars. Point `url` at the local mock.	`program.cel` , `state.json` in `/tmp` or working dir
4. Run mito and iterate	Build incrementally per `references/cel-incremental-build.md` : Phase 0 skeleton → Phase 1 error handling → Phase 2 events → Phase 3 pagination → Phase 4 cursor. Run `mito -data state.json -log_requests program.cel` after each phase. Do not proceed until mito output is correct.	Validated CEL program
5. ONLY THEN write `cel.yml.hbs`	Copy the working CEL expression into `program: |` in the Handlebars template. Replace literal test values with `{{var}}` references. Configure manifests.	Final integration template

Step 3 detail, translating template vars to mito state: when the future `cel.yml.hbs` will have a `state:` block with entries like `api_key: {{api_key}}` and `batch_size: {{batch_size}}`, the `state.json` for mito testing uses the same key names with literal test values:

```json
{
  "url": "http://localhost:8090",
  "api_key": "test-key",
  "batch_size": 50,
  "initial_interval": "24h"
}
```

This mirrors the runtime state the CEL input would provide. Add `cursor` to test subsequent-run behavior.

For the full mock-first workflow details, CLI flags, execution model, and quality standards, load `references/mito-reference.md`.
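The state translation described above can be scripted so that first-run and subsequent-run state files stay in sync. The following is a minimal Python sketch, not part of mito or the skill itself: `build_state` and `TEST_VARS` are hypothetical names, and the `last_timestamp` cursor value is an illustrative placeholder.

```python
import json


def build_state(template_vars, cursor=None):
    """Build a mito state map with the same keys as the future `state:` block.

    template_vars holds literal test values in place of {{var}} references.
    cursor, when given, simulates a run that follows a persisted cursor.
    """
    state = dict(template_vars)
    if cursor is not None:
        state["cursor"] = cursor
    return state


# Literal test values; the URL points at the local mock.
TEST_VARS = {
    "url": "http://localhost:8090",
    "api_key": "test-key",
    "batch_size": 50,
    "initial_interval": "24h",
}

first_run = build_state(TEST_VARS)
# Hypothetical cursor contents, used only to exercise subsequent-run behavior.
later_run = build_state(TEST_VARS, cursor={"last_timestamp": "2024-01-01T00:00:00Z"})

if __name__ == "__main__":
    # Redirect this output into state.json before running mito.
    print(json.dumps(later_run, indent=2))
```

Keeping one source of test values makes it less likely that the first-run and cursor-bearing state files drift apart while iterating.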

cel.yml.hbs template anatomy

The `cel.yml.hbs` file at `data_stream/<stream>/agent/stream/cel.yml.hbs` is a Handlebars template that renders the final CEL input configuration. It has these sections in order:

```yaml
interval: {{interval}}
resource.tracer:
  enabled: {{enable_request_tracer}}
  filename: "../../logs/cel/http-request-trace-*.ndjson"
  maxbackups: 5
{{#if proxy_url}}
resource.proxy_url: {{proxy_url}}
{{/if}}
{{#if ssl}}
resource.ssl: {{ssl}}
{{/if}}
{{#if http_client_timeout}}
resource.timeout: {{http_client_timeout}}
{{/if}}
resource.url: <constructed from vars>
state:
  <credentials and pagination config from vars>
redact:
  fields:
    - <sensitive state keys>
max_executions: <number, for heavy pagination>
program: |
  <CEL expression>
tags:
{{#if preserve_original_event}}
  - preserve_original_event
{{/if}}
{{#each tags as |tag|}}
  - {{tag}}
{{/each}}
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
{{#if processors}}
processors:
{{processors}}
{{/if}}
```

Handlebars patterns

Pattern	Purpose
`{{var_name}}`	Direct variable substitution
`{{#if var_name}}...{{/if}}`	Conditional block for optional config
`{{#each tags as |tag|}}`	Iteration over list vars
`{{#contains "forwarded" tags}}`	Check if list contains value

Key template fields

- `resource.url`: base URL, often constructed from multiple vars (e.g., `{{url}}/api/v1/endpoint`)
- `resource.headers` (GA 8.18.1): static headers that are the same for every request (`Content-Type`, `Accept`, API version headers). Set them here rather than in-program when headers never vary. Applied before auth headers.
- `state:` block: where manifest vars are injected as CEL state; credentials and pagination settings go here
- `redact.fields`: list of state keys containing secrets to redact from debug logs
- `max_executions`: overrides the default of 1000 for integrations with heavy pagination (e.g., 5000)
- `program: |`: the CEL expression; must be a YAML literal block scalar

Do NOT set `data_stream.dataset` in integration packages

Integration packages (`type: integration`) must never include `data_stream.dataset` in `cel.yml.hbs` or define a `data_stream.dataset` manifest var. The framework automatically routes documents to the correct data stream. Setting `data_stream.dataset` overrides this routing and causes documents to land in the wrong index, typically resulting in "0 hits" during system tests.

Only input-type packages (`type: input`) use `data_stream.dataset`, because they have no predefined data streams.

Data stream manifest configuration

The data stream `manifest.yml` defines the CEL input stream and its variables.

Standard vars every CEL stream should include

Var	Type	Purpose
`url`	text	API base URL
`interval`	text	Polling interval (e.g., `5m` )
`initial_interval`	text	Lookback window on first run (e.g., `24h` )
`enable_request_tracer`	bool	Enable HTTP request tracing
`http_client_timeout`	text	Request timeout (e.g., `30s` )
`proxy_url`	text	HTTP proxy URL
`ssl`	yaml	TLS configuration
`tags`	text (multi)	Event tags
`preserve_original_event`	bool	Keep original event
`processors`	yaml	Beat processors

Auth-specific vars depend on the API (API key, OAuth client_id/secret/token_url, bearer token, etc.).

Declare `enable_request_tracer` in the data stream manifest, not at the input level. Input-level tracing enables logging for all data streams in the policy.

Package-level vs data-stream-level vars

- Package-level vars, in the root `manifest.yml` under `policy_templates[].inputs[].vars`: shared across streams (e.g., `url`, auth credentials)
- Data-stream-level vars, in `data_stream/<stream>/manifest.yml` under `streams[].vars`: stream-specific (e.g., `interval`, `batch_size`, `initial_interval`)

Scope of the CEL program

The CEL program's responsibility is data collection only:

- Fetch data from the API endpoint(s)
- Handle pagination: walk through all pages within a single polling cycle
- Manage cursor state: store timestamps or page tokens in `cursor` so the next polling interval resumes where the last one left off, avoiding re-collection of already-fetched events
- Emit raw events: output `{"message": e.encode_json()}` for each record

The CEL program does not handle:

- Elasticsearch-level deduplication: if overlapping time windows cause a few duplicate events to be collected, that is acceptable. The ingest pipeline or Elasticsearch `_id` routing handles dedup at index time, not the CEL program.
- Field mapping or transformation: the ingest pipeline handles parsing, ECS mapping, and enrichment.
- Filtering by content: unless the API supports server-side filtering parameters, do not filter events in the CEL program. Emit everything and let the pipeline decide.

Do not search the codebase for `_id`, `document_id`, or deduplication patterns. These are not CEL concerns.

CEL program structure patterns

Pagination strategy selection

API behavior	Pattern	Key indicators
Returns total count + supports offset	Offset pagination	`total_count` , `offset` , `limit` in request/response
Returns records since a timestamp	Timestamp cursor	Time-range params, no explicit page tokens
Returns `Link` header with next URL	Link header	`Link: <url>; rel="next"` in response headers
Returns next-page URL in response body	Next-URL	`next` , `nextLink` , `@odata.nextLink` field in JSON
GraphQL with `pageInfo`	GraphQL cursor	`hasNextPage` , `endCursor` in `pageInfo` object
Multi-phase subscription/content flow	Multi-step state machine	Multiple API calls with work queues in state

Cursor timestamp selection: use the last record's timestamp when the API sorts ascending, the first record's when it sorts descending, and `max()` with a regression guard when sort order is not guaranteed.

Full code, package references, and YAML snippets for each pattern: `references/cel-pagination-patterns.md`
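The cursor timestamp rule can be modeled outside CEL to check the logic before translating it. This is a Python sketch of the selection rule only, not the CEL implementation from the reference file; `next_cursor_timestamp` is a hypothetical helper, and it assumes timestamps are ISO 8601 UTC strings in one fixed format, so that string comparison matches time order.

```python
def next_cursor_timestamp(timestamps, sort_order, previous=None):
    """Pick the timestamp to store in the cursor after one page of records.

    timestamps: record timestamps in the order the API returned them.
    sort_order: "asc", "desc", or None when the API gives no ordering guarantee.
    previous: the timestamp currently held in the cursor, if any.
    """
    if not timestamps:
        # Nothing collected: keep whatever the cursor already holds.
        return previous
    if sort_order == "asc":
        return timestamps[-1]
    if sort_order == "desc":
        return timestamps[0]
    candidate = max(timestamps)
    # Regression guard: never move the cursor backwards.
    if previous is not None and candidate < previous:
        return previous
    return candidate
```

With unordered input and a cursor already at `2024-01-03T00:00:00Z`, a page whose newest record is `2024-01-02T00:00:00Z` leaves the cursor unchanged, which is the regression guard at work.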

Authentication patterns

Three strategies: header (credentials in `state:`, passed via the `Header` map), query parameter (credentials appended to the URL via `.format_query()`), and signed query (HMAC signature computed in CEL). Config-level `auth.oauth2`, `auth.digest`, and `auth.aws` apply to all requests, including `.do_request()`; `auth.basic` and `auth.token` apply only to direct calls (`get()`, `post()`). Prefer input-level auth over in-program token fetching.

For full code examples, optional-header syntax, and config-level auth scope details, load `references/cel-auth-patterns.md`.
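For the signed-query strategy, it can help to compute the expected signature independently, for example when writing a mock rule or checking mito output. The sketch below is a Python reference computation under assumed conventions: the canonical string (sorted, URL-encoded parameters), the HMAC-SHA256 algorithm, and the `sig` parameter name are all hypothetical, since each API defines its own signing scheme.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def sign_query(params, secret):
    """Return a query string with an HMAC-SHA256 signature appended.

    Assumed scheme (hypothetical): sign the sorted, URL-encoded parameters
    and append the hex digest as a `sig` parameter.
    """
    canonical = urlencode(sorted(params.items()))
    digest = hmac.new(secret.encode(), canonical.encode(), hashlib.sha256).hexdigest()
    return canonical + "&" + urlencode({"sig": digest})


# Hypothetical parameters and test secret, for illustration only.
signed = sign_query({"ts": 1700000000, "limit": 50}, "test-secret")
```

Sorting the parameters before signing keeps the signature independent of map iteration order, which matters when comparing against a value computed elsewhere.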

State management rules

- `state.url` is populated from the `resource.url` config; it must be preserved in the output or hardcoded.
- `cursor` is the only state persisted across input restarts; store pagination positions and timestamps here.
- The `events` array is removed after each evaluation; never rely on it in subsequent runs.
- `want_more: true` triggers immediate re-evaluation, but only if `events` is non-empty. Pagination continuation guardrail: when a next-page cursor/token exists, always set `want_more: true` regardless of how many events were collected on the current page. Tying `want_more` to `size(events) > 0` stalls pagination silently: the next cursor is valid, and an empty `events` array is safe to emit. The correct pattern is `"want_more": next_cursor != ""`.
- All other state keys are retained within a session but lost on restart; use `state.with()` to propagate them automatically.
- Numbers are serialized as floats in state JSON; cast with `int()` when using them as integers.
- Optional access with `state.?cursor.last_timestamp.orValue(default)` prevents errors when the cursor is absent.
- Secrets: every sensitive field in `state` must have a corresponding `redact` entry. `state.secret` is always redacted automatically. When `secret_state` (GA 9.4.0) is available, prefer it.
- Cursor updates require a published event: the input only persists cursor updates when at least one event is published. If a program updates the cursor but returns zero events, the cursor change is lost.
- Do not duplicate request/response handling across branches: when an initialization branch (cursor creation, subscription, token exchange) and a steady-state branch both need the same fetch logic, consolidate it. Two approaches: split the init into a separate evaluation via `want_more: true` (Technique 6 Variant A), or use an intermediate result map to unify the branches within one evaluation (Variant B). Both are valid; see Technique 6 in `references/cel-code-style.md` and the init-then-steady-state pattern in `references/cel-pagination-patterns.md`.
- Nesting depth: `.as()` chain depth must not exceed 5 levels on any execution path. HTTP programs must target 2 levels inside `state.with()` (`resp` + `body`). Cursor defaults, window bounds, and page tokens must be extracted as pre-bindings before `state.with()`. Single-use values such as `int(state.batch_size)` must be inlined at the call site, not wrapped in `.as()`. Load `references/cel-code-style.md` for flattening techniques and before/after examples.
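The secrets rule above (every sensitive `state` field needs a `redact` entry) lends itself to a quick pre-commit check. The following Python sketch is a hypothetical lint, not an existing tool: `unredacted_secrets` and the `SENSITIVE_HINTS` name heuristic are assumptions, and a real check would need a list tuned to the integration's own var names.

```python
# Hypothetical heuristic: substrings that suggest a state key holds a secret.
SENSITIVE_HINTS = ("key", "secret", "token", "password")


def unredacted_secrets(state, redact_fields):
    """Return state keys that look sensitive but have no redact entry."""
    redacted = set(redact_fields)
    return sorted(
        key
        for key in state
        if key != "secret"  # state.secret is always redacted automatically
        and any(hint in key.lower() for hint in SENSITIVE_HINTS)
        and key not in redacted
    )


# Example state using literal test values only.
example_state = {
    "url": "http://localhost:8090",
    "api_key": "test-key",
    "client_secret": "test-secret",
    "secret": {"value": "test"},
    "batch_size": 50,
}
missing = unredacted_secrets(example_state, ["api_key"])
```

In this example `missing` contains `client_secret`, because `api_key` is listed under redact and the top-level `secret` key is exempt.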

Map merge and field removal

`with()`, `with_replace()`, `with_update()`, and `drop()` are general-purpose map operations: they work on any map, not just state or cursor. `with()` does a shallow merge: nested objects are replaced entirely. This makes it a tool for cursor state transitions (omitting a sub-object removes it via clobber) as well as for building request headers, transforming response data, and constructing intermediate maps. Full semantics and examples: `references/cel-code-style.md`
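The shallow-merge behavior is easy to misread, so a small model may help. This Python sketch mimics only the shallow merge described above using dict unpacking; it is not the mito implementation, `shallow_with` is a hypothetical name, and the cursor keys are illustrative.

```python
def shallow_with(base, other):
    """Model of a shallow merge: top-level keys from `other` replace those in `base`."""
    return {**base, **other}


# Illustrative state with a nested cursor holding a page sub-object.
state = {
    "url": "http://localhost:8090",
    "cursor": {
        "last_timestamp": "2024-01-01T00:00:00Z",
        "page": {"token": "abc"},
    },
}

# The new cursor omits "page"; because the merge is shallow, the whole
# nested cursor object is replaced and "page" disappears (the clobber).
merged = shallow_with(state, {"cursor": {"last_timestamp": "2024-01-02T00:00:00Z"}})
```

Top-level keys that the second map does not mention, such as `url`, carry over unchanged, while the nested `cursor` is swapped wholesale rather than merged field by field.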

Event output format

Events must contain ONLY `"message"`: `{"message": e.encode_json()}`. Do not set `@timestamp` or any other field; the framework adds `@timestamp`, and duplicates cause silent document rejection in ES 9.x. See `references/cel-idioms.md` for correct/incorrect examples.
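When checking mito output by hand, it helps to know the exact shape to expect. The Python sketch below models the event shape only, with `json.dumps` standing in for `encode_json()`; `to_events` and the sample records are hypothetical.

```python
import json


def to_events(records):
    """Wrap each raw record as an event whose only key is "message"."""
    return [{"message": json.dumps(record)} for record in records]


# Anonymized sample records, for illustration only.
sample_records = [
    {"id": 1, "action": "login", "created_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "action": "logout", "created_at": "2024-01-01T00:05:00Z"},
]
events = to_events(sample_records)
```

Each event carries the full original record as a JSON string, so field extraction and timestamp parsing remain the ingest pipeline's job.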

Response handling

`resp.Body.decode_json()`: the `bytes(resp.Body)` wrapper was required in older runtime versions but is no longer needed. Use `resp.Body.decode_json()` directly.

Error handling

Every CEL program must handle HTTP errors. Two forms:

- Single-object error (retry): `"events": {"error": {...}}` logs at ERROR, sets degraded status, and deletes the cursor so the next evaluation retries. Use when data was not collected.
- Array error (advance): `"events": [{"error": {...}}]` updates the cursor. Use when the program should advance past the error. Requires a `terminate` processor in the ingest pipeline (ES 8.16.0+).

Error message format: `"METHOD path: body-or-status"`. Code examples: `references/cel-idioms.md`
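The two error shapes differ only in whether `events` is an object or an array, which is easy to get wrong. The Python sketch below models both shapes and the message format; the helper names are hypothetical, the `error` object is reduced to a `message` field for brevity, and rendering the status as `text (code)` when the body is empty is an assumption about how "body-or-status" might be filled in.

```python
def error_message(method, path, status_code, status_text, body=""):
    """Format an error as "METHOD path: body-or-status"."""
    detail = body if body else f"{status_text} ({status_code})"
    return f"{method} {path}: {detail}"


def retry_error(message):
    """Single-object form: data was not collected, so the next evaluation retries."""
    return {"events": {"error": {"message": message}}}


def advance_error(message):
    """Array form: the program advances past the error."""
    return {"events": [{"error": {"message": message}}]}


msg = error_message("GET", "/api/v1/endpoint", 500, "Internal Server Error")
```

The real error object in a CEL program would normally carry more fields; the reference file remains the source for the full pattern.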

Placeholder events

When advancing the cursor with no real events, emit a placeholder (`[{"retry": true}]`) and add a `- drop_event.when.equals.retry: true` entry in the `processors:` section so it is discarded before indexing. Full pattern and alternatives: `references/cel-idioms.md`

Rate limiting and retry

Do NOT implement rate limiting or retry logic in the CEL program. No `rate_limit()` calls, no `"rate_limit"` state propagation, no 429-specific branches, no retry loops. These add excessive nesting and complexity for marginal benefit.

When an API has a documented rate limit, use config-only YAML settings in `cel.yml.hbs`:

```yaml
resource.rate_limit.limit: 10   # max requests per second
resource.rate_limit.burst: 5    # max burst above sustained rate
```

When custom retry behavior is needed, use config-only YAML settings:

```yaml
resource.retry.max_attempts: 5    # default: 5
resource.retry.wait_min: 1s       # default: 1s
resource.retry.wait_max: 60s      # default: 60s
```

The input framework enforces both transparently. See `references/cel-rate-limiting.md` for guidance on when to add these settings.

Type safety

Avoid `dyn()`: it defeats type checking and is rarely needed in practice.

All numbers are float64: the CEL input transmits all numbers as `float64`. Numbers >= 1e7 render in scientific notation in Elasticsearch. Convert intended-integer fields to strings in the CEL program or via the ingest pipeline. Safe integer range: [-(2^53 - 1), 2^53 - 1].
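The safe integer range follows from float64 precision, which can be checked directly. The Python sketch below relies on Python floats being IEEE 754 doubles; `is_safe_integer` and `as_id_string` are hypothetical helpers that model the string conversion recommended above, not functions from mito.

```python
MAX_SAFE_INT = 2**53 - 1


def is_safe_integer(n):
    """True when n can round-trip through float64 without losing precision."""
    return -MAX_SAFE_INT <= n <= MAX_SAFE_INT


def as_id_string(value):
    """Convert an intended-integer float to a plain decimal string."""
    if value != int(value) or not is_safe_integer(int(value)):
        raise ValueError(f"not a safe integer: {value!r}")
    return str(int(value))


# Beyond the safe range, neighbouring integers collapse to the same float64.
collapsed = float(2**53 + 1) == float(2**53)
```

Here `collapsed` is true: 2^53 + 1 has no exact float64 representation, which is why large numeric IDs are better carried as strings.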

Debugging aids

- `debug(tag, value)`: logs to `cel_debug` at DEBUG level.
- `try(expr)` / `is_error(value)`: structured error handling without crashing the program.
- `failure_dump` (GA 8.18.0): full evaluation state dump on failure. Note: dumps may contain secrets.
- `remaining_executions` (GA 9.2): how many evaluations remain in the `max_executions` budget.

Mito CLI

Mito (github.com/elastic/mito) is the local CEL evaluation CLI. A CEL program that has not been tested with mito is not acceptable. Follow the mandatory workflow at the top of this skill: mock → mito → template. Do NOT write `cel.yml.hbs` until the program passes mito validation.

For installation, CLI flags, input state structure, execution model, the full mock-first workflow steps, mito→integration mapping, and quality standards, load `references/mito-reference.md`.

Data anonymization

All data committed to the repository must be fully anonymized. This applies to default values in manifest vars (use `https://api.example.com`), example values in CEL state, mock API responses, pipeline test fixtures from CEL output, and sample payloads captured during mito prototyping.

Refer to the `anonymize-logs` skill for the full anonymization policy and placeholder conventions.

Handoff to other skills

- `integration-testing` → `references/system-testing.md` to run system tests (the mock API is already in place from CEL development)
- `integration-testing` → `references/pipeline-testing.md` to validate ingest pipeline behavior on CEL-produced events
- `create-integration` skill for overall package layout

Reference files

IMPORTANT: These reference files contain the actual working code examples and patterns. The summaries above are not sufficient to write correct CEL programs — you MUST load the relevant references before writing code.

Always load these five when building a CEL program — in this order (mock/mito before templates):

File	Contains	Load order
`references/cel-system-tests.md`	Mock API setup with elastic/stream, docker-compose config, rule format, variable-capture patterns, GraphQL mock examples, hit_count calculation, and debugging 0-hits failures	1st — you need this before writing any CEL
`references/cel-incremental-build.md`	Mandatory phased build ladder (skeleton → error handling → events → pagination → cursor), syntax anti-patterns that cause compilation failures ( `bytes()` , `parse_time()` , tuples, unbalanced parens), and debugging guidance	2nd — you MUST follow this phased approach; do not write the full program before validating a skeleton
`references/mito-reference.md`	Mito CLI flags, input state structure, mock-first workflow, translating template vars to state.json, extension library quick-reference, syntax pitfalls, testscript harness	3rd — you need this to develop and validate the program
`references/cel-template-examples.md`	Complete working `cel.yml.hbs` examples (minimal GET, paginated timestamp cursor, OAuth, GraphQL cursor) with corresponding manifest configs — these are FINAL output; do not write templates until mito passes	4th — only needed at step 5 of the workflow
`references/cel-code-style.md`	Nesting discipline: the 3-level HTTP core rule, six flattening/structuring techniques (including intermediate result maps for shared logic), shallow merge semantics, cursor namespacing with clobber, merge strategies ( `with` / `with_replace` / `with_update` ), `drop()` , and links to well-structured reference integrations — must read before writing any multi-line CEL	5th — read this before writing your CEL program so structure is right from the start

Load these based on the task:

File	Load when
`references/cel-pagination-patterns.md`	Writing any pagination logic — all 6 patterns with code
`references/cel-auth-patterns.md`	Implementing authentication — header, query param, signed, and config-level auth patterns
`references/cel-rate-limiting.md`	Rate limiting policy — config-only approach, when to add `resource.rate_limit.` and `resource.retry.` settings
`references/cel-idioms.md`	Quick-reference for common idioms, HTTP request patterns, structure conventions
`references/cel-polymorphic-patterns.md`	Choosing between pure-CEL, mito lib, and config approaches for auth, headers, rate limiting — version-tagged
`references/cel-expression.md`	Expression-specific reference: interface contract, translation framing (Python→CEL), incremental build phases, core structure, event output, error handling, pagination, state management, syntax rules, quality checklist
`references/cel-taxonomy.md`	Taxonomy classification: pagination and state management classes, least-complexity principle, mapping to skill vocabulary, how to classify from test-api.py
`references/cel-complexity-baselines.md`	Per-pattern-class complexity baselines from a ceplx survey of 316 programs, skip threshold, reviewer challenge examples, diagnostic interpretation
`references/expression-builder-subagent-guidance.md`	Subagent operating manual for the cel-expression-builder: translates test-api.py into a validated `.cel` file + taxonomy classification. Does not touch templates, manifests, or mocks.
`references/reviewer-subagent-guidance.md`	Subagent operating manual for the cel-expression-reviewer: checks generated CEL against complexity baselines and source fidelity, produces specific challenges or accepts
`references/cel-function-reference.md`	Looking up available CEL functions per extension and their first mito version
`references/builder-subagent-guidance.md`	Subagent operating manual for the cel-program-builder orchestrator: scope boundaries, skill-load sequence, the 9-step mock-first / mito-incremental workflow with mock completeness gate, delegation to cel-expression-builder, reporting contract. The orchestrator dispatches subagents by passing this file's path in the task prompt; the subagent reads it itself in its own fresh context. Do NOT embed/paste its contents into the task prompt.

See also: CEL input documentation · Mito library documentation · Mito repository · CEL language specification
