cel-programs

When to use

Use this skill when tasks include:
  • creating or editing cel.yml.hbs agent stream templates
  • configuring data stream manifests for the cel input type
  • writing CEL programs with pagination, cursor management, or authentication
  • testing or debugging a CEL program locally with mito
  • setting up system tests with mock APIs for CEL-based data streams
  • prototyping a new CEL-based data stream's collection logic
  • any CEL or mito question, regardless of context

When not to use

Do not use this skill as the primary guide for:
  • ingest pipeline processor design (ingest-pipelines)
  • ECS field mapping (ecs-field-mappings)
  • package scaffolding (create-integration)
  • system test execution with the Elastic stack (integration-testing, references/system-testing.md)

Mandatory workflow — mock → mito → template

This is not a suggestion. Every CEL program MUST be developed in this order. The subagent must not write cel.yml.hbs until the CEL program has been validated with mito against a running mock. Skipping or reordering steps causes failures that are hard to debug.
Do NOT write more than ~10–15 new lines of CEL before running mito. Build the program incrementally in phases (skeleton → error handling → event mapping → pagination → cursor guard), validating with mito after each phase. Writing a large program in one shot leads to cascading compilation errors that are extremely hard to debug. Follow the phased approach in references/cel-incremental-build.md.
Steps, each with its action and output:
  1. Create the system test mock
     Action: Write the elastic/stream config at _dev/deploy/docker/files/config-<stream>.yml with rules matching all API endpoints. Write test-default-config.yml.
     Output: Mock config file, docker-compose service, test config
  2. Start the mock locally
     Action: stream http-server --addr=:8090 --config=...
     Output: Running mock at http://localhost:8090
  3. Create a plain .cel file and state.json
     Action: Write the CEL program as a standalone .cel file. Create state.json with the same keys the future state: block will contain, but with literal test values instead of Handlebars. Point url at the local mock.
     Output: program.cel, state.json in /tmp or working dir
  4. Run mito and iterate
     Action: Build incrementally per references/cel-incremental-build.md: Phase 0 skeleton → Phase 1 error handling → Phase 2 events → Phase 3 pagination → Phase 4 cursor. Run mito -data state.json -log_requests program.cel after each phase. Do not proceed until mito output is correct.
     Output: Validated CEL program
  5. ONLY THEN write cel.yml.hbs
     Action: Copy the working CEL expression into program: | in the Handlebars template. Replace literal test values with {{var}} references. Configure manifests.
     Output: Final integration template
Step 3 detail (translating template vars to mito state): When the future cel.yml.hbs will have a state: block like api_key: {{api_key}} and batch_size: {{batch_size}}, the state.json for mito testing uses the same key names with literal test values:
json
{
  "url": "http://localhost:8090",
  "api_key": "test-key",
  "batch_size": 50,
  "initial_interval": "24h"
}
This mirrors the runtime state the CEL input would provide. Add cursor to test subsequent-run behavior.
For the full mock-first workflow details, CLI flags, execution model, and quality standards, load references/mito-reference.md.
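As a rough sketch of step 3, the snippet below builds the mito input state from literal test values and prints it as JSON. It is a Python helper for illustration only, not part of mito; the function name build_state and the example cursor timestamp are assumptions, while the keys and values match the state.json shown above.

```python
import json


def build_state(cursor=None):
    """Return a mito input state with literal test values (no Handlebars)."""
    state = {
        "url": "http://localhost:8090",  # local mock, not the real API
        "api_key": "test-key",
        "batch_size": 50,
        "initial_interval": "24h",
    }
    if cursor is not None:
        # Adding a cursor simulates a subsequent run of the program.
        state["cursor"] = cursor
    return state


first_run = build_state()
# Hypothetical cursor value, used only to exercise the subsequent-run path.
later_run = build_state(cursor={"last_timestamp": "2024-01-01T00:00:00Z"})

print(json.dumps(first_run, indent=2))
```

Redirecting the printed JSON into state.json gives a file that can be passed to mito with -data, and the later_run variant covers the case where a cursor already exists.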

cel.yml.hbs template anatomy

The cel.yml.hbs file at data_stream/<stream>/agent/stream/cel.yml.hbs is a Handlebars template that renders the final CEL input configuration. It has these sections in order:
yaml
interval: {{interval}}
resource.tracer:
  enabled: {{enable_request_tracer}}
  filename: "../../logs/cel/http-request-trace-*.ndjson"
  maxbackups: 5
{{#if proxy_url}}
resource.proxy_url: {{proxy_url}}
{{/if}}
{{#if ssl}}
resource.ssl: {{ssl}}
{{/if}}
{{#if http_client_timeout}}
resource.timeout: {{http_client_timeout}}
{{/if}}
resource.url: <constructed from vars>
state:
  <credentials and pagination config from vars>
redact:
  fields:
    - <sensitive state keys>
max_executions: <number, for heavy pagination>
program: |
  <CEL expression>
tags:
{{#if preserve_original_event}}
  - preserve_original_event
{{/if}}
{{#each tags as |tag|}}
  - {{tag}}
{{/each}}
{{#contains "forwarded" tags}}
publisher_pipeline.disable_host: true
{{/contains}}
{{#if processors}}
processors:
{{processors}}
{{/if}}

Handlebars patterns

Each pattern and its purpose:
  • {{var_name}}: direct variable substitution
  • {{#if var_name}}...{{/if}}: conditional block for optional config
  • {{#each tags as |tag|}}: iteration over list vars
  • {{#contains "forwarded" tags}}: check whether a list contains a value

Key template fields

  • resource.url is the base URL, often constructed from multiple vars (e.g., {{url}}/api/v1/endpoint).
  • resource.headers (ga 8.18.1) holds static headers that are the same for every request (Content-Type, Accept, API version headers). Set them here rather than in-program when headers never vary. They are applied before auth headers.
  • state: is the block where manifest vars are injected as CEL state; credentials and pagination settings go here.
  • redact.fields lists the state keys that contain secrets, so they are redacted from debug logs.
  • max_executions overrides the default of 1000 for integrations with heavy pagination (e.g., 5000).
  • program: | holds the CEL expression; it must be a YAML literal block scalar.

Do NOT set data_stream.dataset in integration packages

Integration packages (type: integration) must never include data_stream.dataset in cel.yml.hbs or define a data_stream.dataset manifest var. The framework automatically routes documents to the correct data stream. Setting data_stream.dataset overrides this routing and causes documents to land in the wrong index, typically resulting in "0 hits" during system tests.
Only input-type packages (type: input) use data_stream.dataset because they have no predefined data streams.

Data stream manifest configuration

The data stream manifest.yml defines the CEL input stream and its variables.

Standard vars every CEL stream should include

| Var | Type | Purpose |
|---|---|---|
| url | text | API base URL |
| interval | text | Polling interval (e.g., 5m) |
| initial_interval | text | Lookback window on first run (e.g., 24h) |
| enable_request_tracer | bool | Enable HTTP request tracing |
| http_client_timeout | text | Request timeout (e.g., 30s) |
| proxy_url | text | HTTP proxy URL |
| ssl | yaml | TLS configuration |
| tags | text (multi) | Event tags |
| preserve_original_event | bool | Keep original event |
| processors | yaml | Beat processors |

Auth-specific vars depend on the API (API key, OAuth client_id/secret/token_url, bearer token, etc.).
Declare enable_request_tracer in the data stream manifest, not at the input level. Input-level tracing enables logging for all data streams in the policy.

Package-level vs data-stream-level vars

  • Package-level vars in the root manifest.yml under policy_templates[].inputs[].vars: shared across streams (e.g., url, auth credentials)
  • Data-stream-level vars in data_stream/<stream>/manifest.yml under streams[].vars: stream-specific (e.g., interval, batch_size, initial_interval)

Scope of the CEL program

The CEL program's responsibility is data collection only:
  1. Fetch data from the API endpoint(s)
  2. Handle pagination: walk through all pages within a single polling cycle
  3. Manage cursor state: store timestamps or page tokens in cursor so the next polling interval resumes where the last one left off, avoiding re-collection of already-fetched events
  4. Emit raw events: output {"message": e.encode_json()} for each record
The CEL program does not handle:
  • Elasticsearch-level deduplication: if overlapping time windows cause a few duplicate events to be collected, that is acceptable. The ingest pipeline or Elasticsearch _id routing handles dedup at index time, not the CEL program.
  • Field mapping or transformation: the ingest pipeline handles parsing, ECS mapping, and enrichment.
  • Filtering by content: unless the API supports server-side filtering parameters, do not filter events in the CEL program. Emit everything and let the pipeline decide.
Do not search the codebase for _id, document_id, or deduplication patterns. These are not CEL concerns.

CEL program structure patterns

Pagination strategy selection

| API behavior | Pattern | Key indicators |
|---|---|---|
| Returns total count + supports offset | Offset pagination | total_count, offset, limit in request/response |
| Returns records since a timestamp | Timestamp cursor | Time-range params, no explicit page tokens |
| Returns Link header with next URL | Link header | Link: <url>; rel="next" in response headers |
| Returns next-page URL in response body | Next-URL | next, nextLink, @odata.nextLink field in JSON |
| GraphQL with pageInfo | GraphQL cursor | hasNextPage, endCursor in pageInfo object |
| Multi-phase subscription/content flow | Multi-step state machine | Multiple API calls with work queues in state |

Cursor timestamp selection: use the last record's timestamp when the API sorts ascending, the first record's when it sorts descending, and max() with a regression guard when sort order is not guaranteed.
Full code, package references, and YAML snippets for each pattern: references/cel-pagination-patterns.md.
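The cursor timestamp selection rule can be modeled in a few lines of Python. This is a sketch of the decision logic only, not CEL and not code from the reference files; the function name, the sort_order labels, and the use of same-format ISO 8601 UTC strings (which compare correctly as plain strings) are assumptions made for illustration.

```python
def next_cursor_timestamp(timestamps, sort_order, previous=None):
    """Pick the timestamp to store in the cursor.

    timestamps: ISO 8601 UTC strings, in the order the API returned them.
    sort_order: "asc", "desc", or "unknown" (hypothetical labels).
    previous:   timestamp currently stored in the cursor, if any.
    """
    if not timestamps:
        return previous  # nothing new on this page: keep the old cursor
    if sort_order == "asc":
        return timestamps[-1]  # newest record is last
    if sort_order == "desc":
        return timestamps[0]  # newest record is first
    candidate = max(timestamps)  # sort order not guaranteed
    # Regression guard: never move the cursor backwards.
    if previous is not None and candidate < previous:
        return previous
    return candidate
```

In the unknown-order case the guard matters when a page contains only records older than the stored cursor: without it the next poll would re-collect a window that was already fetched.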

Authentication patterns

Three strategies: header (credentials in state:, passed via the Header map), query parameter (credentials appended to the URL via .format_query()), and signed query (HMAC signature computed in CEL). Config-level auth.oauth2 / auth.digest / auth.aws applies to all requests, including .do_request(); auth.basic / auth.token applies only to direct calls (get(), post()). Prefer input-level auth over in-program token fetching.
For full code examples, optional-header syntax, and config-level auth scope details, load references/cel-auth-patterns.md.
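To make the signed-query strategy concrete, here is a minimal Python sketch of what such a signature typically involves. The scheme itself is an assumption: sorting the parameters, using HMAC-SHA256, hex encoding, and the parameter name signature are all hypothetical, and every API documents its own canonicalization rules. It is useful mainly as a reference implementation to compare against when prototyping the CEL version.

```python
import hashlib
import hmac
from urllib.parse import urlencode


def signed_query(params, secret):
    """Return a query string with an HMAC-SHA256 signature appended.

    Assumed scheme: sign the sorted, URL-encoded parameters and append the
    hex digest as a "signature" parameter (hypothetical name).
    """
    canonical = urlencode(sorted(params.items()))
    digest = hmac.new(secret.encode(), canonical.encode(), hashlib.sha256)
    return f"{canonical}&signature={digest.hexdigest()}"


# Test values only; real credentials belong in state with a redact entry.
query = signed_query({"limit": "50", "from": "2024-01-01T00:00:00Z"}, "test-key")
```

Because the parameters are sorted before signing, the same inputs always produce the same query string, which makes the output easy to assert against in a mock rule.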

State management rules

  1. state.url is populated from the resource.url config; it must be preserved in the output or hardcoded.
  2. cursor is the only state persisted across input restarts; store pagination positions and timestamps here.
  3. The events array is removed after each evaluation; never rely on it in subsequent runs.
  4. want_more: true triggers immediate re-evaluation, but only if events is non-empty. Pagination continuation guardrail: when a next-page cursor/token exists, always set want_more: true regardless of how many events were collected on the current page. Tying want_more to size(events) > 0 stalls pagination silently: the next cursor is valid, and an empty events array is safe to emit. The correct pattern is "want_more": next_cursor != "".
  5. All other state keys are retained within a session but lost on restart; use state.with() to propagate them automatically.
  6. Numbers are serialized as floats in state JSON; cast with int() when using them as integers.
  7. Optional access with state.?cursor.last_timestamp.orValue(default) prevents errors when cursor is absent.
  8. Secrets: every sensitive field in state must have a corresponding redact entry. state.secret is always redacted automatically. When secret_state (ga 9.4.0) is available, prefer it.
  9. Cursor updates require a published event: the input only persists cursor updates when at least one event is published. If a program updates the cursor but returns zero events, the cursor change is lost.
  10. Do not duplicate request/response handling across branches: when an initialization branch (cursor creation, subscription, token exchange) and a steady-state branch both need the same fetch logic, consolidate it. Two approaches: split the init into a separate evaluation via want_more: true (Technique 6 Variant A), or use an intermediate result map to unify the branches within one evaluation (Variant B). Both are valid; see references/cel-code-style.md Technique 6 and the init-then-steady-state pattern in references/cel-pagination-patterns.md.
  11. Nesting depth: .as() chain depth must not exceed 5 levels on any execution path. HTTP programs must target 2 levels inside state.with() (resp + body). Cursor defaults, window bounds, and page tokens must be extracted as pre-bindings before state.with(). Single-use values such as int(state.batch_size) must be inlined at the call site, not wrapped in .as(). Load references/cel-code-style.md for flattening techniques and before/after examples.
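The pagination continuation guardrail in rule 4 is easy to get wrong, so here is a small Python model of it. This is a simplification and not how the input schedules evaluations: it only models the program's own want_more decision over a set of mock pages. The page layout, the page_token cursor key, and the function names are assumptions for illustration.

```python
# Mock API pages keyed by page token; "next" is the next-page token.
PAGES = {
    "p1": {"items": ["a", "b"], "next": "p2"},
    "p2": {"items": [], "next": "p3"},  # empty page with a valid next token
    "p3": {"items": ["c"], "next": ""},
}


def evaluate(page, naive=False):
    """Model one evaluation: map a mock page to the program's output map."""
    events = [{"message": item} for item in page["items"]]
    next_cursor = page.get("next", "")
    if naive:
        want_more = len(events) > 0  # stalls as soon as a page is empty
    else:
        want_more = next_cursor != ""  # pattern recommended in rule 4
    return {
        "events": events,
        "cursor": {"page_token": next_cursor},
        "want_more": want_more,
    }


def collect(pages, naive=False):
    """Follow page tokens for as long as the program asks for more."""
    token, collected = "p1", []
    while token:
        result = evaluate(pages[token], naive)
        collected.extend(event["message"] for event in result["events"])
        token = result["cursor"]["page_token"] if result["want_more"] else ""
    return collected
```

With the recommended guard, collect(PAGES) walks all three pages; with naive=True it stops at the empty second page and never reaches the record on the third.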

Map merge and field removal

with(), with_replace(), with_update(), and drop() are general-purpose map operations: they work on any map, not just state or cursor. with() does a shallow merge: nested objects are replaced entirely. This makes it a tool for cursor state transitions (omitting a sub-object removes it via clobber) as well as for building request headers, transforming response data, and constructing intermediate maps. Full semantics and examples: references/cel-code-style.md.
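The shallow-merge behavior described for with() can be illustrated with plain Python dicts. This is a model of the semantics as stated above, not the mito implementation; the helper name with_ and the cursor layout are hypothetical.

```python
def with_(base, other):
    """Shallow merge: top-level keys in `other` replace those in `base`."""
    merged = dict(base)
    merged.update(other)
    return merged


cursor = {
    "page": {"token": "abc", "offset": 100},
    "last_timestamp": "2024-01-01T00:00:00Z",
}

# The nested "page" map is replaced as a whole, so "offset" is dropped
# (removal via clobber), while the untouched top-level key survives.
updated = with_(cursor, {"page": {"token": "def"}})
```

The same property is what makes a cursor state transition cheap to express: supplying a new sub-object that omits a field is enough to clear that field.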

Event output format

Events must contain ONLY "message": {"message": e.encode_json()}. Do not set @timestamp or any other field; the framework adds @timestamp, and duplicates cause silent document rejection in ES 9.x. See references/cel-idioms.md for correct/incorrect examples.

Response handling

resp.Body.decode_json(): the bytes(resp.Body) wrapper was required in older runtime versions but is no longer needed. Use resp.Body.decode_json() directly.

Error handling

Every CEL program must handle HTTP errors. Two forms:
  • Single-object error (retry): "events": {"error": {...}} logs at ERROR, sets degraded status, and deletes the cursor so the next evaluation retries. Use when data was not collected.
  • Array error (advance): "events": [{"error": {...}}] updates the cursor. Use when the program should advance past the error. Requires a terminate processor in the ingest pipeline (ES 8.16.0+).
Error message format: "METHOD path: body-or-status". Code examples: references/cel-idioms.md.
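The difference between the two error forms is only the shape of events, which a short Python model makes visible. This is a sketch, not CEL: the function names are hypothetical, and the single message key inside the error map is an assumption, since the text above elides the error map's contents.

```python
def error_message(method, path, status, body=""):
    """Format an error as "METHOD path: body-or-status"."""
    return f"{method} {path}: {body if body else status}"


def retry_error(method, path, status, body=""):
    """Single-object form: events is a map, so the evaluation is retried."""
    return {
        # Assumed error field; the real map may carry more keys.
        "events": {"error": {"message": error_message(method, path, status, body)}},
        "want_more": False,
    }


def advance_error(method, path, status, cursor, body=""):
    """Array form: events is a list, so the cursor update is kept."""
    return {
        "events": [{"error": {"message": error_message(method, path, status, body)}}],
        "cursor": cursor,
        "want_more": False,
    }
```

Choosing between them comes down to whether the failed request's data still needs to be collected: retry when it does, advance when the program should move past it.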

Placeholder events

When advancing the cursor with no real events, emit a placeholder ([{"retry": true}]) and add a - drop_event.when.equals.retry: true entry in the processors: section so it is discarded before indexing. Full pattern and alternatives: references/cel-idioms.md.

Rate limiting and retry

Do NOT implement rate limiting or retry logic in the CEL program. No rate_limit() calls, no "rate_limit" state propagation, no 429-specific branches, no retry loops. These add excessive nesting and complexity for marginal benefit.
When an API has a documented rate limit, use config-only YAML settings in cel.yml.hbs:
yaml
resource.rate_limit.limit: 10   # max requests per second
resource.rate_limit.burst: 5    # max burst above sustained rate
When custom retry behavior is needed, use config-only YAML settings:
yaml
resource.retry.max_attempts: 5    # default: 5
resource.retry.wait_min: 1s       # default: 1s
resource.retry.wait_max: 60s      # default: 60s
The input framework enforces both transparently. See references/cel-rate-limiting.md for guidance on when to add these settings.

Type safety

Avoid dyn(): it defeats type checking and is rarely needed in practice.
All numbers are float64: the CEL input transmits all numbers as float64. Numbers >=1e7 render in scientific notation in Elasticsearch. Convert intended-integer fields to strings in the CEL program or via the ingest pipeline. Safe integer range: [-(2^53 - 1), 2^53 - 1].
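The safe integer range follows from how float64 stores integers, which the Python sketch below demonstrates. It illustrates the arithmetic only; the helper names are hypothetical, and the string conversion stands in for whatever conversion the CEL program or ingest pipeline performs.

```python
MAX_SAFE = 2**53 - 1  # upper bound of the safe integer range quoted above


def is_safe_integer(n):
    """True when n lies inside [-(2^53 - 1), 2^53 - 1]."""
    return -MAX_SAFE <= n <= MAX_SAFE


def as_string_id(value):
    """Render an intended-integer field as a string of digits."""
    return str(int(value))


big_id = 2**53 + 1
# Above the safe range, neighbouring integers collapse to the same float.
lossy = int(float(big_id))
```

Here lossy no longer equals big_id, which is why large identifiers should be carried as strings rather than numbers once they leave the API response.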

Debugging aids

  • debug(tag, value): logs to cel_debug at DEBUG level.
  • try(expr) / is_error(value): structured error handling without crashing the program.
  • failure_dump (ga 8.18.0): full evaluation state dump on failure. Note: dumps may contain secrets.
  • remaining_executions (ga 9.2): how many evaluations remain in the max_executions budget.

Mito CLI

Mito (github.com/elastic/mito) is the local CEL evaluation CLI. A CEL program that has not been tested with mito is not acceptable. Follow the mandatory workflow at the top of this skill: mock → mito → template. Do NOT write cel.yml.hbs until the program passes mito validation.
For installation, CLI flags, input state structure, execution model, the full mock-first workflow steps, mito→integration mapping, and quality standards, load references/mito-reference.md.

Data anonymization

All data committed to the repository must be fully anonymized. This applies to default values in manifest vars (use https://api.example.com), example values in CEL state, mock API responses, pipeline test fixtures from CEL output, and sample payloads captured during mito prototyping.
Refer to the anonymize-logs skill for the full anonymization policy and placeholder conventions.

Handoff to other skills

  • integration-testing references/system-testing.md to run system tests (the mock API is already in place from CEL development)
  • integration-testing references/pipeline-testing.md to validate ingest pipeline behavior on CEL-produced events
  • create-integration skill for overall package layout

Reference files

IMPORTANT: These reference files contain the actual working code examples and patterns. The summaries above are not sufficient to write correct CEL programs; you MUST load the relevant references before writing code.
Always load these five when building a CEL program, in this order (mock/mito before templates):

| File | Contains | Load order |
|---|---|---|
| references/cel-system-tests.md | Mock API setup with elastic/stream, docker-compose config, rule format, variable-capture patterns, GraphQL mock examples, hit_count calculation, and debugging 0-hits failures | 1st: you need this before writing any CEL |
| references/cel-incremental-build.md | Mandatory phased build ladder (skeleton → error handling → events → pagination → cursor), syntax anti-patterns that cause compilation failures (bytes(), parse_time(), tuples, unbalanced parens), and debugging guidance | 2nd: you MUST follow this phased approach; do not write the full program before validating a skeleton |
| references/mito-reference.md | Mito CLI flags, input state structure, mock-first workflow, translating template vars to state.json, extension library quick-reference, syntax pitfalls, testscript harness | 3rd: you need this to develop and validate the program |
| references/cel-template-examples.md | Complete working cel.yml.hbs examples (minimal GET, paginated timestamp cursor, OAuth, GraphQL cursor) with corresponding manifest configs. These are FINAL output; do not write templates until mito passes | 4th: only needed at step 5 of the workflow |
| references/cel-code-style.md | Nesting discipline: the 3-level HTTP core rule, six flattening/structuring techniques (including intermediate result maps for shared logic), shallow merge semantics, cursor namespacing with clobber, merge strategies (with / with_replace / with_update), drop(), and links to well-structured reference integrations. Must read before writing any multi-line CEL | 5th: read this before writing your CEL program so structure is right from the start |

Load these based on the task:

| File | Load when |
|---|---|
| references/cel-pagination-patterns.md | Writing any pagination logic: all 6 patterns with code |
| references/cel-auth-patterns.md | Implementing authentication: header, query param, signed, and config-level auth patterns |
| references/cel-rate-limiting.md | Rate limiting policy: config-only approach, when to add resource.rate_limit.* and resource.retry.* settings |
| references/cel-idioms.md | Quick-reference for common idioms, HTTP request patterns, structure conventions |
| references/cel-polymorphic-patterns.md | Choosing between pure-CEL, mito lib, and config approaches for auth, headers, rate limiting (version-tagged) |
| references/cel-expression.md | Expression-specific reference: interface contract, translation framing (Python→CEL), incremental build phases, core structure, event output, error handling, pagination, state management, syntax rules, quality checklist |
| references/cel-taxonomy.md | Taxonomy classification: pagination and state management classes, least-complexity principle, mapping to skill vocabulary, how to classify from test-api.py |
| references/cel-complexity-baselines.md | Per-pattern-class complexity baselines from a ceplx survey of 316 programs, skip threshold, reviewer challenge examples, diagnostic interpretation |
| references/expression-builder-subagent-guidance.md | Subagent operating manual for the cel-expression-builder: translates test-api.py into a validated .cel file + taxonomy classification. Does not touch templates, manifests, or mocks. |
| references/reviewer-subagent-guidance.md | Subagent operating manual for the cel-expression-reviewer: checks generated CEL against complexity baselines and source fidelity, produces specific challenges or accepts |
| references/cel-function-reference.md | Looking up available CEL functions per extension and their first mito version |
| references/builder-subagent-guidance.md | Subagent operating manual for the cel-program-builder orchestrator: scope boundaries, skill-load sequence, the 9-step mock-first / mito-incremental workflow with mock completeness gate, delegation to cel-expression-builder, reporting contract. The orchestrator dispatches subagents by passing this file's path in the task prompt; the subagent reads it itself in its own fresh context. Do NOT embed/paste its contents into the task prompt. |