ecs-field-mappings


When to use

Use this skill when tasks include:
  • adding or modifying files under `data_stream/<stream>/fields/`
  • populating `ecs.yml` with ECS field references
  • selecting `event.kind`, `event.category`, `event.type`, and `event.outcome` values
  • choosing field `type` and mapping properties (`metric_type`, `dimension`, `multi_fields`, and related options)
  • checking whether a field already exists in ECS before adding custom fields
  • troubleshooting mapping validation/build failures from `elastic-package check`, `elastic-package lint`, or pipeline test schema checks

ECS dependency configuration

Every package needs `_dev/build/build.yml` at the package root. This file pins the ECS schema version used for field resolution.
yaml
dependencies:
  ecs:
    reference: "git@v9.3.0"
This file is required whenever the package has any field file. The scaffold does not generate it, so create it manually. If it is missing or uses an outdated version, tests report ECS fields as undefined (e.g., `field "destination.ip" is undefined`).

Field files and roles

A data stream's `fields/` directory contains a small set of YAML files with distinct responsibilities:

base-fields.yml

Fixed routing constants and `@timestamp`. All six fields are ECS fields, so each entry uses `external: ecs`. Override `type` and `value` where the data stream needs a `constant_keyword` with a fixed value; the description is inherited from ECS automatically.
yaml
- name: data_stream.type
  external: ecs
- name: data_stream.dataset
  external: ecs
- name: data_stream.namespace
  external: ecs
- name: event.module
  external: ecs
  type: constant_keyword
  value: <package_name>
- name: event.dataset
  external: ecs
  type: constant_keyword
  value: <package_name>.<stream_name>
- name: '@timestamp'
  external: ecs
Do not add other fields here. Only these routing constants and `@timestamp` belong in `base-fields.yml`.

`constant_keyword` candidates

Fields that hold a single value for every document in a data stream should use `constant_keyword`. Beyond the routing constants in `base-fields.yml`, evaluate these:

| Field | Why `constant_keyword` |
|---|---|
| `event.dataset` | One value per data stream by definition |
| `event.module` | One value per package |
| `data_stream.type` | Fixed per stream (`logs` / `metrics`) |
| `data_stream.dataset` | Fixed per stream |
| `data_stream.namespace` | Set at deployment, constant within index |
| `observer.vendor` | Package represents one vendor |
| `observer.product` | Package represents one product |

When a `constant_keyword` field is also an ECS field (e.g., `observer.vendor`), use `external: ecs` with the type override. This inherits the description from ECS and avoids manual duplication. Place the definition in the appropriate field file (`ecs.yml` for most ECS fields, `base-fields.yml` for routing constants):
yaml
- name: observer.vendor
  external: ecs
  type: constant_keyword
  value: Acme Corp
`remove_from_source` option: Because `constant_keyword` stores the value once in index metadata, it does not need to appear in every document's `_source`. Elasticsearch handles this automatically, so no explicit `_source.excludes` configuration is needed. This saves storage when the value is always the same.

ecs.yml

Populate this file with every ECS field the pipeline sets. Use only `name` and `external: ecs` for each entry: no type, no description. The type is resolved from the ECS schema via `_dev/build/build.yml`.
`external: ecs` must be used whenever a field name exists in ECS (wiki reference). This applies across field files: `ecs.yml`, `base-fields.yml`, and any file that defines an ECS field. You may override properties (e.g., `type: constant_keyword`, `value:`) while still using `external: ecs`; the description is inherited from ECS. Do not use `external: ecs` in `fields.yml`, `agent.yml`, or `beats.yml`, because those files define non-ECS fields.
yaml
- name: event.kind
  external: ecs
- name: event.category
  external: ecs
- name: event.type
  external: ecs
- name: event.outcome
  external: ecs
- name: event.action
  external: ecs
- name: source.ip
  external: ecs
- name: source.port
  external: ecs
- name: destination.ip
  external: ecs
- name: user.name
  external: ecs
- name: related.ip
  external: ecs
- name: related.user
  external: ecs
When attaching extra metadata to an ECS field (for example, making a field a TSDB dimension or a `constant_keyword` with a fixed value), combine `external: ecs` with that metadata. The description is inherited from ECS. Place the definition in `ecs.yml` (or `base-fields.yml` for routing constants):
yaml
- name: observer.vendor
  external: ecs
  type: constant_keyword
  value: Acme Corp
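The same combination applies to TSDB dimensions. As a minimal sketch, assuming a metrics data stream in which `host.name` identifies the time series (the choice of field is illustrative and not taken from a specific package):

```yaml
# ecs.yml: ECS field reference plus extra mapping metadata.
# Type and description still resolve from the pinned ECS schema.
- name: host.name
  external: ecs
  dimension: true
```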

fields.yml

Integration-specific custom (non-ECS) fields only. Use a nested `group` hierarchy for the vendor namespace:
yaml
- name: acme.firewall
  type: group
  fields:
    - name: rule_id
      type: keyword
    - name: policy_name
      type: keyword
    - name: bytes_in
      type: long
      unit: byte
      metric_type: gauge
Groups do not need to be declared as `type: object`. Defining a `group` with nested `fields` is sufficient; the object structure is implicit.

`labels.*` exception

`labels` is a core ECS object (`type: object`, `object_type: keyword`) designed for ad-hoc key-value metadata. Subkeys under `labels.*` do not require vendor namespacing; this is the one exception to the vendor-prefix rule.
Use `labels.*` for simple keyword flags or integration-internal markers (e.g., `labels.is_ioc_transform_source`). Use the vendor namespace for structured or nested data from an upstream source.

Flags vs structured data

Boolean flags and simple tags can live flat under the vendor group:
yaml
- name: acme.firewall
  type: group
  fields:
    - name: is_encrypted
      type: boolean
    - name: policy_name
      type: keyword
Structured data from the source should use sub-groups for logical hierarchy:
yaml
- name: acme.firewall
  type: group
  fields:
    - name: rule
      type: group
      fields:
        - name: id
          type: keyword
        - name: name
          type: keyword
        - name: action
          type: keyword

agent.yml

Non-ECS fields populated by the Elastic Agent or Beats framework. Include this file only when the input type emits these fields. Typical fields: `cloud.image.id`, `cloud.instance.id`, `host.containerized`, `host.os.build`, `host.os.codename`, `input.type`, `log.offset`.
See `references/root-and-core-fields.md` for full YAML samples.

beats.yml

Filebeat/Beats-specific fields not covered by ECS. The minimal form contains `input.type` and `log.offset`. Some inputs also emit `log.flags` or `log.file.*` sub-fields.
See `references/root-and-core-fields.md` for full YAML samples.
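A minimal `beats.yml` along those lines might look like the sketch below. The description strings are illustrative wording, not copied from a package; `references/root-and-core-fields.md` remains the authoritative sample.

```yaml
# beats.yml: Beats-specific, non-ECS fields, so no `external: ecs` here.
# Description text is an assumption for illustration.
- name: input.type
  type: keyword
  description: Type of Filebeat input.
- name: log.offset
  type: long
  description: Offset of the entry in the log file.
```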

ECS field selection

Prefer ECS fields whenever semantics match. If no ECS field exists for the data, add it under the package namespace in `fields.yml`.

Categorization quick reference

| Field | Type | Notes |
|---|---|---|
| `event.kind` | keyword | Highest-level classification. |
| `event.category` | keyword[] | Broad domain buckets; always an array. |
| `event.type` | keyword[] | Sub-buckets within category; always an array. |
| `event.outcome` | keyword | `success`, `failure`, `unknown`; only set when meaningful. |

Allowed values:
  • `event.kind`: `alert`, `asset`, `enrichment`, `event`, `metric`, `pipeline_error`, `signal`, `state`
  • `event.category`: `api`, `authentication`, `configuration`, `database`, `driver`, `email`, `file`, `host`, `iam`, `intrusion_detection`, `library`, `malware`, `network`, `package`, `process`, `registry`, `session`, `threat`, `vulnerability`, `web`
  • `event.type`: `access`, `admin`, `allowed`, `change`, `connection`, `creation`, `deletion`, `denied`, `device`, `end`, `error`, `group`, `indicator`, `info`, `installation`, `protocol`, `start`, `user`
Decision workflow:
  1. `event.kind`: `event` for normal logs, `metric` for measurements, `state` for snapshots, `pipeline_error` in `on_failure`
  2. `event.category`: one or more values (array) for the broad domain
  3. `event.type`: one or more values (array) for operation style
  4. `event.outcome`: only when a clear success/failure/unknown applies; omit for informational/metric events
  5. If no allowed value fits, leave the field empty; do not invent values
Use `event.action` for source-specific verbs (`blocked`, `dropped`, `authenticated`).
See `references/categorization-cheatsheet.md` for full worked examples.
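The decision workflow can be sketched for a firewall event that records a dropped connection. The source field `acme.firewall.action` and its `dropped` value are hypothetical; the categorization values come from the allowed lists above.

```yaml
# 1. Normal log event.
- set:
    field: event.kind
    value: event
# 2. Broad domain; append keeps event.category an array.
- append:
    field: event.category
    value: network
    allow_duplicates: false
# 3. Operation style; several values may apply at once.
- append:
    field: event.type
    value:
      - connection
      - denied
    allow_duplicates: false
    if: "ctx.acme?.firewall?.action == 'dropped'"
# 4. event.outcome is left unset in this sketch; set it only when the
#    source reports a clear success or failure.
# The source-specific verb goes to event.action, not event.type.
- set:
    field: event.action
    copy_from: acme.firewall.action
    ignore_empty_value: true
```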

Timestamp fields

ECS defines several timestamp fields with distinct semantics. Use them correctly:

| Field | When to use | Set by |
|---|---|---|
| `@timestamp` | The primary event timestamp. Parse from the source event data. Required. | Integration pipeline |
| `event.created` | When the event was first created or recorded by the source system, if different from `@timestamp`. | Integration pipeline |
| `event.start` | When an activity or period began (e.g., session start, connection start). | Integration pipeline |
| `event.end` | When an activity or period ended (e.g., session end, connection close). | Integration pipeline |
| `event.ingested` | When the event was ingested into Elasticsearch. | Elasticsearch (outside the integration) |

`event.ingested` must NEVER be set by an integration pipeline. It is managed automatically by Elasticsearch's final pipeline. Do not add a `set` processor for `event.ingested`.
When the source data contains multiple timestamps:
  1. Map the primary event timestamp to `@timestamp`.
  2. If another timestamp represents when the event was first recorded/created, map it to `event.created`.
  3. If timestamps represent the start or end of an activity, map them to `event.start` and `event.end`.
  4. If a timestamp does not match the semantics of any of the above, map it to a custom field under the vendor namespace with `type: date` in `fields.yml`.
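The mapping steps above might translate into `date` processors as in the sketch below. The source fields under `acme.firewall.*` and the ISO8601 format are assumptions for illustration; real sources often need other formats.

```yaml
# Step 1: primary event time -> @timestamp (the date processor's default target).
- date:
    field: acme.firewall.event_time
    formats:
      - ISO8601
    if: "ctx.acme?.firewall?.event_time != null"
# Step 2: time the source first recorded the event -> event.created.
- date:
    field: acme.firewall.logged_at
    target_field: event.created
    formats:
      - ISO8601
    if: "ctx.acme?.firewall?.logged_at != null"
# Step 3: activity boundaries -> event.start / event.end.
- date:
    field: acme.firewall.session_start
    target_field: event.start
    formats:
      - ISO8601
    if: "ctx.acme?.firewall?.session_start != null"
- date:
    field: acme.firewall.session_end
    target_field: event.end
    formats:
      - ISO8601
    if: "ctx.acme?.firewall?.session_end != null"
# No processor touches event.ingested; Elasticsearch sets it.
```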

Reusable fieldset nesting rules

Some ECS field sets must be nested under a parent entity; they are not valid at document root.
`geo`: must be nested under `client.geo`, `destination.geo`, `host.geo`, `observer.geo`, `server.geo`, `source.geo`, or `threat.indicator.geo`.
Root-level `geo.*` fields are not recognized and will appear unmapped. Always set `target_field` on the `geoip` processor:
yaml
- geoip:
    field: source.ip
    target_field: source.geo
    ignore_missing: true
`as` (Autonomous System): nested under `client.as`, `destination.as`, `server.as`, or `source.as`.
When using `geoip` for geolocation, always also perform an ASN lookup using `GeoLite2-ASN.mmdb` and rename the raw output fields to ECS names. The `geoip` ASN processor outputs `asn` and `organization_name`, which must be renamed to `as.number` and `as.organization.name`:
yaml
- geoip:
    database_file: GeoLite2-ASN.mmdb
    field: source.ip
    target_field: source.as
    properties:
      - asn
      - organization_name
    ignore_missing: true
- rename:
    field: source.as.asn
    target_field: source.as.number
    ignore_missing: true
- rename:
    field: source.as.organization_name
    target_field: source.as.organization.name
    ignore_missing: true
See the `ingest-pipelines` skill → `references/processor-cookbook.md` for the full geo+ASN pattern with both source and destination.
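For the destination side, the same two lookups can be mirrored as in the sketch below. It assumes `destination.ip` is already populated and is not a copy of the cookbook pattern referenced above.

```yaml
# Geolocation for the destination address.
- geoip:
    field: destination.ip
    target_field: destination.geo
    ignore_missing: true
# ASN lookup, then rename raw output keys to ECS names.
- geoip:
    database_file: GeoLite2-ASN.mmdb
    field: destination.ip
    target_field: destination.as
    properties:
      - asn
      - organization_name
    ignore_missing: true
- rename:
    field: destination.as.asn
    target_field: destination.as.number
    ignore_missing: true
- rename:
    field: destination.as.organization_name
    target_field: destination.as.organization.name
    ignore_missing: true
```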
`os`: nested under `host.os`, `observer.os`, or `user_agent.os`.

Nested (array-of-objects) ECS fields

Some ECS fields use `type: nested`, meaning they hold an array of objects where each object groups related sub-fields together. The pipeline must produce this structure; do not flatten these into parallel scalar arrays.
ECS fields that use `nested` type:

| Field | Contains |
|---|---|
| `email.attachments` | `file.name`, `file.size`, `file.extension`, `file.mime_type`, `file.hash.*` |
| `threat.enrichments` | `indicator.*`, `matched.*` |
| `threat.indicator.file.elf.sections` | `name`, `physical_size`, `virtual_size`, etc. |
| `threat.indicator.file.pe.sections` | `name`, `physical_size`, `virtual_size`, etc. |
| `process.elf.sections` | `name`, `physical_size`, `virtual_size`, etc. |
| `process.pe.sections` | `name`, `physical_size`, `virtual_size`, etc. |

Anti-pattern: parallel arrays (WRONG):
json
{
  "email": {
    "attachments": {
      "file": {
        "name": ["a.pdf", "b.pdf"],
        "size": [1024, 2048]
      }
    }
  }
}
This loses the association between each attachment's name and size. Queries cannot isolate individual objects.
Correct — array of objects:
json
{
  "email": {
    "attachments": [
      { "file": { "name": "a.pdf", "size": 1024 } },
      { "file": { "name": "b.pdf", "size": 2048 } }
    ]
  }
}
`ecs.yml` declaration: declare only the parent `nested` field with `external: ecs`. Child fields (`email.attachments.file.name`, etc.) inherit their types from the ECS schema, so do not redeclare them individually.
yaml
- name: email.attachments
  external: ecs
Pipeline construction: when source data delivers attachment metadata as separate parallel arrays (e.g., a list of filenames and a separate list of sizes), use a `script` processor to zip them into an array of objects. See the `ingest-pipelines` skill → `references/painless-patterns.md` for array construction patterns and `references/processor-cookbook.md` (Foreach semantics) for iterating over array elements.
yaml
- script:
    tag: build_email_attachments
    description: Build email.attachments as array of nested objects from parallel source arrays.
    lang: painless
    if: ctx.json?.file_names instanceof List && ctx.json?.file_sizes instanceof List
    source: |-
      def names = ctx.json.file_names;
      def sizes = ctx.json.file_sizes;
      int len = Math.min(names.size(), sizes.size());
      def attachments = new ArrayList(len);
      for (int i = 0; i < len; i++) {
        def attachment = new HashMap();
        def file = new HashMap();
        file.put('name', names.get(i));
        file.put('size', sizes.get(i));
        attachment.put('file', file);
        attachments.add(attachment);
      }
      ctx.email = ctx.email ?: [:];
      ctx.email.attachments = attachments;
When source data already delivers each attachment as a separate object (e.g., a JSON array of attachment objects), no zipping is needed. Use `rename`, or `set` with `copy_from`, to place the array at `email.attachments` directly.
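In that case the move can be a single processor. The sketch below assumes a hypothetical `json.attachments` array whose elements already have the `file.name` / `file.size` shape; elements with different key names would still need reshaping first.

```yaml
# Move the source array as-is; each element stays one object.
- rename:
    field: json.attachments
    target_field: email.attachments
    ignore_missing: true
```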

Custom field types

For non-ECS fields in `fields.yml`:
  • `keyword` for identifiers and exact-match strings
  • `constant_keyword` for fixed values (dataset/module constants)
  • `long`, `double`, `scaled_float` for metrics and numeric values
  • `date` / `date_nanos` for timestamps (`date_nanos` only when sub-millisecond precision is truly needed)
  • `ip` for IP addresses
  • `boolean` for true/false (avoid string booleans in pipelines)
  • `geo_point` for lat/lon coordinates
  • `group` with nested `fields` for logical structure; no need to separately declare intermediate `object` nodes
  • `flattened` for arbitrary key/value blobs with unknown keys
  • `nested` for arrays of objects requiring per-object query isolation (heavier than `group`)
  • `text` / `match_only_text` for full-text content; add a `keyword` sub-field via `multi_fields` when aggregation is also needed
Useful properties on numeric fields: `metric_type` (`gauge` or `counter`) and `unit` (e.g., `byte`, `percent`, `ms`). Use `dimension` for low-cardinality TSDB fields.
See `references/mapping-type-matrix.md` for the full type reference.
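A few of these types and properties combined in `fields.yml` might look like the sketch below. Every field name under `acme.firewall` is hypothetical, and `metric_type` and `dimension` are only meaningful on a metrics data stream.

```yaml
- name: acme.firewall
  type: group
  fields:
    # Low-cardinality identifier used as a TSDB dimension.
    - name: zone
      type: keyword
      dimension: true
    # Monotonically increasing byte count.
    - name: bytes_out
      type: long
      unit: byte
      metric_type: counter
    # Point-in-time measurement.
    - name: cpu_pct
      type: double
      unit: percent
      metric_type: gauge
    # Full-text content with a keyword sub-field for aggregations.
    - name: raw_message
      type: match_only_text
      multi_fields:
        - name: keyword
          type: keyword
    # Arbitrary key/value blob with unknown keys.
    - name: extra
      type: flattened
```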

Field naming conventions

| Rule | DO | DON'T |
|---|---|---|
| Use snake_case | `user_name`, `request_count` | `userName`, `RequestCount` |
| Use lowercase | `source_ip` | `Source_IP` |
| No asterisks in names | `network.bytes` | `network.*` (literal asterisk) |
| Use groups for hierarchy | `vendor.module.field` as nested group | `vendor.module.field` as flat dotted name |

Field names must never contain literal `*` characters. An asterisk in a field name is almost always a copy-paste error from documentation or wildcard patterns. Use a `group` with known subfields or `flattened` for dynamic keys instead.

Dotted field names vs nested groups

Both styles are valid in field files:
yaml
# Dotted (flat): common for ECS fields in ecs.yml
- name: source.ip
  external: ecs

# Nested group: common for custom fields
- name: acme.firewall
  type: group
  fields:
    - name: rule_id
      type: keyword
Pipeline expected output (`*-expected.json`) always uses nested object form regardless of how the source data represented the field. A source `"host.name": "myhost"` produces `{"host": {"name": "myhost"}}` in the output.
When source data contains literal dotted keys, processors cannot address them as nested paths until they are expanded into objects. Use `dot_expander` to expand them:
yaml
- dot_expander:
    field: "*"
    override: true

geo_point field handling

In pipeline test expected outputs, `geo_point` fields appear as objects with `lat` and `lon` keys:
json
"source": {
  "geo": {
    "location": { "lat": 51.5142, "lon": -0.0931 },
    "city_name": "London",
    "country_iso_code": "GB"
  }
}
The `lat` and `lon` keys do not need entries in `fields.yml`; they are part of the `geo_point` type mapping. For non-standard parent prefixes where `ecs@mappings` does not apply, only the `*.geo.location` field (type `geo_point`) needs to be declared in `ecs.yml`.

Common pipeline categorization patterns

Web access

yaml
- set:
    field: event.kind
    value: event
- append:
    field: event.category
    value: web
- append:
    field: event.type
    value: access

Outcome from HTTP status

yaml
- set:
    field: event.outcome
    value: success
    if: "ctx?.http?.response?.status_code != null && ctx.http.response.status_code < 400"
- set:
    field: event.outcome
    value: failure
    if: "ctx?.http?.response?.status_code != null && ctx.http.response.status_code >= 400"

Pipeline error fallback

yaml
on_failure:
  - set:
      field: event.kind
      value: pipeline_error

Troubleshooting: "field X is undefined" for ECS fields

When tests report `field "destination.ip" is undefined` for standard ECS fields:
  1. Check that `_dev/build/build.yml` exists at the package root
  2. Check that `dependencies.ecs.reference` is set (use `git@v9.3.0`)
  3. Check that the field is listed in `ecs.yml` with `external: ecs`
Fix the root cause. Do not work around it by:
  • Adding ECS fields with full type definitions to `fields.yml` without `external: ecs`
  • Skipping `external: ecs` and defining ECS field types/descriptions manually
Exception: Custom (non-ECS) fields reported as undefined must be defined in `fields.yml`.

Common failure patterns

  • Missing `_dev/build/build.yml`: all ECS fields are reported undefined; create it with `dependencies.ecs.reference`
  • Outdated ECS version in `build.yml`: fields from newer ECS versions are undefined; update the reference to `git@v9.3.0`
  • ECS field set in pipeline but missing from `ecs.yml`: the field is undefined in test schema validation; add it to `ecs.yml`
  • ECS field defined without `external: ecs`: descriptions and types diverge from ECS; always use `external: ecs` for ECS fields, with overrides as needed
  • `metric_type` on non-numeric field: lint error
  • `geo.*` at document root: unmapped; always nest under a parent entity
  • `event.category` or `event.type` set as scalar: must use the `append` processor, not `set`
  • `nested` ECS field mapped as parallel arrays: `email.attachments`, `threat.enrichments`, and similar `nested` fields must be arrays of objects, not objects with parallel scalar arrays; see the Nested (array-of-objects) ECS fields section above

Validation loop

bash
elastic-package lint
elastic-package check
elastic-package test pipeline --data-streams <stream>

References

  • references/mapping-type-matrix.md
  • references/categorization-cheatsheet.md
  • references/root-and-core-fields.md
  • references/fieldset-links.md
  • ECS field reference