cx-telemetry-querying
Telemetry Querying Skill
Use this skill as the entry point for any investigation, debugging, or data question that may be answered from telemetry data. This skill helps you decide where the relevant signal lives (metrics, logs, traces, RUM, APM) before diving into queries, then delegates to specialized skills for deep exploration.
Core Principle
Decide where to look before querying. Telemetry data is spread across multiple pillars. Choosing the right source first saves time and yields better answers.
Safety
All query commands (`cx logs`, `cx spans`, `cx metrics`, `cx dataprime`, `cx search-fields`) are read-only and work in `--read-only` mode. They never modify data and can be run freely without `--yes`.
When running inside an AI agent, read commands are unaffected by agent mode detection - no confirmation is needed for queries.
Quick Routing Guide
Use this table for obvious cases where one pillar is the clear first choice:
| Question Type | First Choice | Fallback |
|---|---|---|
| UI behavior, page load, frontend errors | RUM | Traces (if backend-related) |
| Endpoint latency, throughput, error rates | Metrics | Traces (for per-request detail) |
| Service-to-service dependencies, request flow | Traces | Logs (for debug output) |
| Specific error messages, stack traces | Logs | Traces (for request context) |
| Infrastructure health (CPU, memory, disk) | Metrics | - |
| Business events (purchases, signups) | Depends - see Discovery Workflow | - |
For ambiguous questions (e.g., "How much money did users spend last week?"), the signal could live in any pillar. Follow the Discovery Workflow below.
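The routing table above can be written as a small lookup, which may help when an agent needs a deterministic first guess. This is only an illustrative sketch: `route_pillar` and its category keywords (`frontend`, `latency`, and so on) are hypothetical names, not part of the cx CLI. Only the pillar names come from the table.

```bash
# Hypothetical helper: print "<first-choice> <fallback>" for a question category.
# Category keywords are illustrative labels; pillar names follow the routing table.
route_pillar() {
  case "$1" in
    frontend)       echo "rum traces" ;;      # UI behavior, page load, frontend errors
    latency)        echo "metrics traces" ;;  # endpoint latency, throughput, error rates
    dependencies)   echo "traces logs" ;;     # service-to-service request flow
    errors)         echo "logs traces" ;;     # specific error messages, stack traces
    infrastructure) echo "metrics -" ;;       # CPU, memory, disk; no fallback listed
    *)              echo "discover -" ;;      # ambiguous: follow the Discovery Workflow
  esac
}
```

Any category the function does not recognize, such as a business question, falls through to `discover`, mirroring the guidance to run discovery rather than guess.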
Discovery Workflow
When the answer could reside in multiple pillars, run discovery in parallel to find the best source.
Step 1: Search Metrics
Check if a relevant metric exists:
```bash
cx metrics search --name '*transaction*'
cx metrics search --name '*payment*'
cx metrics search --name '*revenue*'
cx metrics search --description "total purchase amount"
```
If a matching metric is found, continue with the `cx-metrics-query` skill.
Step 2: Search Log and Span Fields
Use semantic field search to find relevant DataPrime paths:
```bash
cx search-fields "transaction amount" --dataset logs
cx search-fields "payment total" --dataset spans
cx search-fields "purchase value" --dataset logs --limit 10
```
Requirements: `cx search-fields` needs a Coralogix API key or OAuth on the active profile. If credentials are missing, prompt the user to run `cx profiles add`.
If matching fields are found:
- For logs: continue with the `cx-query-logs` skill using DataPrime
- For spans: continue with the `cx-query-spans` skill
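Since the workflow asks for discovery to run in parallel, the Step 1 and Step 2 searches can be launched as background jobs and collected once they all finish. The sketch below is one possible arrangement, not a prescribed one: `discover_parallel` and the result file names are hypothetical, while the `cx` commands and flags are the ones already shown in Steps 1 and 2.

```bash
# Sketch: run metric, log-field, and span-field discovery concurrently.
# discover_parallel and the output file names are hypothetical helpers.
discover_parallel() {
  out_dir="$1"   # directory that receives one result file per search
  mkdir -p "$out_dir"
  cx metrics search --name '*transaction*' > "$out_dir/metrics.txt" 2>&1 &
  cx search-fields "transaction amount" --dataset logs > "$out_dir/logs.txt" 2>&1 &
  cx search-fields "payment total" --dataset spans > "$out_dir/spans.txt" 2>&1 &
  wait   # block until all three background searches have finished
}
```

After `discover_parallel ./discovery` returns, compare the three files and continue with whichever pillar shows the clearest match.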
Step 3: Search the Codebase
When discovery results are ambiguous or you need to validate what a metric/field actually represents, search the codebase:
- Look for metric registration code (e.g., `prometheus.NewCounter`, `metrics.record`)
- Look for log statements that emit the field (e.g., `logger.info("transaction", ...)`)
- Look for span attributes (e.g., `span.setAttribute("purchase.amount", ...)`)
This confirms the semantic meaning and helps you choose the right pillar.
Step 4: Choose and Query
Based on discovery results, pick the pillar with the clearest signal and delegate to the appropriate skill:
| Pillar | Skill to Use |
|---|---|
| Metrics | `cx-metrics-query` |
| Logs | `cx-query-logs` |
| Traces/Spans | `cx-query-spans` |
| RUM | `cx-rum` |
| APM | APM-specific guidance |
Fallback and Pivoting
If your initial route yields no results, pivot to another pillar.
Example pivot paths:
- Metrics empty → try traces (per-request data) or logs (event records)
- Logs empty → try traces (structured span attributes) or metrics (aggregated counters)
- Traces empty → try logs (text-based debug output)
Do not stop after one failed attempt. Try at least two pillars before concluding the data does not exist.
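The pivot rule can be expressed as a loop that tries each pillar in turn and stops at the first one that returns data. This is a minimal sketch under stated assumptions: `try_pillars` is a hypothetical helper, and each argument is assumed to be the name of a function or command you supply that prints results for one pillar and prints nothing when that pillar is empty.

```bash
# Sketch: try pillars in order; stop at the first one that prints any data.
# try_pillars is hypothetical; arguments are caller-supplied query wrappers.
try_pillars() {
  for query in "$@"; do
    result="$("$query" 2>/dev/null)"
    if [ -n "$result" ]; then
      echo "source: $query"
      echo "$result"
      return 0
    fi
  done
  # Every pillar came back empty; report which ones were tried.
  echo "no data found in: $*" >&2
  return 1
}
```

Passing at least two wrappers, for example `try_pillars query_metrics query_traces query_logs` (all three names hypothetical), keeps the helper in line with the rule of trying two pillars before concluding the data does not exist.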
CLI Commands Reference
| Command | Purpose | When to Use |
|---|---|---|
| | Output the full command tree as JSON | Discover all available commands and their flags |
| `cx metrics search --name '<pattern>'` | Find metrics by name | First step for metrics discovery |
| `cx metrics search --description "<text>"` | Semantic metric search | When you know what you want but not the name |
| `cx search-fields "<description>" --dataset logs` | Find log fields by description | Discovery for log-based questions |
| `cx search-fields "<description>" --dataset spans` | Find span fields by description | Discovery for trace-based questions |
| `cx spans "filter $l.serviceName == '<service>'"` | Search spans by service | When investigating a specific service |
| | List DataPrime commands/functions | When building log queries |
Examples
Example 1: Business Question (Ambiguous Source)
Question: "How much money did people spend on the platform last week?"
Approach:
- Search metrics: `cx metrics search --name '*revenue*'` and `cx metrics search --name '*transaction*'`
- Search log fields: `cx search-fields "transaction amount" --dataset logs`
- Search span fields: `cx search-fields "payment total" --dataset spans`
- If a metric like `payment_total_usd` exists, use the `cx-metrics-query` skill with a range query
- If only logs have the data, use the `cx-query-logs` skill with DataPrime aggregation
- If traces have a `purchase.amount` attribute, use the `cx-query-spans` skill
Example 2: Latency Question (Clear First Choice)
Question: "What's the average latency of the checkout route?"
Approach:
- First try metrics: `cx metrics search --name '*checkout*latency*'` or `cx metrics search --name '*http*duration*'`
- If a histogram metric exists, use the `cx-metrics-query` skill with `histogram_quantile`
- If no metric, fall back to traces: `cx spans "filter $l.serviceName == 'checkout-service'" --limit 10` and aggregate span durations
Example 3: Frontend Performance (RUM)
Question: "Why is the dashboard page loading slowly for users?"
Approach:
- This is clearly a RUM question - frontend page load data
- Use the `cx-rum` skill directly
- If RUM shows backend calls are slow, pivot to `cx-query-spans` for the API calls
Example 4: Error Investigation (Logs + Traces)
Question: "Why are users getting 500 errors on the payment endpoint?"
Approach:
- Check error rate metrics: `cx metrics search --name '*error*'` → `cx-metrics-query` skill
- Search for error logs: `cx search-fields "error message" --dataset logs` → `cx-query-logs` skill
- Get traces for failed requests: `cx spans "filter $l.serviceName == 'payment-service'" --limit 10` → `cx-query-spans` skill
- Cross-reference: find trace IDs in logs, then fetch full traces for root cause
Beyond Investigation
Not every question is answered by querying data. If the user's intent is operational rather than investigative, route to the appropriate workflow skill:
| User Intent | Route To |
|---|---|
| Reducing costs, checking usage, TCO policies | `cx-cost-optimization` |
| Incident triage, SLO breaches, who got paged | `cx-incident-management` |
| Setting up monitoring, webhooks, notifications | `cx-observability-setup` |
| Configuring parsing rules, enrichments, E2M | `cx-data-pipeline` |
| Access audit, API keys, user management | `cx-platform-admin` |
| Creating or managing dashboards | `cx-create-dashboard` |
Key Principles
- Discover before querying: always run search/discovery to find the right source
- Parallel discovery: for ambiguous questions, search metrics, logs, and spans concurrently
- Validate with code: when unsure what a metric or field represents, check the codebase
- Pivot on failure: if one pillar is empty, try another before giving up
- Delegate to specialists: once you know the pillar, hand off to the dedicated skill
Related Skills
Investigation Skills
- `cx-dataprime` - DataPrime query language reference (syntax, operators, aggregations, functions)
- `cx-metrics-query` - PromQL queries, metric discovery, instant and range queries
- `cx-query-logs` - DataPrime log queries, log field exploration
- `cx-query-spans` - Trace search, span analysis, distributed tracing
- `cx-rum` - Frontend performance, user sessions, page loads
- `cx-alerts` - Creating and managing alert definitions
- `cx-create-dashboard` - Dashboard creation and management
Workflow Skills
- `cx-cost-optimization` - Analyze and reduce Coralogix data costs
- `cx-incident-management` - Incident triage, SLO monitoring, notification verification
- `cx-data-pipeline` - Parsing rules, enrichments, E2M, recording rules
- `cx-platform-admin` - Access audit, API keys, user and role management
- `cx-observability-setup` - Views, webhooks, notifications, integrations setup