debug-pipeline

Debug Pipeline

Diagnose pipeline executions and suggest fixes via MCP.

Instructions

Step 1: Diagnose Execution (Preferred)

Use the dedicated diagnosis tool. It accepts an execution_id, pipeline_id (auto-fetches latest execution), or a Harness URL:
Call MCP tool: harness_diagnose
Parameters:
  pipeline_id: "<pipeline_identifier>"   # or execution_id or url
  org_id: "<organization>"
  project_id: "<project>"
This returns a structured report with stage/step breakdown, timing, bottlenecks, and failure details in one call. It also automatically follows chained (child) pipeline failures.
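As a rough illustration of the parameter block above, the sketch below assembles the arguments for a harness_diagnose call. The helper name build_diagnose_args is hypothetical, and two things are assumptions rather than documented behavior: that exactly one of execution_id, pipeline_id, or url should be supplied, and that org_id and project_id are still passed alongside a URL.

```python
from typing import Optional


def build_diagnose_args(
    org_id: str,
    project_id: str,
    execution_id: Optional[str] = None,
    pipeline_id: Optional[str] = None,
    url: Optional[str] = None,
) -> dict:
    """Build the parameter mapping for a harness_diagnose call.

    Hypothetical helper: enforcing exactly one target is an assumption
    made for this sketch, not a documented rule of the tool.
    """
    targets = {"execution_id": execution_id, "pipeline_id": pipeline_id, "url": url}
    # Keep only the target that was actually provided.
    chosen = {key: value for key, value in targets.items() if value}
    if len(chosen) != 1:
        raise ValueError("pass exactly one of execution_id, pipeline_id, or url")
    return {**chosen, "org_id": org_id, "project_id": project_id}
```

The returned mapping mirrors the Parameters block and would be handed to whatever MCP client is in use.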

Step 1b: Full Diagnostic Mode

For deeper analysis, request logs and pipeline YAML:
Call MCP tool: harness_diagnose
Parameters:
  execution_id: "<execution_id>"
  org_id: "<organization>"
  project_id: "<project>"
  summary: false               # raw diagnostic payload
  include_yaml: true           # include pipeline definition
  include_logs: true           # include failed step logs
  log_snippet_lines: 120       # tail N lines per step (0 = unlimited)
  max_failed_steps: 5          # cap number of steps to fetch logs for
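To make the two limits above concrete, here is a minimal sketch of what "tail N lines per step" and "cap number of steps" could mean. Both functions are hypothetical illustrations, not the tool's implementation; in particular, keeping the first N failed steps (rather than some other selection) is an assumption.

```python
def tail_lines(log_text: str, log_snippet_lines: int = 120) -> str:
    """Return the last N lines of a step log; 0 means no limit."""
    lines = log_text.splitlines()
    if log_snippet_lines > 0:
        lines = lines[-log_snippet_lines:]
    return "\n".join(lines)


def cap_failed_steps(failed_steps: list, max_failed_steps: int = 5) -> list:
    """Limit how many failed steps get logs fetched.

    Assumption: the first N steps in the given order are kept.
    """
    if max_failed_steps > 0:
        return failed_steps[:max_failed_steps]
    return list(failed_steps)
```

Tailing is the natural choice for failure logs because the error and stack trace usually sit at the end of a step's output.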

Diagnose Parameters

| Parameter | Default | Description |
|---|---|---|
| execution_id | -- | Specific execution to analyze |
| pipeline_id | -- | Fetch latest execution for this pipeline |
| url | -- | Harness UI URL (auto-extracts IDs) |
| summary | true | Structured report (true) or raw payload (false) |
| include_yaml | false (summary) / true (raw) | Include pipeline YAML definition |
| include_logs | false (summary) / true (raw) | Include failed step logs |
| log_snippet_lines | 120 | Max log lines per step (tail). 0 = unlimited |
| max_failed_steps | 5 | Max steps to fetch logs for. 0 = unlimited |
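The mode-dependent defaults for include_yaml and include_logs are easy to misread, so the following sketch spells out one reading of the table: when those flags are not set explicitly, they follow the mode (off in summary mode, on in raw mode). The function name resolve_diagnose_options is hypothetical, and the assumption that an explicit value always overrides the mode default is mine.

```python
from typing import Optional


def resolve_diagnose_options(
    summary: bool = True,
    include_yaml: Optional[bool] = None,
    include_logs: Optional[bool] = None,
    log_snippet_lines: int = 120,
    max_failed_steps: int = 5,
) -> dict:
    """Fill in defaults as described in the parameter table.

    Assumption: an explicitly passed flag wins over the mode default.
    """
    raw_mode = not summary
    return {
        "summary": summary,
        # Unset flags default to False in summary mode and True in raw mode.
        "include_yaml": raw_mode if include_yaml is None else include_yaml,
        "include_logs": raw_mode if include_logs is None else include_logs,
        "log_snippet_lines": log_snippet_lines,
        "max_failed_steps": max_failed_steps,
    }
```

Under this reading, a summary report with logs only needs include_logs: true added, while summary: false pulls in both YAML and logs without further flags.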

Step 2: Project Health Overview

Check overall project health for context:
Call MCP tool: harness_status
Parameters:
  org_id: "<organization>"
  project_id: "<project>"
Shows recent failed executions, running executions, and deployment activity.

Step 3: Find Failed Executions (if needed)

Call MCP tool: harness_list
Parameters:
  resource_type: "execution"
  org_id: "<organization>"
  project_id: "<project>"
  search_term: "<pipeline name>"

Step 4: Get Execution Details

Call MCP tool: harness_get
Parameters:
  resource_type: "execution"
  resource_id: "<execution_id>"
  org_id: "<organization>"
  project_id: "<project>"

Step 5: Get Execution Logs

Call MCP tool: harness_get
Parameters:
  resource_type: "execution_log"
  resource_id: "<execution_id>"
  org_id: "<organization>"
  project_id: "<project>"

Step 6: Get Pipeline Definition

Call MCP tool: harness_get
Parameters:
  resource_type: "pipeline"
  resource_id: "<pipeline_identifier>"
  org_id: "<organization>"
  project_id: "<project>"
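Steps 4 to 6 all use harness_get with a different resource_type, so the manual path can be summarized as a short list of calls. The sketch below only builds that list as plain data; the function name manual_debug_calls and the {"tool", "parameters"} dictionary shape are hypothetical and would need adapting to the MCP client actually in use.

```python
from typing import Any, Dict, List


def manual_debug_calls(
    org_id: str,
    project_id: str,
    execution_id: str,
    pipeline_identifier: str,
) -> List[Dict[str, Any]]:
    """List the harness_get calls from Steps 4-6 in order.

    Hypothetical helper: it describes the calls, it does not send them.
    """
    scope = {"org_id": org_id, "project_id": project_id}
    targets = [
        ("execution", execution_id),         # Step 4: execution details
        ("execution_log", execution_id),     # Step 5: execution logs
        ("pipeline", pipeline_identifier),   # Step 6: pipeline definition
    ]
    return [
        {
            "tool": "harness_get",
            "parameters": {"resource_type": rtype, "resource_id": rid, **scope},
        }
        for rtype, rid in targets
    ]
```

This manual sequence is the fallback; the single harness_diagnose call in Step 1 remains the preferred route.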

Analysis Framework

Categorize errors and provide targeted fixes:

Build Failures

  • Missing dependencies - Check package.json/requirements.txt
  • Compilation errors - Review recent code changes
  • Docker build failures - Check Dockerfile and base image

Infrastructure Errors

  • "No delegate available" - Check delegate status, verify tags match
  • Connector failures - Rotate credentials, test connection
  • Resource limits - Check cloud quotas and limits

Configuration Errors

  • "Secret not found" - Verify secret exists at correct scope (account/org/project)
  • "Could not resolve expression" - Check expression syntax
  • "Connector not found" - Verify connectorRef identifier

Deployment Errors

  • ImagePullBackOff - Check registry credentials and image tag
  • CrashLoopBackOff - Check container logs, resource limits
  • Readiness probe failed - Review probe configuration

Timeout Errors

  • Step/stage exceeded timeout - Increase timeout or optimize
  • Delegate task queued too long - Scale up delegates

Artifact Errors

  • "Artifact not found" - Verify artifact path, check upstream build
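The categories above can be approximated with a simple substring lookup over the quoted error strings. This is a minimal sketch, not a complete classifier: it covers only the messages quoted in this section, the name classify_error is hypothetical, and real error text may be worded differently.

```python
# Marker strings are the quoted messages from the framework, lowercased.
ERROR_CATEGORIES = [
    ("Infrastructure", ["no delegate available"]),
    ("Configuration", [
        "secret not found",
        "could not resolve expression",
        "connector not found",
    ]),
    ("Deployment", ["imagepullbackoff", "crashloopbackoff", "readiness probe failed"]),
    ("Artifact", ["artifact not found"]),
]


def classify_error(message: str) -> str:
    """Return the first category whose marker appears in the message."""
    lowered = message.lower()
    for category, markers in ERROR_CATEGORIES:
        if any(marker in lowered for marker in markers):
            return category
    # Build and timeout failures have no fixed message here, so they fall through.
    return "Uncategorized"
```

Anything returned as "Uncategorized" still needs a manual read of the logs, since build and timeout failures rarely have one fixed message.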

Response Format

Pipeline Failure Analysis

Pipeline: <name>
Execution: <id>
Failed At: <timestamp>

Failure Summary

Stage: <failed_stage>
Step: <failed_step>
Error: <error message>

Root Cause

<explanation>

Fix

Immediate: <specific steps>
Prevention: <how to avoid in future>
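For consistency across reports, the template above can be filled in by a small formatter. The sketch below is one possible rendering; the function name render_failure_report is hypothetical, and placing each field on its own line is an assumption about the intended layout.

```python
def render_failure_report(
    name: str,
    execution_id: str,
    failed_at: str,
    stage: str,
    step: str,
    error: str,
    root_cause: str,
    immediate: str,
    prevention: str,
) -> str:
    """Render the failure-analysis template as plain text."""
    lines = [
        "Pipeline Failure Analysis",
        "",
        f"Pipeline: {name}",
        f"Execution: {execution_id}",
        f"Failed At: {failed_at}",
        "",
        "Failure Summary",
        "",
        f"Stage: {stage}",
        f"Step: {step}",
        f"Error: {error}",
        "",
        "Root Cause",
        "",
        root_cause,
        "",
        "Fix",
        "",
        f"Immediate: {immediate}",
        f"Prevention: {prevention}",
    ]
    return "\n".join(lines)
```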

Examples

  • "Why did my build pipeline fail?" - Use harness_diagnose with pipeline_id
  • "Debug execution abc123" - Use harness_diagnose with execution_id
  • "Show me recent failures" - Use harness_status, then drill into failures
  • "Analyze the pipeline at https://app.harness.io/..." - Pass the URL directly to harness_diagnose
  • "Which stage is the bottleneck in my pipeline?" - Use harness_diagnose on a successful execution
  • "Get full logs for the failed deploy step" - Use harness_diagnose with include_logs: true

Performance Notes

  • Take your time analyzing logs thoroughly. Read complete error messages and stack traces before diagnosing.
  • Check all failed steps, not just the first one. Multiple failures may share a root cause or reveal a dependency chain.
  • Quality of diagnosis is more important than speed. A wrong diagnosis costs more time than a thorough one takes.
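One way to act on "check all failed steps" is to group them by error message, which makes a shared root cause stand out. The sketch below assumes each failed step is a dictionary with "step" and "error" keys; that shape and the name group_steps_by_error are hypothetical and do not reflect the diagnosis tool's actual output.

```python
from collections import defaultdict
from typing import Dict, List


def group_steps_by_error(failed_steps: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Group failed step names by their normalized error message."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in failed_steps:
        # Normalize case and surrounding whitespace so identical errors match.
        key = entry["error"].strip().lower()
        groups[key].append(entry["step"])
    return dict(groups)
```

A group with several steps suggests one underlying problem, whereas many single-step groups point to independent failures or a dependency chain worth tracing in order.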

Troubleshooting

Logs Not Available

  • Logs expire based on retention settings
  • Very recent executions may have delayed logs
  • Aborted executions may not have complete logs

Cannot Find Execution

  • Verify org/project scope
  • Remove filters to see all executions
  • Check RBAC permissions

MCP Connection Issues

  • Verify MCP server is running and connected
  • Check API key validity
  • Ensure required toolsets (pipelines, logs) are enabled