optimize-pipeline

Optimize Pipeline

Analyze and optimize Harness CI/CD pipeline performance through parallel testing, caching, bottleneck analysis, and monorepo strategies.

Instructions

Step 1: Establish Scope

Confirm the service, pipeline, and current performance baseline.
Call MCP tool: harness_list
Parameters:
  resource_type: "pipeline"
  org_id: "<organization>"
  project_id: "<project>"
Get recent execution timing data:
Call MCP tool: harness_list
Parameters:
  resource_type: "execution"
  org_id: "<organization>"
  project_id: "<project>"
  pipeline_id: "<pipeline_identifier>"
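Once recent executions are listed, the baseline can be summarized as a median and a 95th-percentile duration. The sketch below assumes the durations have already been extracted from the execution records as seconds; the response schema of harness_list is not shown in this document, so the extraction step is left out and the function name is hypothetical.

```python
from statistics import median


def baseline(durations_s):
    """Summarize pipeline execution durations (in seconds) as a baseline."""
    if not durations_s:
        raise ValueError("need at least one execution")
    ordered = sorted(durations_s)
    # Nearest-rank 95th percentile, computed with integer ceiling division.
    rank = (95 * len(ordered) + 99) // 100
    return {
        "runs": len(ordered),
        "median_s": median(ordered),
        "p95_s": ordered[rank - 1],
    }
```

Comparing the median with the 95th percentile helps separate a pipeline that is consistently slow from one that is slow only occasionally, for example because of runner queuing.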

Step 2: Identify the Optimization Task

Determine which optimization the user needs:
  1. Parallel Testing with Test Intelligence -- Split tests across runners and skip unchanged tests
  2. Caching Strategy -- Multi-layer dependency, build output, and test result caching
  3. Pipeline Bottleneck Analysis -- Stage-level timing breakdown with recommendations
  4. Cache Hit Rate Improvement -- Diagnose and fix low cache hit rates
  5. Monorepo CI Pipeline -- Selective builds triggered by changed paths

Step 3: Configure Parallel Testing with Test Intelligence

Gather from the user:
  • Test framework (JUnit, pytest, Jest, Go test, etc.)
  • Total test count and current runtime
  • Target runtime
Design the parallel test strategy:
  • Split tests across N parallel runners using Harness Test Intelligence
  • Use TI to identify and skip unchanged tests based on code changes
  • Configure test splitting method: by class, by file, or by timing data
  • Set up test result aggregation across runners
  • Track TI savings over time (tests skipped vs. total)
Configuration:
  • Enable Test Intelligence in the pipeline stage
  • Set parallelism level based on test count and runner capacity
  • Configure test report collection from all parallel runners
  • Set up failure thresholds (e.g., fail the stage if any runner fails)
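To illustrate what splitting by timing data aims for, here is a minimal sketch of a greedy longest-first assignment: each test goes to whichever runner currently has the least total time. This is an illustration of the idea only, not how Harness implements test splitting; the function name and the input shape (test name mapped to seconds) are assumptions.

```python
import heapq


def split_by_timing(test_times, runners):
    """Assign tests to runners, longest first, onto the least-loaded runner."""
    if runners < 1:
        raise ValueError("runners must be >= 1")
    # Min-heap of (accumulated seconds, runner index).
    heap = [(0.0, i) for i in range(runners)]
    buckets = [[] for _ in range(runners)]
    # Sort by descending duration; ties are broken by name for determinism.
    for name, secs in sorted(test_times.items(), key=lambda kv: (-kv[1], kv[0])):
        load, i = heapq.heappop(heap)
        buckets[i].append(name)
        heapq.heappush(heap, (load + secs, i))
    return buckets
```

Splitting by class or by file count is simpler but can leave one runner with all the slow tests; timing-based splitting tends to even out the finish times when historical durations are available.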

Step 4: Design Caching Strategy

Gather from the user:
  • Build tool (Maven, Gradle, npm, yarn, pip, Go modules)
  • Current build time breakdown (dependency download, compile, test)
  • Cache key source (lockfile hash, manifest hash)
Design multi-layer caching:
Layer 1 -- Dependencies:
  • Cache key: hash of the lockfile or dependency manifest (package-lock.json, go.sum, or pom.xml for Maven, which has no lockfile)
  • Cache path: dependency directory (~/.m2, node_modules, ~/.cache/pip)
  • Fallback: if no exact key match is found, restore the most recent earlier cache instead of starting empty
Layer 2 -- Build outputs:
  • Cache key: hash of source files
  • Cache path: build output directory (target/, dist/, build/)
  • Invalidation: any source file change
Layer 3 -- Test results:
  • Cache key: hash of source + test files
  • Cache path: test result and coverage directories
  • Use together with TI to skip tests whose source and test files are unchanged
Set cache TTL (recommended 7-14 days) with fallback strategy.
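The Layer 1 key can be derived as sketched below: hash the lockfile contents so the key changes only when dependencies change. The key format (prefix, dash, SHA-256 digest) and the function name are assumptions for illustration; the cache step in a real pipeline defines its own key syntax.

```python
import hashlib
from pathlib import Path


def cache_keys(prefix, lockfile):
    """Return (exact_key, fallback_prefix) for a dependency cache.

    exact_key changes whenever the lockfile content changes.
    fallback_prefix can be used to look up an older cache for the same tool.
    """
    digest = hashlib.sha256(Path(lockfile).read_bytes()).hexdigest()
    return f"{prefix}-{digest}", f"{prefix}-"
```

Because the key depends only on file content, two builds with identical lockfiles share a cache entry regardless of when or where they run.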

Step 5: Analyze Pipeline Bottlenecks

Pull execution data and break down by stage:
Call MCP tool: harness_get
Parameters:
  resource_type: "execution"
  resource_id: "<execution_id>"
  org_id: "<organization>"
  project_id: "<project>"
For each stage, identify:
  • Duration as a share of total pipeline time (to find the longest stages)
  • Queue time vs. execution time (high queue time points to runner availability issues)
  • Sequential stages that could run in parallel
  • Steps that download large artifacts repeatedly
Produce a prioritized optimization list ranked by time savings.
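The stage breakdown can be sketched as follows. The input shape (a list of dicts with name, queue_s, and run_s) is hypothetical; the fields actually returned by harness_get are not documented here and would need to be mapped onto it.

```python
def rank_stages(stages):
    """Rank stages by share of total wall time and flag queue-bound stages."""
    total = sum(s["queue_s"] + s["run_s"] for s in stages)
    report = []
    for s in stages:
        wall = s["queue_s"] + s["run_s"]
        report.append({
            "name": s["name"],
            "share": round(wall / total, 3) if total else 0.0,
            # Waiting longer than running suggests a runner capacity problem.
            "queue_bound": s["queue_s"] > s["run_s"],
        })
    return sorted(report, key=lambda r: r["share"], reverse=True)
```

Stages at the top of the list are the first candidates for caching or parallelization, while queue-bound stages usually call for more runner capacity rather than changes to the steps themselves.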

Step 6: Design Monorepo CI Pipeline

Gather from the user:
  • Number of services and their directories
  • Shared libraries and their dependents
  • Build tool
Design selective builds:
  • Use path-based triggers: only build services with changed files
  • Build shared libraries when they change, then rebuild all dependents
  • Run integration tests only when cross-service dependencies change
  • Use a dependency graph to determine the minimal build set
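A minimal sketch of computing the build set from a dependency graph: find the components whose directories contain changed files, then add everything that depends on them, directly or transitively. The component names, directory layout, and function name are hypothetical.

```python
from collections import deque


def build_set(changed_files, dirs, deps):
    """Return the components that must be rebuilt.

    dirs: component name -> directory prefix
    deps: component name -> list of components it depends on
    """
    # Components whose directory contains at least one changed file.
    dirty = {
        comp for comp, d in dirs.items()
        if any(f.startswith(d.rstrip("/") + "/") for f in changed_files)
    }
    # Invert the graph so we can walk from a component to its dependents.
    dependents = {}
    for comp, needs in deps.items():
        for n in needs:
            dependents.setdefault(n, set()).add(comp)
    # Breadth-first walk to pick up transitive dependents.
    queue = deque(dirty)
    while queue:
        for dep in dependents.get(queue.popleft(), ()):
            if dep not in dirty:
                dirty.add(dep)
                queue.append(dep)
    return dirty
```

With this shape, a change to a shared library rebuilds every service that depends on it, while a change confined to one service rebuilds only that service.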

Examples

  • "My pipeline takes 45 minutes, help me speed it up" -- Analyze bottlenecks and recommend parallel testing, caching, and stage reordering
  • "Set up parallel tests with Test Intelligence" -- Configure TI with test splitting across N runners
  • "Improve our cache hit rate" -- Diagnose cache key configuration and fix common misses
  • "Design a CI pipeline for our monorepo with 5 services" -- Configure path-based triggers with selective builds
  • "Our builds download dependencies every time" -- Design multi-layer caching strategy with fallback

Performance Notes

  • Test Intelligence needs 2-3 full test runs to build its initial model -- first runs will execute all tests.
  • Cache keys should be based on lockfiles, not timestamps -- timestamps cause unnecessary cache misses.
  • Parallelism beyond runner capacity causes queuing -- profile available runner capacity before increasing parallelism.
  • Monorepo path triggers should include shared library directories to avoid missing transitive changes.

Troubleshooting

Test Intelligence Not Skipping Tests

  • Verify TI is enabled and has completed baseline runs
  • Check that the test framework is supported by Harness TI
  • Ensure test report format is correctly configured for the framework

Cache Misses on Every Build

  • Check cache key configuration -- keys should use file hashes, not timestamps
  • Verify cache path matches the actual dependency directory
  • Check that the cache storage backend is accessible from all runners

Monorepo Building Everything

  • Verify path-based triggers are configured correctly
  • Check that the dependency graph includes shared library paths
  • Ensure glob patterns in triggers match the actual directory structure
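One way to sanity-check glob patterns offline is to run a list of changed files against them and see which files match nothing. The sketch below uses Python's fnmatch rules as a stand-in; the matching rules of the actual trigger configuration may differ (for example in how `*` treats path separators), so treat the result as a hint, not a verdict.

```python
from fnmatch import fnmatchcase


def unmatched_paths(changed_files, patterns):
    """Return the changed files that no trigger pattern matches."""
    return [
        f for f in changed_files
        if not any(fnmatchcase(f, p) for p in patterns)
    ]
```

A file that appears in the result would not trigger any build under these patterns, which often points to a typo or an outdated directory name in the trigger.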