optimize-pipeline

Optimize Pipeline

Analyze and optimize Harness CI/CD pipeline performance through parallel testing, caching, bottleneck analysis, and monorepo strategies.

Instructions

Step 1: Establish Scope

Confirm the service, pipeline, and current performance baseline.

Call MCP tool: harness_list
Parameters:
  resource_type: "pipeline"
  org_id: "<organization>"
  project_id: "<project>"

Get recent execution timing data:

Call MCP tool: harness_list
Parameters:
  resource_type: "execution"
  org_id: "<organization>"
  project_id: "<project>"
  pipeline_id: "<pipeline_identifier>"
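
Once recent execution durations are in hand, the baseline can be summarized in a few numbers. The following is a minimal sketch, assuming the durations have already been extracted as seconds from the execution list; the `baseline` function and its output keys are hypothetical names, and the shape of the `harness_list` response is not shown in this document.

```python
from statistics import median, quantiles


def baseline(durations_sec):
    """Summarize recent pipeline run durations (in seconds) into a baseline."""
    if len(durations_sec) < 2:
        raise ValueError("need at least two executions for a baseline")
    ordered = sorted(durations_sec)
    # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile.
    p95 = quantiles(ordered, n=20, method="inclusive")[18]
    return {
        "runs": len(ordered),
        "median_sec": median(ordered),
        "p95_sec": p95,
        "max_sec": ordered[-1],
    }


if __name__ == "__main__":
    # Illustrative durations only, not real measurements.
    print(baseline([600, 620, 640, 660, 1200]))
```

Reporting the median alongside a high percentile keeps a single slow outlier from distorting the baseline while still making it visible.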

Step 2: Identify the Optimization Task

Determine which optimization the user needs:

Parallel Testing with Test Intelligence -- Split tests across runners and skip unchanged tests
Caching Strategy -- Multi-layer dependency, build output, and test result caching
Pipeline Bottleneck Analysis -- Stage-level timing breakdown with recommendations
Cache Hit Rate Improvement -- Diagnose and fix low cache hit rates
Monorepo CI Pipeline -- Selective builds triggered by changed paths

Step 3: Configure Parallel Testing with Test Intelligence

Gather from the user:

Test framework (JUnit, pytest, Jest, Go test, etc.)
Total test count and current runtime
Target runtime

Design the parallel test strategy:

Split tests across N parallel runners using Harness Test Intelligence
Use TI to identify and skip unchanged tests based on code changes
Configure test splitting method: by class, by file, or by timing data
Set up test result aggregation across runners
Track TI savings over time (tests skipped vs. total)

Configuration:

Enable Test Intelligence in the pipeline stage
Set parallelism level based on test count and runner capacity
Configure test report collection from all parallel runners
Set up failure thresholds (e.g., fail the stage if any runner fails)
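
To make the "by timing data" splitting method concrete, here is a small sketch of one common approach: sort tests longest first and always hand the next test to the least-loaded runner. This is a generic illustration and not a description of how Harness Test Intelligence splits tests internally; `split_by_timing` and its arguments are hypothetical names.

```python
import heapq


def split_by_timing(test_durations, runners):
    """Assign tests to runners, longest first, always to the least-loaded runner.

    test_durations maps a test name to its last known duration in seconds.
    Returns (buckets, loads): the tests per runner and each runner's total time.
    """
    if runners < 1:
        raise ValueError("runners must be >= 1")
    # Heap entries are (total_seconds, runner_index).
    heap = [(0.0, i) for i in range(runners)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(runners)]
    # Sort by duration descending, then by name so the result is deterministic.
    for name, secs in sorted(test_durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (load + secs, idx))
    loads = [0.0] * runners
    for load, idx in heap:
        loads[idx] = load
    return buckets, loads


if __name__ == "__main__":
    timings = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10}
    buckets, loads = split_by_timing(timings, runners=2)
    print(buckets, loads)
```

The largest value in `loads` approximates the wall-clock time of the parallel test step, which can be compared against the target runtime gathered from the user.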

Step 4: Design Caching Strategy

Gather from the user:

Build tool (Maven, Gradle, npm, yarn, pip, Go modules)
Current build time breakdown (dependency download, compile, test)
Cache key source (lockfile hash, manifest hash)

Design multi-layer caching:

Layer 1 -- Dependencies:

Cache key: hash of the lockfile, or of the manifest where the build tool has no lockfile (package-lock.json, go.sum, pom.xml)
Cache path: dependency directory (~/.m2, node_modules, ~/.cache/pip)
Fallback: restore the most recent previous cache if no exact key match is found

Layer 2 -- Build outputs:

Cache key: hash of source files
Cache path: build output directory (target/, dist/, build/)
Invalidation: any source file change

Layer 3 -- Test results:

Cache key: hash of source + test files
Cache path: test result and coverage directories
Use to skip unchanged tests in combination with TI

Set a cache TTL (7-14 days recommended) and define the fallback behavior for missing or expired entries.
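
The Layer 1 key and fallback rule can be sketched as follows. This is an illustration under stated assumptions rather than Harness configuration: `cache_keys` and `pick_cache` are hypothetical helpers, and `stored` stands in for whatever listing the cache backend provides (key mapped to save time).

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def cache_keys(prefix, lockfile):
    """Return (exact_key, fallback_prefix) for a dependency cache."""
    digest = file_digest(lockfile)[:16]
    return f"{prefix}-{digest}", f"{prefix}-"


def pick_cache(exact_key, fallback_prefix, stored):
    """Choose which stored cache to restore.

    stored maps cache key -> save time (any comparable value).
    Prefer the exact key; otherwise take the newest key sharing the prefix.
    """
    if exact_key in stored:
        return exact_key
    candidates = [k for k in stored if k.startswith(fallback_prefix)]
    if not candidates:
        return None
    return max(candidates, key=lambda k: stored[k])
```

Because the key depends only on file content, two builds with an identical lockfile resolve to the same key, which is the property that timestamp-based keys lack.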

Step 5: Analyze Pipeline Bottlenecks

Pull execution data and break down by stage:

Call MCP tool: harness_get
Parameters:
  resource_type: "execution"
  resource_id: "<execution_id>"
  org_id: "<organization>"
  project_id: "<project>"

For each stage, identify:

Duration vs. pipeline total (find the longest stages)
Queue time vs. execution time (runner availability issues)
Sequential stages that could run in parallel
Steps that download large artifacts repeatedly

Produce a prioritized optimization list ranked by time savings.
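
A rough sketch of the per-stage breakdown is shown below. It assumes stage timings have already been pulled out of the execution data into plain dictionaries; the field names `queue_sec` and `run_sec` are hypothetical, since the `harness_get` response format is not shown here, and the share calculation assumes the stages ran sequentially.

```python
def rank_stages(stages):
    """Rank stages by elapsed time and flag those dominated by queueing.

    stages is a list of dicts with keys: name, queue_sec, run_sec.
    """
    total = sum(s["queue_sec"] + s["run_sec"] for s in stages)
    report = []
    for s in stages:
        elapsed = s["queue_sec"] + s["run_sec"]
        report.append({
            "name": s["name"],
            "elapsed_sec": elapsed,
            # Fraction of the summed stage time spent in this stage.
            "share": elapsed / total if total else 0.0,
            # More time waiting than running suggests a runner availability issue.
            "queue_bound": s["queue_sec"] > s["run_sec"],
        })
    return sorted(report, key=lambda r: r["elapsed_sec"], reverse=True)


if __name__ == "__main__":
    # Illustrative numbers only.
    for row in rank_stages([
        {"name": "build", "queue_sec": 10, "run_sec": 300},
        {"name": "test", "queue_sec": 400, "run_sec": 200},
        {"name": "deploy", "queue_sec": 5, "run_sec": 95},
    ]):
        print(row)
```

Stages flagged as `queue_bound` are better addressed by adding runner capacity than by optimizing the steps themselves.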

Step 6: Design Monorepo CI Pipeline

Gather from the user:

Number of services and their directories
Shared libraries and their dependents
Build tool

Design selective builds:

Use path-based triggers: only build services with changed files
Build shared libraries when they change, then rebuild all dependents
Run integration tests only when cross-service dependencies change
Use a dependency graph to determine the minimal build set
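
The minimal build set can be computed by mapping changed files to components and then walking the dependency graph toward dependents. The sketch below assumes a hand-maintained mapping of component names to directories and of each component to its direct dependents; `affected_components` and both mappings are hypothetical, not part of Harness.

```python
from collections import deque


def affected_components(changed_files, component_dirs, dependents):
    """Return the set of components that must be rebuilt.

    component_dirs maps component name -> directory (relative to repo root).
    dependents maps component name -> components that depend on it directly.
    """
    # Step 1: components whose own directory contains a changed file.
    changed = {
        name
        for name, directory in component_dirs.items()
        for path in changed_files
        if path.startswith(directory.rstrip("/") + "/")
    }
    # Step 2: breadth-first walk to pick up direct and transitive dependents.
    to_build = set(changed)
    queue = deque(changed)
    while queue:
        comp = queue.popleft()
        for dep in dependents.get(comp, ()):
            if dep not in to_build:
                to_build.add(dep)
                queue.append(dep)
    return to_build


if __name__ == "__main__":
    dirs = {
        "api": "services/api",
        "web": "services/web",
        "billing": "services/billing",
        "common": "libs/common",
    }
    deps = {"common": ["api", "billing"], "api": ["web"]}
    print(sorted(affected_components(["libs/common/util.py"], dirs, deps)))
```

A change under a shared library directory pulls in every component reachable through `dependents`, while a change confined to one service rebuilds only that service and whatever depends on it.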

Examples

"My pipeline takes 45 minutes, help me speed it up" -- Analyze bottlenecks and recommend parallel testing, caching, and stage reordering
"Set up parallel tests with Test Intelligence" -- Configure TI with test splitting across N runners
"Improve our cache hit rate" -- Diagnose cache key configuration and fix common misses
"Design a CI pipeline for our monorepo with 5 services" -- Configure path-based triggers with selective builds
"Our builds download dependencies every time" -- Design multi-layer caching strategy with fallback

Performance Notes

Test Intelligence needs 2-3 full test runs to build its initial model -- until then, every run executes all tests.
Cache keys should be based on lockfiles, not timestamps -- timestamps cause unnecessary cache misses.
Parallelism beyond runner capacity causes queuing -- profile available runner capacity before increasing parallelism.
Monorepo path triggers should include shared library directories to avoid missing transitive changes.

Troubleshooting

Test Intelligence Not Skipping Tests

Verify TI is enabled and has completed baseline runs
Check that the test framework is supported by Harness TI
Ensure test report format is correctly configured for the framework

Cache Misses on Every Build

Check cache key configuration -- keys should use file hashes, not timestamps
Verify cache path matches the actual dependency directory
Check that the cache storage backend is accessible from all runners

Monorepo Building Everything

Verify path-based triggers are configured correctly
Check that the dependency graph includes shared library paths
Ensure glob patterns in triggers match the actual directory structure
