optimize-pipeline
Optimize Pipeline
Analyze and optimize Harness CI/CD pipeline performance through parallel testing, caching, bottleneck analysis, and monorepo strategies.
Instructions
Step 1: Establish Scope
Confirm the service, pipeline, and current performance baseline.
Call MCP tool: harness_list
Parameters:
resource_type: "pipeline"
org_id: "<organization>"
project_id: "<project>"
Get recent execution timing data:
Call MCP tool: harness_list
Parameters:
resource_type: "execution"
org_id: "<organization>"
project_id: "<project>"
pipeline_id: "<pipeline_identifier>"
Step 2: Identify the Optimization Task
Determine which optimization the user needs:
- Parallel Testing with Test Intelligence -- Split tests across runners and skip unchanged tests
- Caching Strategy -- Multi-layer dependency, build output, and test result caching
- Pipeline Bottleneck Analysis -- Stage-level timing breakdown with recommendations
- Cache Hit Rate Improvement -- Diagnose and fix low cache hit rates
- Monorepo CI Pipeline -- Selective builds triggered by changed paths
Step 3: Configure Parallel Testing with Test Intelligence
Gather from the user:
- Test framework (JUnit, pytest, Jest, Go test, etc.)
- Total test count and current runtime
- Target runtime
Design the parallel test strategy:
- Split tests across N parallel runners using Harness Test Intelligence
- Use TI to identify and skip unchanged tests based on code changes
- Configure test splitting method: by class, by file, or by timing data
- Set up test result aggregation across runners
- Track TI savings over time (tests skipped vs. total)
Configuration:
- Enable Test Intelligence in the pipeline stage
- Set parallelism level based on test count and runner capacity
- Configure test report collection from all parallel runners
- Set up failure thresholds (e.g., fail the stage if any runner fails)
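Splitting by timing data can be sketched as a greedy balancing problem. The example below is a minimal illustration, not Harness's own splitting logic: the function names and the input shape (a mapping of test name to seconds from a previous run) are assumptions made for the example.

```python
import heapq


def split_by_timing(durations, runners):
    """Assign tests to runners so that total runtime per runner is roughly even.

    durations: dict of test name -> seconds from a previous run (hypothetical input).
    runners: number of parallel runners.
    """
    if runners < 1:
        raise ValueError("runners must be at least 1")
    # Min-heap of (assigned_seconds, runner_index): the least-loaded runner is on top.
    heap = [(0.0, i) for i in range(runners)]
    buckets = [[] for _ in range(runners)]
    # Place the slowest tests first (longest-processing-time heuristic).
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        buckets[idx].append(name)
        heapq.heappush(heap, (load + seconds, idx))
    return buckets


def skip_ratio(skipped, total):
    """Fraction of tests skipped in a run, for tracking savings over time."""
    return skipped / total if total else 0.0
```

Placing the slowest tests first keeps one long test from landing on an already busy runner. Splitting by class or by file trades this balance for simpler configuration.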
Step 4: Design Caching Strategy
Gather from the user:
- Build tool (Maven, Gradle, npm, yarn, pip, Go modules)
- Current build time breakdown (dependency download, compile, test)
- Cache key source (lockfile hash, manifest hash)
Design multi-layer caching:
Layer 1 -- Dependencies:
- Cache key: hash of the lockfile or dependency manifest (package-lock.json, go.sum, pom.xml)
- Cache path: dependency directory (~/.m2, node_modules, ~/.cache/pip)
- Fallback: restore the most recent cache when no exact key match is found
Layer 2 -- Build outputs:
- Cache key: hash of source files
- Cache path: build output directory (target/, dist/, build/)
- Invalidation: any source file change
Layer 3 -- Test results:
- Cache key: hash of source + test files
- Cache path: test result and coverage directories
- Use to skip unchanged tests in combination with TI
Set a cache TTL (7-14 days recommended) and define a fallback strategy for expired or missing entries.
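The Layer 1 key and fallback can be sketched as follows. This is an illustration under assumptions, not a Harness cache step: the key naming ("deps-npm-<hash>"), the function names, and the idea that stored keys are listed oldest to newest are all hypothetical.

```python
import hashlib
from pathlib import Path


def file_hash(path):
    """SHA-256 of a file's bytes, so the key changes only when the content changes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def cache_key(prefix, lockfile):
    """Build an exact-match key such as 'deps-npm-<hash>' (naming is hypothetical)."""
    return f"{prefix}-{file_hash(lockfile)}"


def pick_cache(exact_key, prefix, stored_keys):
    """Return the exact key if stored, else the newest key sharing the prefix.

    stored_keys: keys ordered oldest to newest (an assumption about the store).
    """
    if exact_key in stored_keys:
        return exact_key
    fallbacks = [k for k in stored_keys if k.startswith(prefix + "-")]
    return fallbacks[-1] if fallbacks else None
```

Because the key depends only on file content, rebuilding the same commit yields the same key, which is what keeps the hit rate high. A fallback hit restores a slightly stale dependency directory that the build tool then tops up.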
Step 5: Analyze Pipeline Bottlenecks
Pull execution data and break down by stage:
Call MCP tool: harness_get
Parameters:
resource_type: "execution"
resource_id: "<execution_id>"
org_id: "<organization>"
project_id: "<project>"
For each stage, identify:
- Duration vs. pipeline total (find the longest stages)
- Queue time vs. execution time (runner availability issues)
- Sequential stages that could run in parallel
- Steps that download large artifacts repeatedly
Produce a prioritized optimization list ranked by estimated time savings, largest first.
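A stage-level breakdown might be computed along these lines. The input shape (a list of dicts with name, queue_s, and run_s) is a hypothetical simplification, not the format of the execution data returned by the tool, and the total assumes the stages ran sequentially.

```python
def rank_stages(stages):
    """Rank stages by their share of total pipeline time.

    stages: list of dicts with 'name', 'queue_s', and 'run_s' in seconds
    (a hypothetical shape, not the execution payload itself).
    """
    # Summing per-stage time assumes sequential stages.
    total = sum(s["queue_s"] + s["run_s"] for s in stages)
    report = []
    for s in stages:
        elapsed = s["queue_s"] + s["run_s"]
        report.append({
            "name": s["name"],
            "share": elapsed / total if total else 0.0,
            # Queue time exceeding run time hints at runner availability issues.
            "queue_bound": s["queue_s"] > s["run_s"],
        })
    return sorted(report, key=lambda r: r["share"], reverse=True)
```

Stages at the top of the ranking are the first candidates for caching or parallelization, while stages flagged as queue-bound point to runner capacity rather than to the steps themselves.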
Step 6: Design Monorepo CI Pipeline
Gather from the user:
- Number of services and their directories
- Shared libraries and their dependents
- Build tool
Design selective builds:
- Use path-based triggers: only build services with changed files
- Build shared libraries when they change, then rebuild all dependents
- Run integration tests only when cross-service dependencies change
- Use a dependency graph to determine the minimal build set
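The minimal build set can be derived by walking the dependency graph in reverse from whatever changed. The sketch below assumes a hypothetical layout in which every service or library lives under a single directory prefix; the names and data structures are illustrative only.

```python
from collections import deque


def affected_targets(changed_files, target_dirs, dependents):
    """Return the targets to rebuild for a set of changed files.

    target_dirs: target name -> directory prefix (hypothetical layout).
    dependents: target name -> targets that depend on it directly.
    """
    # Targets whose directory contains at least one changed file.
    queue = deque(
        name for name, prefix in sorted(target_dirs.items())
        if any(f.startswith(prefix.rstrip("/") + "/") for f in changed_files)
    )
    affected = set(queue)
    # Walk the reverse dependency graph so transitive dependents are rebuilt too.
    while queue:
        current = queue.popleft()
        for dep in dependents.get(current, ()):
            if dep not in affected:
                affected.add(dep)
                queue.append(dep)
    return affected
```

A change under a shared library directory pulls in every service that depends on it, directly or transitively, while a change inside a single service rebuilds only that service and its dependents.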
Examples
- "My pipeline takes 45 minutes, help me speed it up" -- Analyze bottlenecks and recommend parallel testing, caching, and stage reordering
- "Set up parallel tests with Test Intelligence" -- Configure TI with test splitting across N runners
- "Improve our cache hit rate" -- Diagnose cache key configuration and fix common misses
- "Design a CI pipeline for our monorepo with 5 services" -- Configure path-based triggers with selective builds
- "Our builds download dependencies every time" -- Design multi-layer caching strategy with fallback
Performance Notes
- Test Intelligence needs 2-3 full test runs to build its initial model -- first runs will execute all tests.
- Cache keys should be based on lockfiles, not timestamps -- timestamps cause unnecessary cache misses.
- Parallelism beyond runner capacity causes queuing -- profile available runner capacity before increasing parallelism.
- Monorepo path triggers should include shared library directories to avoid missing transitive changes.
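The queuing effect of parallelism beyond runner capacity can be illustrated with a rough estimate. This is a simplified model, assuming evenly sized shards and no per-shard startup cost; the function name and parameters are hypothetical.

```python
import math


def estimated_wall_time(total_test_seconds, parallelism, runner_capacity):
    """Rough wall-clock estimate when shards beyond capacity wait for a free runner.

    Assumes evenly sized shards and no per-shard startup cost,
    which real runs will not match exactly.
    """
    shard_seconds = total_test_seconds / parallelism
    # Shards run in waves of at most runner_capacity at a time.
    waves = math.ceil(parallelism / runner_capacity)
    return shard_seconds * waves
```

With 1200 seconds of tests and 4 available runners, this model gives 300 seconds at a parallelism of 4 but 400 seconds at a parallelism of 6, because the last two shards wait for a second wave.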
Troubleshooting
Test Intelligence Not Skipping Tests
- Verify TI is enabled and has completed baseline runs
- Check that the test framework is supported by Harness TI
- Ensure test report format is correctly configured for the framework
Cache Misses on Every Build
- Check cache key configuration -- keys should use file hashes, not timestamps
- Verify cache path matches the actual dependency directory
- Check that the cache storage backend is accessible from all runners
Monorepo Building Everything
- Verify path-based triggers are configured correctly
- Check that the dependency graph includes shared library paths
- Ensure glob patterns in triggers match the actual directory structure
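One quick way to check the glob patterns is to test them against a listing of the repository's paths and report any pattern that matches nothing. The sketch below uses Python's fnmatch rules, in which `*` also matches `/`; the trigger's own glob semantics may differ, so treat this as an approximation. The function name and inputs are hypothetical.

```python
from fnmatch import fnmatchcase


def unmatched_patterns(patterns, repo_paths):
    """Return trigger patterns that match no path in the repository listing."""
    return [
        p for p in patterns
        if not any(fnmatchcase(path, p) for path in repo_paths)
    ]
```

A pattern that appears in the result usually points to a typo or a directory that has been moved or renamed.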