developer-productivity

Developer Productivity

Before Starting

Check for EM context first. If .agents/em-context.md exists, read it.
If .agents/em-context.md does not exist, ask for a minimal manager profile first and save it before giving detailed advice: role/title, team size, team mission or ownership area, and current challenge or priority.
If a specific person is central to the conversation and .agents/reports/[name].md does not exist, ask for a minimal profile for that person first and save it before giving detailed advice: title/level, tenure, strengths, and current challenge or growth area.

If the conversation reveals durable new context later, update .agents/em-context.md or .agents/reports/[name].md automatically. Save stable facts and patterns, not guesses, transient frustration, or unresolved interpretations.
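
The existence checks above can be sketched as a small helper. This is only an illustration, assuming the files live under a project root directory; the function name `missing_profiles` and the field lists are hypothetical and simply restate the profile fields named in the text.

```python
from pathlib import Path
from typing import Dict, List, Optional

# Paths come from the text above; everything else here is a hypothetical sketch.
EM_CONTEXT = Path(".agents/em-context.md")
REPORTS_DIR = Path(".agents/reports")

MANAGER_FIELDS = ["role/title", "team size", "team mission or ownership area",
                  "current challenge or priority"]
REPORT_FIELDS = ["title/level", "tenure", "strengths",
                 "current challenge or growth area"]


def missing_profiles(root: Path, person: Optional[str] = None) -> Dict[str, List[str]]:
    """Map each missing profile file to the fields to ask for before advising."""
    missing: Dict[str, List[str]] = {}
    if not (root / EM_CONTEXT).exists():
        missing[EM_CONTEXT.as_posix()] = MANAGER_FIELDS
    if person is not None:
        report = REPORTS_DIR / f"{person}.md"
        if not (root / report).exists():
            missing[report.as_posix()] = REPORT_FIELDS
    return missing
```

An empty result means both profiles already exist and can be read directly; otherwise the keys say which file to create and the values say what to ask for.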

Response Style

Keep the first answer concise and useful. Do not dump the whole framework unless the user asks for depth.
Default to:
  • State the likely diagnosis or recommendation first
  • Ask at most 2-3 targeted questions only if the missing context changes the advice
  • Give the next concrete action and, when useful, exact wording the manager can use
  • Mention the relevant framework briefly, but do not explain every part of it
  • Offer a deeper version only after the direct answer

How to Use This Skill

  • Don't know where to start with engineering metrics → DORA: The Four Key Metrics (start here)
  • Team feels slow but you can't point to data / engineers say they're blocked → DevEx: Three Dimensions
  • Leadership asking how the team is performing → Tying Engineering Metrics to Business Outcomes
  • Being asked to rank or score individual developers → The Problem With Productivity Metrics + What Not to Do
  • Wondering whether surveys and qualitative data count → Qualitative Metrics Are Not Soft

Default Response Shape

When helping with productivity, keep the focus on systems, not individual scoring:
  1. Problem framing: what the user is trying to learn or prove.
  2. Metric set: 2-5 team-level signals, mixing delivery, quality, and developer experience.
  3. Interpretation: what each metric can and cannot tell you.
  4. Action loop: how the team will use the data to remove friction.
  5. Anti-pattern warning: what not to measure or communicate.
If leadership wants a single productivity number, explain the risk and offer a small dashboard of complementary signals instead.

The Problem With Productivity Metrics

Measuring developer productivity is one of the hardest problems in engineering management. The history of attempts illustrates why: every metric that gets adopted gets gamed or misinterpreted.
  • SLOC (lines of code) — incentivizes verbose code, penalizes refactoring
  • Velocity (story points per sprint) — measures effort estimates, not output; easily inflated
  • Cycle time — better, but captures only one dimension of delivery
The underlying issue: software development is a knowledge work discipline. Unlike factory output, it can't be measured by counting things without losing what actually matters.
The wrong use of metrics: measuring individuals. Any metric applied to individual developers creates perverse incentives — people optimize for the metric at the expense of the actual work. Don't rank engineers by PR count, commit frequency, or story points.
The right use of metrics: identifying system-level friction. Good metrics answer "where is the team slowing down, and why?" — not "who is performing well?"

DORA: The Four Key Metrics

DORA is the most evidence-backed framework for measuring engineering delivery health. Its research across thousands of organizations shows that high performers consistently score well on all four metrics:
| Metric | What it measures | High performer benchmark |
| --- | --- | --- |
| Deployment frequency | How often you deploy to production | Multiple times per day |
| Lead time for changes | Commit to production | Less than 1 hour |
| Change failure rate | % of deployments causing incidents | 0–15% |
| Mean time to recovery (MTTR) | How quickly you recover from incidents | Less than 1 hour |
These metrics correlate strongly with business outcomes (revenue, customer satisfaction, reliability). They measure the delivery system, not individuals.
How to use them as an EM:
  • Baseline your current state. Don't compare to benchmarks yet — just establish your own baseline.
  • Pick the one metric where your team is furthest from high performance. Fix that first.
  • Don't optimize all four simultaneously — that's how you get gaming instead of improvement.
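
To make the baselining step concrete, here is a minimal sketch of computing the four metrics from a team's deployment log. The `Deployment` record and `dora_baseline` function are hypothetical names, not part of any DORA tooling, and the sketch assumes each incident starts at the moment of the deployment that caused it, which real incident data may not support.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # first commit in the change
    deployed_at: datetime                   # when the change reached production
    caused_incident: bool = False
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_baseline(deploys: List[Deployment], window_days: int) -> dict:
    """Compute team-level DORA figures for the deployments in a time window."""
    if not deploys:
        raise ValueError("need at least one deployment to compute a baseline")
    lead_times = [d.deployed_at - d.committed_at for d in deploys]
    failures = [d for d in deploys if d.caused_incident]
    # Assumption: recovery time is measured from the failing deployment.
    recoveries = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    mttr = sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
    return {
        "deploys_per_day": len(deploys) / window_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deploys),
        "mean_time_to_recovery": mttr,
    }
```

The function takes the whole team's deployments and returns one set of figures, so it cannot be used to score individuals. Run it over a fixed window (for example the last 30 days) to get a baseline, then rerun it on later windows to see whether the metric you chose to fix is moving.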

DevEx: Three Dimensions of Developer Experience

The DevEx framework (from DX research) focuses on the developer's lived experience rather than system outputs. It organizes friction into three categories:
Feedback loops — When a developer makes a change, how fast do they know if it worked? This includes CI/CD speed, test run time, code review turnaround, and stakeholder feedback speed. Slow feedback loops break concentration and delay learning.
Cognitive load — How much do developers have to keep in their heads to do their work? Complex processes, unclear ownership, undocumented systems, and context switching all increase cognitive load. High cognitive load slows work and increases errors.
Flow state — Can developers get into deep, uninterrupted focus? Flow state requires: blocks of uninterrupted time, fast tooling, clear goals, and low anxiety. Even good feedback loops and low cognitive load won't produce flow if the environment is fragmented.
How to use it: Run a short team exercise — ask engineers to score each dimension (1–5). The lowest-scoring dimension is your most important focus area. The answers often surface specific, actionable problems (e.g., "our CI takes 45 minutes" or "I never know who owns this service").
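
The scoring exercise can be tallied with a few lines of code. This is a sketch under stated assumptions: the dimension keys and the `lowest_dimension` function are hypothetical, and a higher score is taken to mean a better experience in every dimension (so a 5 for cognitive load means the load feels manageable).

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical keys for the three DevEx dimensions described above.
DIMENSIONS = ("feedback_loops", "cognitive_load", "flow_state")


def lowest_dimension(responses: List[Dict[str, int]]) -> Tuple[str, Dict[str, float]]:
    """Average each dimension's 1-5 scores and return the lowest-scoring one."""
    if not responses:
        raise ValueError("need at least one response")
    for response in responses:
        for dim in DIMENSIONS:
            if not 1 <= response[dim] <= 5:
                raise ValueError(f"{dim} score must be between 1 and 5")
    averages = {dim: mean(r[dim] for r in responses) for dim in DIMENSIONS}
    return min(averages, key=averages.get), averages
```

The responses carry no names, which keeps the exercise about the environment rather than about who answered what. The number only points at a dimension; the follow-up conversation is where the specific problems come out.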

Qualitative Metrics Are Not Soft

A common misconception: quantitative metrics are objective and reliable; surveys and qualitative data are fuzzy and unreliable.
This is wrong. Some of the most important productivity signals can only come from humans:
  • How often do you feel blocked waiting for someone else?
  • How confident are you that your work won't break something unexpectedly?
  • How clear is it to you what "good" looks like for your current project?
DORA itself uses surveys for several of its four key metrics — including deployment frequency for organizations that can't measure it automatically. Google's research found that self-reported data is highly reliable when questions are specific and objective.
The practical rule: use quantitative metrics to identify where there's a problem; use qualitative data to understand why. Neither alone gives the full picture.

Tying Engineering Metrics to Business Outcomes

When leadership asks "how is the engineering team doing?", the answer that lands is the one connected to what they care about.
Common business metrics that engineering directly impacts:
| Business metric | Engineering connection |
| --- | --- |
| GRR / NRR (customer retention) | Reliability, quality, user experience |
| CAC (cost to acquire customers) | Feature velocity: shipping faster reduces sales cycle |
| Time to market | Lead time for changes, deployment frequency |
| Support cost | Change failure rate, MTTR |
A practical translation example: "Our change failure rate dropped from 22% to 8% this quarter. That means fewer incidents, less time in firefighting mode, and fewer support escalations — which directly reduces support cost and improves retention."
The EM's job is to build this translation layer. Engineering metrics don't automatically tell the business story — you have to connect the dots explicitly and repeatedly.
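
When quoting a change like the one in the translation example, it helps to be clear whether the figure is in percentage points or relative terms, since leadership may hear "dropped 14%" either way. A small sketch, with `describe_rate_change` as a hypothetical helper name:

```python
def describe_rate_change(before: float, after: float) -> str:
    """Describe a change in a rate given as fractions (0.22 means 22%)."""
    if before <= 0:
        raise ValueError("before must be positive to compute a relative change")
    direction = "down" if after < before else "up"
    points = abs(after - before) * 100             # absolute change, in percentage points
    relative = abs(after - before) / before * 100  # change relative to the starting rate
    return f"{direction} {points:.0f} percentage points ({relative:.0f}% relative change)"


# The 22% to 8% change failure rate from the example above.
print(describe_rate_change(0.22, 0.08))
# prints "down 14 percentage points (64% relative change)"
```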

What Not to Do

  • Don't use metrics to evaluate individual developers. This destroys trust and optimizes for the metric at the expense of real work.
  • Don't report raw velocity. It measures estimated effort, not output. Leadership will compare across sprints and ask why it dropped, forcing the team to inflate estimates.
  • Don't pick a framework and implement all of it at once. Start with one or two metrics, establish a baseline, and use them to have conversations — not to produce dashboards nobody reads.
  • Don't treat metrics as a substitute for judgment. A team with perfect DORA scores can still be building the wrong thing. Metrics measure delivery health, not direction.

Dive Deeper

If the user asks where a framework came from, wants to read the original article, or wants more context on any topic in this skill, read references/sources.md for the full list of source articles (with links) and books.

Related Skills

  • team-health: Productivity friction and DevEx signals often surface in team health conversations
  • roadmap-planning: Delivery metrics inform capacity planning and deadline discussions
  • meetings: Flow state is the DevEx dimension most directly affected by meeting culture