measure-okr-grader

OKR Grader

An OKR Cycle Review is a backward-looking artifact that closes the loop on a completed OKR set. It scores each KR against its baseline and target, separates committed from aspirational interpretation, surfaces what evidence does and does not support, names what the team learned, and prepares input for next-cycle drafting. Done well, a cycle review protects the integrity of the OKR operating system by refusing to dress up missed commitments as aspirational stretch, refusing to celebrate effort over outcome, and refusing to let scoring carry weight it cannot bear.

This skill is an evidence interpreter, not an arithmetic engine. Its job is to read final KR values, compare them against the original OKR set's intent, and produce a review that names the learning honestly. It enforces the empirical scoring conventions drawn from Doerr (*Measure What Matters*), Wodtke (*Radical Focus*), Castro (committed vs aspirational interpretation), Grove (*High Output Management*), and the OKR community's accumulated practice on misuse failure modes. It pairs with `foundation-okr-writer` (which produced the OKR set being scored) and hands off the learnings produced here to the iterate skills that consume them.

When to Use

- The OKR cycle has ended (or you are scoring a partial-cycle close)
- You have final or interim KR values, baselines, and targets
- Stakeholders need a clear review with score, evidence, and learning
- The team is deciding what to continue, stop, change, or carry forward
- There is disagreement about whether a score is good or bad
- Evidence quality across KRs is uneven and needs to be made visible

When NOT to Use

- You are still drafting OKRs → use `/okr-writer`
- You want a generic team retro → use `/retrospective`
- You are reporting a single experiment result → use `/experiment-results`
- You need a stakeholder progress update without scoring → use `/stakeholder-update`
- The OKR set was never agreed on or never tracked → scoring requires an authored set; backfill via `/okr-writer` first
- You want to use scores to evaluate individuals → the skill refuses this

Instructions

When asked to score completed OKRs, follow these steps:

1. Validate scoring readiness. Check inputs: original OKR set, cycle dates, final KR values (or interim values for partial-close), baselines, targets, evidence sources, and OKR types (committed | aspirational | learning | operational_health | compliance_or_safety). If a value is missing, mark it explicitly (`not-yet-observable`, `not-instrumented`, `not-supplied`); never fabricate. Refuse to grade KRs whose original definitions are missing entirely.
2. Classify each KR's type and indicator class. The OKR type is one of `committed | aspirational | learning | operational_health | compliance_or_safety` (the five values produced by `foundation-okr-writer`). The indicator class is one of `leading | lagging | guardrail | health | evidence_generation`. Carry both forward from the original OKR set, or assign defaults if the original set did not specify. The OKR type determines the scoring convention: `aspirational` uses the 0.6 to 0.7 sweet spot; `committed` targets 1.0; `compliance_or_safety` is binary; `operational_health` is pass | fail | drift-within-tolerance against a threshold band; `learning` grades by validated or invalidated rather than by score. The indicator class adds independent rules that apply on top of the type's scoring (see Step 3).
3. Score each KR. For each KR, compute or assign a score using the convention for its OKR type:
   - `aspirational` KR: numeric score = (actual - baseline) / (target - baseline). Sweet spot is 0.6 to 0.7.
   - `committed` KR: pass or fail against the target. Anything below 1.0 is a miss.
   - `compliance_or_safety` KR: binary. Met or not met. No partial credit. No retroactive scope shrinkage when coverage is partial; mark as not-yet-fully-observable instead.
   - `operational_health` KR: pass | fail | drift-within-tolerance against the threshold band.
   - `learning` KR: validated, invalidated, partially-validated, or insufficient-evidence. No numeric score.

   Then apply indicator-class rules independently of the OKR type:
   - any KR with indicator class `guardrail` is reported as its own signal and is NEVER averaged into the primary objective score, regardless of its OKR type. A failed guardrail does not dilute a high primary KR score.

   For each score, state the calculation or rationale and the evidence confidence (high | medium | low | unknown).
4. Interpret the objective score. Avoid naive averaging when one KR is a guardrail, compliance threshold, or learning KR. Produce a qualitative read of the objective alongside any rough numeric average. State explicitly what the score does and does not mean.
5. Assess evidence quality. For each KR, name the evidence's reliability and any caveats (instrumentation gaps, target shifts mid-cycle, cohort definition changes, measurement window mismatches, sample-size limitations). Recommend fixes for next cycle's measurement plan.
6. Review initiatives as bets. For each initiative the team ran, name which KR it was expected to move, whether it shipped, what its apparent contribution was, and whether the evidence supports continuing, retiring, or reworking it. Use Castro's "initiatives are bets, not commitments" framing. Separate ship-status from KR-impact; an initiative that shipped on time but did not move its KR is not a partial win.
7. Synthesize learning. Capture validated assumptions, invalidated assumptions, surprises, and decision implications. Distinguish between learnings about the customer or product (carry forward), learnings about team process (hand to `/retrospective`), and learnings about measurement (hand to `/instrumentation-spec` or `/dashboard-requirements`).
8. Prepare next-cycle recommendations. For each objective: continue, revise, retire, or escalate. Suggest candidate next-cycle OKRs or open questions for `/okr-writer`. Hand off measurement gaps to `/dashboard-requirements` or `/instrumentation-spec`. Hand off assumption tests to `/hypothesis`. Hand off team-process work to `/retrospective`. Hand off organizational memory to `/lessons-log`. Hand off next-cycle drafting to `/okr-writer`.
9. Surface risks in interpretation. Make explicit any places the score could mislead a reader: forced numeric scores on KRs that are not yet observable, confounded initiative results, stakeholder framings that under-state evidence, single-cycle results that need a second cycle of confirmation.
10. Note the source of truth. The artifact is a review document, not the canonical OKR system. Include a `source_of_truth` field pointing to the original OKR tracker.
11. Finalize for direct use. Remove all skill instruction commentary from the final artifact. The final output should be reader-facing.
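
The readiness check in the first step can be pictured as a small gate that runs before any scoring. The sketch below is illustrative only: the `KeyResult` fields and the `check_readiness` function are hypothetical names, and treating a missing baseline or target as `not-supplied` is an assumption rather than something the skill specifies.

```python
from dataclasses import dataclass
from typing import Optional

# Missing-value markers named in the readiness step.
MISSING_MARKERS = {"not-yet-observable", "not-instrumented", "not-supplied"}


@dataclass
class KeyResult:
    # Hypothetical record shape; the skill does not prescribe field names.
    name: str
    baseline: Optional[float] = None
    target: Optional[float] = None
    actual: Optional[float] = None
    missing_reason: Optional[str] = None  # expected to be one of MISSING_MARKERS
    has_original_definition: bool = True


def check_readiness(kr: KeyResult) -> str:
    """Return 'ready', one of the missing-value markers, or 'refuse'."""
    if not kr.has_original_definition:
        # KRs whose original definition is missing entirely are not graded.
        return "refuse"
    if kr.baseline is None or kr.target is None:
        # Assumption: an absent baseline or target is reported as not-supplied.
        return "not-supplied"
    if kr.actual is None:
        # Never fabricate a value; report why it is missing instead.
        if kr.missing_reason in MISSING_MARKERS:
            return kr.missing_reason
        return "not-supplied"
    return "ready"
```

For example, a 90-day retention KR whose window has not closed would be passed in with `actual=None` and `missing_reason="not-yet-observable"`, and the gate returns that marker rather than a number.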

Constraint Rules (MUST / MUST NOT)

These rules are non-negotiable. The skill enforces them in every grading run.

- MUST NOT retroactively change baselines, targets, or KR definitions. If the team adjusted these mid-cycle, document the change explicitly and grade against both the original and adjusted versions.
- MUST NOT retroactively shrink the scope of a `committed` or `compliance_or_safety` KR to mark partial coverage as a pass. If the original commitment named 3 healthcare accounts and only 1 has been audited, the KR is `not-yet-fully-observable`. The 1-account result is a sub-signal, not the KR score.
- MUST NOT treat 0.7 as success for `committed`, `compliance_or_safety`, or `operational_health` KRs. Those target 1.0 (or the threshold band).
- MUST NOT average away a failed guardrail. A failed guardrail is a separate signal that does not get diluted by the primary KR's success.
- MUST NOT equate effort with impact. Initiatives that shipped on time but failed to move their KR are not partial wins.
- MUST NOT use OKR scores as individual performance ratings or compensation inputs. If the user requests this, refuse and explain the sandbagging and learning-suppression risks.
- MUST NOT punish honest stretch when aspirational intent was explicit and disclosed at OKR-writing time. A 0.6 aspirational score is the designed sweet spot.
- MUST NOT celebrate missed committed goals as ambitious failure. Committed misses are misses.
- MUST mark any not-yet-observable KR explicitly (e.g., a 90-day retention cohort whose window extends past cycle close). Forced numeric scores on not-yet-observable KRs are misleading.
- MUST include evidence confidence on every KR score (high | medium | low | unknown).
- MUST NOT become the canonical source of truth. Always include a `source_of_truth` pointer to the user's actual OKR tracker.
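
The first rule asks for a KR to be graded against both the original and the adjusted definition when a target moved mid-cycle. A minimal sketch of that dual grading for an aspirational KR follows; it uses the skill's ratio (actual - baseline) / (target - baseline), while the function name `grade_against_both`, the returned keys, and the sample numbers are hypothetical.

```python
def grade_against_both(baseline: float, original_target: float,
                       adjusted_target: float, actual: float) -> dict:
    """Score one aspirational KR against the original and the adjusted target."""

    def ratio(target: float) -> float:
        if target == baseline:
            raise ValueError("target must differ from baseline")
        return round((actual - baseline) / (target - baseline), 2)

    return {
        "original": ratio(original_target),
        "adjusted": ratio(adjusted_target),
        # Surfacing the change keeps it from being silently absorbed.
        "target_changed_mid_cycle": original_target != adjusted_target,
    }


# Made-up numbers: baseline 100, target lowered from 200 to 150, actual 140.
result = grade_against_both(100, 200, 150, 140)
print(result)
```

With these illustrative inputs the original definition scores 0.4 and the adjusted one 0.8, which is exactly the gap a reader needs to see side by side.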

Scoring Rules

The skill applies these conventions to every cycle review. The convention follows the OKR type, not the team's preference at grading time. OKR type and indicator class are independent dimensions; type controls scoring, indicator class adds reporting rules.

OKR types determine the scoring convention:

- `aspirational`: numeric score on a 0 to 1 scale = (actual - baseline) / (target - baseline). Sweet spot is 0.6 to 0.7. Below 0.4 is a miss; above 0.8 over multiple cycles suggests sandbagged targets needing recalibration.
- `committed`: pass or fail against the target. Anything below 1.0 is a miss requiring postmortem. Do not soften with aspirational interpretation.
- `compliance_or_safety`: binary. Met or not met. No partial credit. No retroactive scope shrinkage. If the committed scope is only partially observable (some audits pending, some accounts deferred), mark the KR as `not-yet-fully-observable`; the observed subset is a sub-signal, not the KR score.
- `operational_health`: pass | fail | drift-within-tolerance against the threshold band.
- `learning`: validated | invalidated | partially-validated | insufficient-evidence. No numeric score.
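
As a rough illustration of how the type decides the convention, the sketch below scores an aspirational KR with the ratio above and a committed KR as pass or fail. The function names, the `higher_is_better` flag, and the "outside named bands" label are assumptions made for the example; the ratio is left unclamped because the rules do not say how to treat overshoot past the target.

```python
def score_aspirational(baseline: float, target: float, actual: float) -> float:
    """Ratio of progress made to progress planned; not clamped to 0..1."""
    if target == baseline:
        raise ValueError("target must differ from baseline")
    return (actual - baseline) / (target - baseline)


def read_aspirational(score: float) -> str:
    """Map a score to the bands named in the rules above."""
    if score < 0.4:
        return "miss"
    if 0.6 <= score <= 0.7:
        return "sweet spot"
    if score > 0.8:
        return "possible sandbagging if repeated across cycles"
    # The rules leave 0.4 to 0.6 and 0.7 to 0.8 unnamed; this label is an assumption.
    return "outside named bands"


def score_committed(target: float, actual: float,
                    higher_is_better: bool = True) -> str:
    """Committed KRs are pass or fail; anything short of the target is a miss."""
    met = actual >= target if higher_is_better else actual <= target
    return "pass" if met else "fail"


# Made-up numbers: baseline 20, target 40, actual 33 gives 13 / 20.
score = score_aspirational(20, 40, 33)
print(round(score, 2), read_aspirational(score))
print(score_committed(target=100, actual=99))
```

The same 99-of-100 result that would look excellent on an aspirational KR is a fail on a committed one, which is the point of keying the convention to the type rather than to the number.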

Indicator class rules apply on top of the OKR type's scoring:

- indicator class `guardrail`: the KR is scored per its OKR type, and additionally is reported as its own signal, never averaged into the primary objective score. A failed guardrail does not dilute a high primary KR score, regardless of whether the guardrail itself is `committed`, `aspirational`, `operational_health`, or `compliance_or_safety`.
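
One way to keep a guardrail out of the average is to split the KR list before any arithmetic happens. The sketch below assumes each KR is a plain dict with hypothetical keys `name`, `indicator_class`, `score`, and `status`; the rough average is only a companion to the qualitative read, never a substitute for it.

```python
from statistics import mean


def objective_read(krs: list) -> dict:
    """Average only non-guardrail numeric scores; report guardrails separately."""
    primary = [
        k["score"] for k in krs
        if k["indicator_class"] != "guardrail" and k["score"] is not None
    ]
    guardrails = [k for k in krs if k["indicator_class"] == "guardrail"]
    return {
        "primary_average": round(mean(primary), 2) if primary else None,
        # Each guardrail stays visible as its own signal.
        "guardrail_signals": [(g["name"], g["status"]) for g in guardrails],
    }


# Made-up KRs for illustration.
krs = [
    {"name": "activation", "indicator_class": "lagging", "score": 0.7, "status": "scored"},
    {"name": "adoption", "indicator_class": "leading", "score": 0.6, "status": "scored"},
    {"name": "unsubscribe rate", "indicator_class": "guardrail", "score": None, "status": "fail"},
]
print(objective_read(krs))
```

The failed guardrail appears next to the primary average instead of inside it, so a strong primary result cannot hide it.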

Special states:

- `not-yet-observable`: score deferred. Do not force a numeric score; mark interim signal and projected score with explicit confidence and the date the final score becomes available.
- `not-yet-fully-observable`: a `committed` or `compliance_or_safety` KR with partial coverage. Score the KR as deferred until full coverage is observable. Do NOT promote a sub-signal to a KR-level pass.
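
The partial-coverage state can be sketched as a check on how much of the committed scope has actually been observed. The function name `coverage_state`, its parameters, and the returned keys are hypothetical; the logic follows the rule that an observed subset is only a sub-signal.

```python
def coverage_state(committed_scope: int, observed: int,
                   observed_passing: int) -> dict:
    """Defer the KR score until the whole committed scope has been observed."""
    if observed < committed_scope:
        return {
            "kr_score": "deferred",
            "state": "not-yet-fully-observable",
            # The observed subset is reported, but never promoted to a pass.
            "sub_signal": f"{observed_passing}/{observed} observed units passing",
        }
    met = observed_passing == committed_scope
    return {
        "kr_score": "met" if met else "not met",
        "state": "fully-observable",
        "sub_signal": None,
    }


# The healthcare example from the constraint rules: 3 accounts committed, 1 audited.
print(coverage_state(committed_scope=3, observed=1, observed_passing=1))
```

Only when all three audits are in does the KR resolve to met or not met, with no partial credit in between.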

Anti-Patterns the Skill Detects

The skill scans for these and either flags or refuses:

- Retroactive target adjustment (we hit it because we changed the target) → document the change; grade against both definitions
- Retroactive scope shrinkage on a `committed` or `compliance_or_safety` KR (committed to 3 healthcare audits, 1 audit completed, scored as "pass on in-scope") → refuse and mark not-yet-fully-observable
- Average-the-guardrail-away (a failed guardrail dissolved into a high primary score) → separate the guardrail signal
- Aspirational-grading-of-committed (treating 0.7 as success on a committed KR) → refuse and explain
- Effort-equals-impact (initiative shipped, score did not move, scored as partial win) → separate ship-status from KR-impact
- Compensation coupling (using the score for performance reviews) → refuse and explain
- Missed-committed-as-stretch (we did not quite hit the contractual deadline but the team really tried) → refuse the framing
- Sandbagged target (consistently scoring above 0.85 on aspirational targets) → flag for next-cycle target recalibration
- Forced score on not-yet-observable (giving a numeric score to a KR whose 90-day window has not closed) → mark deferred
- Initiative-as-cause-without-evidence (claiming Initiative X drove KR Y when timing or instrumentation cannot support it) → separate apparent contribution from causal claim
- Hidden low-confidence (precise numeric scores with weak evidence) → surface confidence; do not let precision mask uncertainty
- Stakeholder narrative override (a leader's preferred framing taking precedence over the evidence) → the grader's read is independent of stakeholder framing
- Single-cycle confirmation (treating one cycle's signal as proof) → recommend a second cycle when the evidence is suggestive but not robust
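
Of these, the sandbagged-target check is the easiest to express mechanically. The sketch below flags an aspirational KR when its most recent cycle scores all exceed the 0.85 threshold quoted in the list. Reading "consistently" as at least two consecutive cycles is an assumption, as are the function and parameter names.

```python
def flag_sandbagging(scores_by_cycle: list, threshold: float = 0.85,
                     min_cycles: int = 2) -> bool:
    """True when the last `min_cycles` aspirational scores all exceed the threshold."""
    recent = scores_by_cycle[-min_cycles:]
    # A single high cycle is not treated as a pattern.
    return len(recent) >= min_cycles and all(s > threshold for s in recent)


# Made-up score histories, oldest cycle first.
print(flag_sandbagging([0.9, 0.95]))
print(flag_sandbagging([0.9, 0.7]))
```

A flag here is a prompt to recalibrate next cycle's target, not a verdict on the team.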

Output Contract (v1.0.0)

- All required sections present in canonical order: Summary, Scorecard, Objective Interpretation, Evidence Quality, Initiative Review, Learning, Next-cycle Recommendations, Risks in Interpretation
- Every KR in the Scorecard includes: actual value (or `not-yet-observable` / `not-yet-fully-observable` marker), score using the type-appropriate convention, evidence confidence, interpretation
- `aspirational` KRs use the 0 to 1 numeric scale; `committed` KRs are pass or fail; `compliance_or_safety` KRs are binary; `operational_health` KRs are pass | fail | drift-within-tolerance; `learning` KRs use validated or invalidated language
- KRs with indicator class `guardrail` are surfaced separately and never averaged into the primary objective score, regardless of OKR type
- Partial coverage on a `committed` or `compliance_or_safety` KR is marked `not-yet-fully-observable`, not `pass-on-in-scope`
- Source-of-truth note is present and points to a non-skill location
- Hand-off section names specific downstream skills for learnings, team-process work, assumption tests, and measurement gaps
- Markdown-only output. No JSON.
- Measure phase classification: `phase: measure` in frontmatter; no `classification:` field
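
The canonical-order requirement lends itself to a quick automated check on a finished review. The sketch below assumes each required section appears as a Markdown heading line beginning with `#`; the contract does not state a heading level, and the `check_sections` function is a hypothetical helper rather than part of the skill.

```python
REQUIRED_SECTIONS = [
    "Summary",
    "Scorecard",
    "Objective Interpretation",
    "Evidence Quality",
    "Initiative Review",
    "Learning",
    "Next-cycle Recommendations",
    "Risks in Interpretation",
]


def check_sections(markdown: str) -> list:
    """Return a list of problems; an empty list means the order check passed."""
    headings = [
        line.lstrip("#").strip()
        for line in markdown.splitlines()
        if line.startswith("#")
    ]
    found = [h for h in headings if h in REQUIRED_SECTIONS]
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in found]
    # Compare the order found against the canonical order of the same sections.
    expected_order = [s for s in REQUIRED_SECTIONS if s in found]
    if found != expected_order:
        problems.append("sections out of canonical order")
    return problems


# A skeleton review with every section in order yields no problems.
skeleton = "\n".join(f"## {name}" for name in REQUIRED_SECTIONS)
print(check_sections(skeleton))
```

A check like this covers only the structural part of the contract; the per-KR requirements still need a human or a richer parser.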

Quality Checklist

Before finalizing, verify:

- Every KR has a final value, an explicit `not-yet-observable` marker, or an explicit `not-yet-fully-observable` marker (for partial coverage on `committed` or `compliance_or_safety` KRs)
- Every KR has an evidence confidence rating
- Every KR's score uses the convention for its OKR type from the canonical enum: `committed | aspirational | learning | operational_health | compliance_or_safety`
- `guardrail` is treated as indicator class, not as an OKR type
- KRs with indicator class `guardrail` are surfaced separately and never averaged into the primary score
- No retroactive target changes are silently absorbed
- No retroactive scope shrinkage on `committed` or `compliance_or_safety` KRs (partial coverage is `not-yet-fully-observable`, not `pass-on-in-scope`)
- No committed KR is graded as aspirational
- No effort-equals-impact framing on initiatives
- No compensation-coupled framing
- Risks-in-interpretation section names where the score could mislead a reader
- Hand-off section names specific downstream skills with rationale
- Source-of-truth note present
- Skill instruction commentary removed from final artifact
- Markdown only; no JSON output

Examples

See `references/EXAMPLE.md` for a completed cycle review in the Storevine sample thread (Campaigns team, Q3 2026 close), demonstrating aspirational scoring with one KR not-yet-observable, a held guardrail, and a templates-as-retention-driver thesis invalidation. The companion `foundation-okr-writer` skill produces the OKR sets this skill scores; together they cover the full quarterly arc.
