agents-best-practices-harness-design

agents-best-practices Skill

Skill by ara.so — AI Agent Skills collection.

This skill provides provider-neutral best practices for designing, auditing, and refactoring agentic harnesses: the control plane around a model that validates, authorizes, executes, and observes tool calls. It applies to agents for coding, research, support, operations, finance, legal, healthcare, education, and workflow automation.

Core principle: The model proposes actions; the harness validates, authorizes, executes, records, and returns observations.

What This Skill Does

Generate MVP agent blueprints for new domains with typed tools, permissions, and launch gates
Audit existing agent harnesses for brittle loops, unbounded tools, missing budgets, and observability gaps
Design tools and permissions with risk-appropriate approval gates
Structure planning mode and goal-like loops with checkpoints and budgets
Build context and memory strategies that preserve active state across compaction
Optimize prompt caching and cost telemetry
Integrate skills, MCP, and external connectors with progressive disclosure
Implement security, evals, and observability for production readiness

Installation

Option A: Via `skills` CLI (Recommended)

bash

npx skills add DenisSergeevitch/agents-best-practices -g

The `-g` flag installs globally for all projects.

Option B: Manual Install

For Codex:

bash

mkdir -p "${CODEX_HOME:-$HOME/.codex}/skills"
git clone https://github.com/DenisSergeevitch/agents-best-practices.git \
  "${CODEX_HOME:-$HOME/.codex}/skills/agents-best-practices"

For Claude Code (user-level):

bash

mkdir -p "$HOME/.claude/skills"
git clone https://github.com/DenisSergeevitch/agents-best-practices.git \
  "$HOME/.claude/skills/agents-best-practices"

For Claude Code (project-level):

bash

mkdir -p .claude/skills
git clone https://github.com/DenisSergeevitch/agents-best-practices.git \
  .claude/skills/agents-best-practices

Verification

After install, verify the skill is discoverable:

bash

# Codex
ls "${CODEX_HOME:-$HOME/.codex}/skills/agents-best-practices"

# Claude Code
ls "$HOME/.claude/skills/agents-best-practices"

You should see `SKILL.md`, `README.md`, `icon.jpeg`, and `references/`.

---

Repository Structure

agents-best-practices/
├── SKILL.md                                  # skill entrypoint (this file)
├── README.md                                 # public-facing overview
├── icon.jpeg                                 # skill icon
└── references/
    ├── mvp-agent-blueprint.md                # MVP harness generator
    ├── architecture.md                       # component model
    ├── agentic-loop.md                       # loop invariants and budgets
    ├── tools-and-permissions.md              # typed tools and risk classes
    ├── planning-and-goals.md                 # planning mode and long-running goals
    ├── context-memory-compaction.md          # context, memory, retrieval
    ├── prompt-caching-and-cost.md            # cache-aware context layout
    ├── skills-and-connectors.md              # Agent Skills, MCP, connectors
    ├── system-prompts-instructions.md        # instruction hierarchy
    ├── provider-api-patterns.md              # OpenAI, Anthropic, compatible APIs
    ├── security-evals-observability.md       # guardrails, tracing, evals
    ├── agent-legibility-feedback-loops.md    # artifacts and cleanup
    ├── checklists.md                         # implementation and audit checklists
    ├── coverage-audit.md                     # topic coverage verification
    └── source-links.md                       # official references

Core Concepts

1. The Agentic Loop

Every agent follows this pattern:

user/task → context builder → model call → typed tool call
→ schema validation → permission check → execution or pause
→ structured observation → next step or final answer

Key invariants:

Every tool call gets a result (success, denial, timeout, malformed, abort)
Risk changes the loop (reads vs. drafts vs. writes vs. external communications)
Budgets prevent runaway loops (steps, time, tokens, cost, tool calls)
Active state survives compaction
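
The invariants above can be sketched as a small loop. This is a minimal, synchronous illustration under stated assumptions, not the skill's reference implementation: `runLoop`, `ModelStep`, `Observation`, and the `propose` and `execute` callbacks are hypothetical names, and a real harness would make the model and tool calls asynchronous.

```typescript
// Hypothetical types for illustration; none of these names come from the skill itself.
type ObservationStatus = "success" | "denied" | "timeout" | "malformed" | "abort";

interface Observation {
  tool: string;
  status: ObservationStatus;
  data?: unknown;
}

type ModelStep =
  | { kind: "tool_call"; tool: string; args: unknown }
  | { kind: "final"; answer: string };

interface LoopResult {
  termination: "success" | "budget_exhausted";
  steps: number;
  observations: Observation[];
  answer?: string;
}

function runLoop(
  propose: (history: Observation[]) => ModelStep,
  execute: (tool: string, args: unknown) => Observation,
  maxSteps: number
): LoopResult {
  const observations: Observation[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const next = propose(observations);
    if (next.kind === "final") {
      return { termination: "success", steps: step, observations, answer: next.answer };
    }
    let observation: Observation;
    try {
      observation = execute(next.tool, next.args);
    } catch {
      // Invariant: even a crashing tool yields an observation the model can see.
      observation = { tool: next.tool, status: "abort" };
    }
    observations.push(observation);
  }
  // Invariant: the step budget ends the loop with an explicit termination reason.
  return { termination: "budget_exhausted", steps: maxSteps, observations };
}
```

A model that never stops proposing tool calls ends with `termination: "budget_exhausted"` after `maxSteps` observations rather than running indefinitely.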

Reference: `references/agentic-loop.md`

2. Tools and Permissions

Risk classes determine permission requirements:

| Risk Class | Examples | Permission |
| --- | --- | --- |
| `read_private_data` | Read CRM, support tickets | Autonomous with scope |
| `draft_external_message` | Draft email, Slack message | Autonomous with label |
| `write_database` | Update record | Approval gate |
| `external_communication` | Send email, post to Slack | Approval gate |
| `destructive_action` | Delete, archive | Approval gate |
| `privileged_access` | Admin tools, deploy | Approval gate |
| `financial_operation` | Charge card, transfer funds | Approval gate |
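
One way to turn the table above into a harness check is a lookup that fails closed. The sketch below is an assumption about how this could be wired, not code from the skill: `decidePermission`, `PermissionDecision`, and `AUTONOMOUS` are hypothetical names, while the risk class strings are taken from the table.

```typescript
// Risk classes copied from the table; the decision logic is an illustrative assumption.
type RiskClass =
  | "read_private_data"
  | "draft_external_message"
  | "write_database"
  | "external_communication"
  | "destructive_action"
  | "privileged_access"
  | "financial_operation";

type PermissionDecision =
  | { action: "execute"; condition: "scoped" | "labeled" }
  | { action: "pause_for_approval" };

// Only these classes may run without a human in the loop.
const AUTONOMOUS: Partial<Record<RiskClass, "scoped" | "labeled">> = {
  read_private_data: "scoped",
  draft_external_message: "labeled",
};

// Anything not explicitly autonomous falls through to the approval gate,
// so a newly added risk class is gated by default.
function decidePermission(risk: RiskClass): PermissionDecision {
  const condition = AUTONOMOUS[risk];
  return condition
    ? { action: "execute", condition }
    : { action: "pause_for_approval" };
}
```

Listing the autonomous classes rather than the gated ones means a forgotten entry produces an extra approval prompt instead of an unreviewed write.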

Pattern: Typed Tools

typescript

// Good: Narrow, typed, deterministic
interface SendCustomerEmailTool {
  name: "send_customer_email";
  parameters: {
    account_id: string;
    template: "renewal_reminder" | "upgrade_offer" | "support_followup";
    variables: Record<string, string>;
  };
  permission: "approval_gate";
}

// Bad: Generic, untyped, unbounded
interface SendMessageTool {
  name: "send_message";
  parameters: {
    to: string;
    body: string;
  };
}

Reference: `references/tools-and-permissions.md`

3. Planning and Goals

Planning mode separates thinking from acting:

typescript

interface PlanningResult {
  plan: string;              // What the agent intends to do
  required_approvals: string[];  // Tools needing human approval
  estimated_steps: number;
  estimated_cost_usd: number;
  risk_summary: string;
}

// User approves the plan, then agent executes

Goal-like loops need:

Step budget (e.g., max 20 steps)
Time budget (e.g., 5 minutes)
Cost budget (e.g., $0.50)
Checkpoints (e.g., save state every 5 steps)
Termination reasons (success, budget, validation failure, user abort)
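
The budget list above can be tracked by one small object that the loop consults after every step. This is a minimal sketch under assumptions: `BudgetTracker`, `Budgets`, and the method names are hypothetical, and the clock is passed in as a millisecond timestamp so the behavior stays deterministic.

```typescript
type TerminationReason = "success" | "budget" | "validation_failure" | "user_abort";

interface Budgets {
  maxSteps: number;
  maxSeconds: number;
  maxCostUsd: number;
  checkpointEvery: number;
}

class BudgetTracker {
  private steps = 0;
  private costUsd = 0;
  private readonly budgets: Budgets;
  private readonly startedAtMs: number;

  constructor(budgets: Budgets, startedAtMs: number) {
    this.budgets = budgets;
    this.startedAtMs = startedAtMs;
  }

  // Record one completed step and what it cost.
  recordStep(stepCostUsd: number): void {
    this.steps += 1;
    this.costUsd += stepCostUsd;
  }

  // Returns "budget" once any limit is reached, otherwise null (keep going).
  check(nowMs: number): TerminationReason | null {
    const elapsedSeconds = (nowMs - this.startedAtMs) / 1000;
    const exhausted =
      this.steps >= this.budgets.maxSteps ||
      elapsedSeconds >= this.budgets.maxSeconds ||
      this.costUsd >= this.budgets.maxCostUsd;
    return exhausted ? "budget" : null;
  }

  // True on every Nth step, when the harness should persist its state.
  shouldCheckpoint(): boolean {
    return this.steps > 0 && this.steps % this.budgets.checkpointEvery === 0;
  }
}
```

With the example values from the list (20 steps, 300 seconds, $0.50, checkpoint every 5 steps), the loop would call `check(Date.now())` after each step and stop as soon as it returns a reason.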

Reference: `references/planning-and-goals.md`

4. Context and Memory

Context hierarchy:

typescript

interface AgentContext {
  // Stable, cache-friendly prefix
  system_instructions: string;
  skill_descriptions: string[];
  
  // Active state (outside prompt)
  plan: Plan | null;
  pending_approvals: Approval[];
  todos: Todo[];
  artifacts: Artifact[];
  
  // Recent conversation (compacted)
  messages: Message[];
  
  // Retrieved knowledge
  retrieved_docs: Document[];
}

Compaction rules:

Preserve active state (plan, approvals, todos, artifacts) outside the prompt
Summarize conversation, not decisions
Rehydrate from state, not chat history
Label trust boundaries (user, model, tool, external)
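
The compaction rules above can be illustrated with a function that only ever summarizes older messages and passes active state through untouched. The names here (`compact`, `ActiveState`, `CompactedContext`, the `summarize` callback) are hypothetical, and the state fields are simplified to strings for the sake of a self-contained sketch.

```typescript
type Source = "user" | "model" | "tool" | "external";

interface Message {
  source: Source; // trust-boundary label carried by every message
  text: string;
}

interface ActiveState {
  plan: string | null;
  pendingApprovals: string[];
  todos: string[];
  artifacts: string[];
}

interface CompactedContext {
  activeState: ActiveState;
  summary: string;
  recent: Message[];
}

function compact(
  messages: Message[],
  activeState: ActiveState,
  keepRecent: number,
  summarize: (older: Message[]) => string
): CompactedContext {
  const cut = Math.max(0, messages.length - keepRecent);
  const older = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return {
    activeState, // passed through as-is: state is never summarized
    summary: older.length > 0 ? summarize(older) : "",
    recent, // recent messages keep their source labels
  };
}
```

Because the plan and pending approvals never pass through `summarize`, a lossy summary can drop conversational detail without dropping decisions.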

Reference: `references/context-memory-compaction.md`

5. Prompt Caching

Cache-aware layout:

typescript

// Stable prefix (cached)
const systemPrefix = [
  systemInstructions,
  allSkillDescriptions,
  allToolSchemas,
  permanentExamples
];

// Dynamic suffix (not cached)
const dynamicSuffix = [
  retrievedDocs,
  recentMessages,
  currentTask
];

// OpenAI: caching is applied automatically to repeated prompt prefixes, so keep the stable content first
// Anthropic: mark the end of the stable prefix with a cache_control breakpoint

Cost telemetry:

typescript

interface ModelCallTelemetry {
  input_tokens: number;
  output_tokens: number;
  cached_tokens: number;
  cost_usd: number;
  cache_hit_rate: number;
}
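
As a rough illustration of how these fields could be filled in, the sketch below derives `cost_usd` and `cache_hit_rate` from token counts. It rests on two assumptions that are not stated in the source: `cached_tokens` is counted as a subset of `input_tokens` (providers that report cache reads separately would need the counts added first), and the `Pricing` values are made-up placeholders rather than real rates. `buildTelemetry` and `Pricing` are hypothetical names.

```typescript
interface ModelCallTelemetry {
  input_tokens: number;
  output_tokens: number;
  cached_tokens: number;
  cost_usd: number;
  cache_hit_rate: number;
}

// Hypothetical per-million-token prices; substitute your provider's real rates.
interface Pricing {
  inputPerMTok: number;
  cachedInputPerMTok: number;
  outputPerMTok: number;
}

function buildTelemetry(
  inputTokens: number,
  outputTokens: number,
  cachedTokens: number,
  pricing: Pricing
): ModelCallTelemetry {
  // Assumption: cachedTokens is included in inputTokens.
  const uncachedTokens = inputTokens - cachedTokens;
  const costUsd =
    (uncachedTokens * pricing.inputPerMTok +
      cachedTokens * pricing.cachedInputPerMTok +
      outputTokens * pricing.outputPerMTok) /
    1_000_000;
  return {
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    cached_tokens: cachedTokens,
    cost_usd: costUsd,
    cache_hit_rate: inputTokens > 0 ? cachedTokens / inputTokens : 0,
  };
}
```

Tracking `cache_hit_rate` per call makes it visible when a change to the stable prefix has silently invalidated the cache.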

Reference: `references/prompt-caching-and-cost.md`

6. Skills and Connectors

Progressive disclosure:

typescript

// Step 1: Load skill summaries (cached)
const skillIndex = [
  { name: "web-search", description: "Search the web..." },
  { name: "code-analysis", description: "Analyze codebases..." }
];

// Step 2: Load full skill when relevant
if (userNeedsWebSearch) {
  const webSearchSkill = loadSkill("web-search");
  context.push(webSearchSkill.fullInstructions);
}

MCP integration:

typescript

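// Expose MCP server tools as typed tools
// (in the official TypeScript SDK, listTools() resolves to an object with a `tools` array)
const { tools: mcpTools } = await mcpClient.listTools();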
for (const tool of mcpTools) {
  registerTool({
    name: tool.name,
    description: tool.description,
    parameters: tool.inputSchema,
    permission: classifyRisk(tool.name)
  });
}
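
The snippet above calls `classifyRisk` without defining it. One possible shape, offered as an assumption rather than the skill's own implementation, is a lookup against a human-reviewed registry that gates anything unknown. `REVIEWED_TOOLS` and its entries are hypothetical.

```typescript
type Permission = "autonomous" | "approval_gate";

// Hypothetical registry: each entry is reviewed by a human before the tool is enabled.
const REVIEWED_TOOLS = new Map<string, Permission>([
  ["search_linear_issues", "autonomous"],
  ["post_slack_message", "approval_gate"],
]);

function classifyRisk(
  toolName: string,
  reviewed: Map<string, Permission> = REVIEWED_TOOLS
): Permission {
  // Unknown or unreviewed connector tools default to the approval gate.
  return reviewed.get(toolName) ?? "approval_gate";
}
```

Classifying by reviewed name rather than by pattern-matching on the tool's own description keeps a remote server from talking its way into autonomous execution.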

Reference: `references/skills-and-connectors.md`

Common Patterns

Pattern 1: Generate MVP Agent Blueprint

User request:

Build an agent for account renewal risk. It should read CRM, support tickets,
and usage data, then draft renewal actions.

Agent response using this skill:

MVP Blueprint: Account Renewal Risk Agent

Harness level: Approval-gated Level 2

Core loop: user/task → context builder → model call → typed tool call → schema validation → permission check → execution or pause → structured observation → next step or final brief

Minimal tools:

- `read_account_profile` (permission: `read_private_data`)
  - Input: account_id
  - Output: { name, tier, arr, contract_end_date, renewal_probability }
- `list_support_tickets` (permission: `read_private_data`)
  - Input: account_id, days: 90
  - Output: Ticket[]
- `fetch_usage_summary` (permission: `read_private_data`)
  - Input: account_id, days: 30
  - Output: { active_users, feature_adoption, last_login }
- `draft_customer_email` (permission: `draft_external_message`)
  - Input: account_id, template, variables
  - Output: { subject, body, draft_id }
- `request_approval` (permission: `approval_gate`)
  - Input: action, rationale, draft_id
  - Output: { approval_id, status: "pending" }

Launch gate:

20 historical accounts processed
Trace review confirms no unapproved external sends
Human acceptance on ≥80% of draft actions
No unbounded tool results or timeout loops

Implementation checklist:

Typed tools with JSON schemas
Permission check before every execution
Step budget (max 15 steps per account)
Cost telemetry and budget ($0.10 per account)
Structured observations with trust labels
Eval: injection, timeout, missing result, budget exhaustion


Reference: `references/mvp-agent-blueprint.md`

---

Pattern 2: Audit Existing Agent

User request:

Our research agent sometimes runs tools forever and forgets why it made
a decision after context compaction. Audit the harness.

Agent response using this skill:

Audit: Research Agent Harness

Failure points identified:

No loop budgets (agentic-loop.md)
- Missing: step budget, time budget, cost budget
- Risk: Runaway loops, cost overruns
- Fix: Add `max_steps: 30`, `timeout_seconds: 300`, `max_cost_usd: 2.0`
Compaction loses active state (context-memory-compaction.md)
- Current: Summarizes chat history, drops plan and approvals
- Risk: Agent forgets what it's doing mid-task
- Fix: Store plan, approvals, todos, artifacts outside prompt; rehydrate on load
Tool results unbounded (tools-and-permissions.md)
- Current: Returns full API responses (some 50KB+)
- Risk: Context overflow, cost explosion
- Fix: Add `max_result_bytes: 4096` and structured summarization
No event trace (security-evals-observability.md)
- Current: Logs model input/output only
- Risk: Cannot debug tool call → execution → observation flow
- Fix: Add structured trace: `{ step, tool, args, result, duration, cost }`
Trust boundaries unlabeled (context-memory-compaction.md)
- Current: Mixes user input, model output, tool results
- Risk: Injection attacks, confused deputy
- Fix: Label every message: `source: "user" | "model" | "tool" | "external"`

Fix priority:

Add loop budgets and termination reasons
Move active state outside prompt
Bound tool results and add structured observations
Add event trace
Add evals for injection, timeout, budget exhaustion
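
The third fix, bounding tool results, could look roughly like the sketch below. It is an illustration under assumptions: `boundToolResult` and `BoundedResult` are hypothetical names, the limit is measured in UTF-8 bytes, and the structured summarization step mentioned in the audit is left out.

```typescript
interface BoundedResult {
  content: string;
  truncated: boolean;
  original_bytes: number;
}

function boundToolResult(raw: string, maxBytes: number): BoundedResult {
  const encoded = new TextEncoder().encode(raw);
  if (encoded.length <= maxBytes) {
    return { content: raw, truncated: false, original_bytes: encoded.length };
  }
  // Decode only the first maxBytes. A multi-byte character split at the
  // boundary decodes to U+FFFD, which is stripped from the end (along with
  // any genuine trailing U+FFFD, which is acceptable for a truncated result).
  const head = new TextDecoder()
    .decode(encoded.slice(0, maxBytes))
    .replace(/\uFFFD+$/, "");
  return { content: head, truncated: true, original_bytes: encoded.length };
}
```

Returning `truncated` and `original_bytes` alongside the content lets the model know that data was cut, so it can request a narrower query instead of reasoning over a silently partial result.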


References: `references/agentic-loop.md`, `references/context-memory-compaction.md`, `references/security-evals-observability.md`

---

Pattern 3: Design Tools and Permissions

User request:

The ops agent needs Slack, Linear, Google Drive, and our internal deploy API.
How should tools and permissions work?

Agent response using this skill:

Tool Design: Ops Agent

Risk classification:

| Tool | Risk Class | Permission |
| --- | --- | --- |
| `search_linear_issues` | `read_private_data` | Autonomous (scoped) |
| `read_gdrive_doc` | `read_private_data` | Autonomous (scoped) |
| `draft_slack_message` | `draft_external_message` | Autonomous (labeled) |
| `post_slack_message` | `external_communication` | Approval gate |
| `create_linear_issue` | `write_database` | Approval gate |
| `trigger_deploy` | `privileged_access` | Approval gate |

Tool schemas:

typescript

// Good: Narrow, typed
interface PostSlackMessageTool {
  name: "post_slack_message";
  parameters: {
    channel: string;  // Must match allow-list
    message: string;
    thread_ts?: string;
  };
  permission: "approval_gate";
}

// Bad: Generic, unbounded
interface SendMessageTool {
  name: "send_message";
  parameters: {
    platform: string;
    destination: string;
    content: string;
  };
}

Approval flow:

typescript

// Agent proposes
const proposal = {
  tool: "post_slack_message",
  args: { channel: "#incidents", message: "Deploy complete." },
  rationale: "Notify team of successful rollout."
};

// Harness pauses and stores
const approval = await requestApproval(proposal);

// Human reviews in UI
// On approval, harness executes and returns observation
const result = await executeWithApproval(approval.id);
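
The flow above leaves `requestApproval` and `executeWithApproval` undefined. A minimal in-memory sketch of the state they would need is shown below. It is synchronous and purely illustrative: `ApprovalStore` and its methods are hypothetical, and a production harness would persist records and resolve them from a review UI.

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected" | "executed";

interface Proposal {
  tool: string;
  args: Record<string, unknown>;
  rationale: string;
}

interface ApprovalRecord {
  id: string;
  proposal: Proposal;
  status: ApprovalStatus;
}

type ExecutionResult<T> =
  | { status: "success"; data: T }
  | { status: "denied"; reason: string };

class ApprovalStore {
  private records = new Map<string, ApprovalRecord>();
  private nextId = 1;

  // Store the proposal and pause: nothing runs yet.
  request(proposal: Proposal): ApprovalRecord {
    const record: ApprovalRecord = { id: `apr_${this.nextId++}`, proposal, status: "pending" };
    this.records.set(record.id, record);
    return record;
  }

  // Called when the human reviewer decides.
  decide(id: string, approved: boolean): void {
    const record = this.records.get(id);
    if (!record || record.status !== "pending") {
      throw new Error(`No pending approval ${id}`);
    }
    record.status = approved ? "approved" : "rejected";
  }

  // Runs the tool only for an approved record, and at most once.
  execute<T>(id: string, run: (proposal: Proposal) => T): ExecutionResult<T> {
    const record = this.records.get(id);
    if (!record || record.status !== "approved") {
      return { status: "denied", reason: `approval ${id} is ${record?.status ?? "unknown"}` };
    }
    record.status = "executed";
    return { status: "success", data: run(record.proposal) };
  }
}
```

Executing from the stored proposal rather than from fresh model output means the arguments that run are exactly the ones the reviewer saw.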

Connector governance:

typescript

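// Wrap external APIs as typed tools
import { WebClient } from "@slack/web-api";

// Structured observation returned to the model for every call
type ToolResult =
  | { status: "success"; data: unknown; metadata: { timestamp: number; cost_usd: number } }
  | { status: "denied" | "error"; reason: string };

class SlackConnector {
  constructor(
    private readonly slackClient: WebClient,
    private readonly allowedChannels: string[]
  ) {}

  async postMessage(channel: string, message: string): Promise<ToolResult> {
    // Validate channel against allow-list
    if (!this.allowedChannels.includes(channel)) {
      return { status: "denied", reason: "Channel not in allow-list" };
    }

    try {
      // Execute
      const response = await this.slackClient.chat.postMessage({
        channel,
        text: message
      });

      // Return structured observation
      return {
        status: "success",
        data: { ts: response.ts, channel: response.channel },
        metadata: { timestamp: Date.now(), cost_usd: 0 }
      };
    } catch (error) {
      // Failures are observations too: the model must see that the post did not go through
      return { status: "error", reason: error instanceof Error ? error.message : String(error) };
    }
  }
}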

References: `references/tools-and-permissions.md`, `references/skills-and-connectors.md`

Pattern 4: Production Readiness Checklist

User request:

What should I check before launching this agent?

Agent response using this skill:

Production Readiness Checklist

Runtime:

Loop budgets enforced (step, time, cost, tool calls)
Termination reasons logged (success, budget, validation, abort)
Retries with exponential backoff for transient failures
Active state (plan, approvals, todos, artifacts) outside prompt
Context compaction preserves active state

Tools:

Every tool has JSON schema and docstring
Risk classification for every tool
Permission check before every execution
Tool results bounded (max 4KB per result)
Structured observations with trust labels
No generic tools (execute_anything, send_message, write_database)

Security:

Input validation for all tool arguments
Approval gates for high-risk tools
Injection evals pass (prompt injection, jailbreak, PII leakage)
Sandboxing for code execution tools
Rate limits and abuse detection

Observability:

Structured event trace (step, tool, args, result, duration, cost)
Cost telemetry per task (input tokens, output tokens, cached tokens, USD)
Error categorization (validation, permission, execution, timeout)
Dashboards for cost, latency, success rate, approval rate

Evals:

Historical task replay (≥50 tasks)
Adversarial inputs (injection, confused deputy, unbounded loops)
Edge cases (empty results, timeouts, malformed args, missing approvals)
Human eval on ≥80% of high-risk tool calls

Launch gates:

Shadow mode with human review for 1 week
No unapproved external communications
No cost overruns (≤10% over estimate)
Incident response plan documented
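
For the "retries with exponential backoff" item in the Runtime checklist, the delay schedule itself is small enough to sketch. This version uses full jitter (a uniformly random delay up to the exponential ceiling), which is one common choice and an assumption here, not something the checklist prescribes. `backoffDelayMs` is a hypothetical helper, and the random source is injectable so the schedule can be verified.

```typescript
// attempt is 0-based: 0 for the first retry, 1 for the second, and so on.
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random
): number {
  // The ceiling doubles with each attempt until it reaches the cap.
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  // Full jitter: pick a delay uniformly in [0, ceiling).
  return Math.floor(random() * ceiling);
}
```

Retries belong only on failures categorized as transient, such as timeouts; validation and permission errors should be returned to the model as observations instead of being retried.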


References: `references/checklists.md`, `references/security-evals-observability.md`

---

Provider API Patterns

OpenAI (Compatible)

typescript

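import OpenAI from "openai";

// Assumed to be provided by the harness: systemInstructions, task, tools, executeToolWithPermission
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Keep the transcript in a variable so tool results can be appended for the next call
const messages: OpenAI.Chat.Completions.ChatCompletionMessageParam[] = [
  { role: "system", content: systemInstructions },
  { role: "user", content: task }
];

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages,
  tools: tools.map(t => ({
    type: "function" as const,
    function: {
      name: t.name,
      description: t.description,
      parameters: t.parameters
    }
  })),
  tool_choice: "auto"
});

// Handle tool calls
const assistantMessage = response.choices[0].message;
if (assistantMessage.tool_calls) {
  // The assistant message that requested the tools must precede their results
  messages.push(assistantMessage);

  for (const toolCall of assistantMessage.tool_calls) {
    // Only function tools are declared above
    if (toolCall.type !== "function") continue;

    const result = await executeToolWithPermission(
      toolCall.function.name,
      JSON.parse(toolCall.function.arguments)
    );

    messages.push({
      role: "tool",
      tool_call_id: toolCall.id,
      content: JSON.stringify(result)
    });
  }
}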

Anthropic

typescript

import Anthropic from "@anthropic-ai/sdk";

// Assumed to be provided by the harness: systemInstructions, task, tools, executeToolWithPermission
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

// Keep the transcript in a variable so tool results can be appended for the next call
const messages: Anthropic.MessageParam[] = [
  { role: "user", content: task }
];

const response = await client.messages.create({
  model: "claude-3-7-sonnet-20250219",
  max_tokens: 4096,
  system: [
    { type: "text", text: systemInstructions, cache_control: { type: "ephemeral" } }
  ],
  messages,
  tools: tools.map(t => ({
    name: t.name,
    description: t.description,
    input_schema: t.parameters
  }))
});

// Handle tool calls
if (response.stop_reason === "tool_use") {
  // Echo the assistant turn, then answer all of its tool_use blocks in a single user message
  messages.push({ role: "assistant", content: response.content });

  const toolResults: Anthropic.ToolResultBlockParam[] = [];
  for (const block of response.content) {
    if (block.type === "tool_use") {
      const result = await executeToolWithPermission(block.name, block.input);

      toolResults.push({
        type: "tool_result",
        tool_use_id: block.id,
        content: JSON.stringify(result)
      });
    }
  }
  messages.push({ role: "user", content: toolResults });
}

// Track cache metrics
console.log({
  input_tokens: response.usage.input_tokens,
  cache_read_tokens: response.usage.cache_read_input_tokens,
  cache_creation_tokens: response.usage.cache_creation_input_tokens
});

Reference: `references/provider-api-patterns.md`

Configuration Example

Harness config:

typescript

interface AgentConfig {
  model: {
    provider: "openai" | "anthropic" | "openai-compatible";
    model: string;
    base_url?: string;
    api_key_env: string;
  };
  
  loop: {
    max_steps: number;           // e.g., 30
    timeout_seconds: number;     // e.g., 300
    max_cost_usd: number;        // e.g., 1.0
    max_tool_calls_per_step: number;  // e.g., 5
  };
  
  context: {
    max_messages: number;        // e.g., 50
    compaction_threshold: number; // e.g., 40
    max_tool_result_bytes: number; // e.g., 4096
  };
  
  permissions: {
    auto_approve: string[];      // e.g., ["read_private_data", "draft_external_message"]
    require_approval: string[];  // e.g., ["external_communication", "destructive_action"]
  };
  
  observability: {
    trace_enabled: boolean;
    cost_tracking_enabled: boolean;
    eval_mode: "shadow" | "production";
  };
}

const config: AgentConfig = {
  model: {
    provider: "anthropic",
    model: "claude-3-7-sonnet-20250219",
    api_key_env: "ANTHROPIC_API_KEY"
  },
  loop: {
    max_steps: 30,
    timeout_seconds: 300,
    max_cost_usd: 1.0,
    max_tool_calls_per_step: 5
  },
  context: {
    max_messages: 50,
    compaction_threshold: 40,
    max_tool_result_bytes: 4096
  },
  permissions: {
    auto_approve: ["read_private_data", "draft_external_message"],
    require_approval: ["external_communication", "write_database", "destructive_action", "privileged_access", "financial_operation"]
  },
  observability: {
    trace_enabled: true,
    cost_tracking_enabled: true,
    eval_mode: "shadow"
  }
};
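
A config like this is easier to trust if it is checked once at startup. The sketch below shows one possible check over a subset of the fields above. `validateConfig`, `CheckedConfig`, and the specific rules (positive budgets, compaction threshold below the message cap, no risk class in both permission lists) are assumptions made for illustration, not part of the skill.

```typescript
// Hypothetical startup check for a subset of the AgentConfig fields above.
interface CheckedConfig {
  loop: {
    max_steps: number;
    timeout_seconds: number;
    max_cost_usd: number;
    max_tool_calls_per_step: number;
  };
  context: { max_messages: number; compaction_threshold: number };
  permissions: { auto_approve: string[]; require_approval: string[] };
}

const LOOP_KEYS = [
  "max_steps",
  "timeout_seconds",
  "max_cost_usd",
  "max_tool_calls_per_step",
] as const;

// Returns human-readable problems; an empty list means the config passed.
function validateConfig(cfg: CheckedConfig): string[] {
  const problems: string[] = [];

  // Every budget must be a positive, finite number, or the loop is effectively unbounded.
  for (const key of LOOP_KEYS) {
    const value = cfg.loop[key];
    if (!isFinite(value) || value <= 0) {
      problems.push(`loop.${key} must be a positive number`);
    }
  }

  // Compaction has to trigger before the hard message cap is reached.
  if (cfg.context.compaction_threshold >= cfg.context.max_messages) {
    problems.push("context.compaction_threshold must be below context.max_messages");
  }

  // A risk class listed in both arrays makes the approval policy ambiguous.
  for (const riskClass of cfg.permissions.auto_approve) {
    if (cfg.permissions.require_approval.indexOf(riskClass) !== -1) {
      problems.push(`permissions: "${riskClass}" is both auto-approved and gated`);
    }
  }

  return problems;
}
```

A harness could refuse to start when the returned list is non-empty, which turns a misconfigured budget into a startup failure rather than a runaway task.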

Troubleshooting

Issue: Agent loops forever

Symptoms: Agent exceeds step budget or timeout without completing task.

Diagnosis:

Check `references/agentic-loop.md` for loop invariants
Verify step budget, time budget, and termination conditions are enforced
Review tool results: are they bounded? Are failures properly observed?

Fix:

typescript

// Add hard budgets
if (step >= config.loop.max_steps) {
  return { status: "budget_exhausted", reason: "max_steps" };
}

if (Date.now() - startTime > config.loop.timeout_seconds * 1000) {
  return { status: "timeout", reason: "time_budget" };
}

// Add stop rules
if (allTodosComplete() || userAborted() || criticalError()) {
  return { status: "terminated", reason: ... };
}
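
The budget checks above only help if they run on every iteration. The following is a minimal sketch of a loop that does so. `checkBudgets`, `runLoop`, the `runStep` callback (which returns `true` when the task is done), and the injectable `now` clock are hypothetical names introduced here for illustration.

```typescript
interface LoopBudget {
  max_steps: number;
  timeout_seconds: number;
}

type BudgetStop =
  | { status: "budget_exhausted"; reason: "max_steps" }
  | { status: "timeout"; reason: "time_budget" };

type LoopOutcome = BudgetStop | { status: "completed"; steps: number };

// Pure check, so the stop rules can be unit-tested without running a model.
function checkBudgets(step: number, elapsedMs: number, budget: LoopBudget): BudgetStop | null {
  if (step >= budget.max_steps) {
    return { status: "budget_exhausted", reason: "max_steps" };
  }
  if (elapsedMs > budget.timeout_seconds * 1000) {
    return { status: "timeout", reason: "time_budget" };
  }
  return null;
}

// Runs steps until one reports completion or a budget stops the loop.
async function runLoop(
  budget: LoopBudget,
  runStep: (step: number) => Promise<boolean>,
  now: () => number = Date.now
): Promise<LoopOutcome> {
  const startTime = now();
  for (let step = 0; ; step++) {
    // Budgets are checked before each step, so a step never starts over budget.
    const stop = checkBudgets(step, now() - startTime, budget);
    if (stop) return stop;
    if (await runStep(step)) return { status: "completed", steps: step + 1 };
  }
}
```

Passing the clock as a parameter is a design choice made for this sketch: it lets a test simulate a timeout without waiting for real time to pass.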

Issue: Context compaction loses active work

Symptoms: Agent forgets plan, pending approvals, or todos after compaction.

Diagnosis:

Check `references/context-memory-compaction.md`
Verify active state is stored outside the prompt

Fix:

typescript

interface ActiveState {
  plan: Plan | null;
  pending_approvals: Approval[];
  todos: Todo[];
  artifacts: Artifact[];
}

// Store outside prompt
const state = loadState(taskId);

// Rehydrate after compaction
const context = buildContext({
  system: systemInstructions,
  plan: state.plan,
  todos: state.todos,
  messages: compactedMessages
});
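
The snippet above loads state but does not show where it was saved. As a rough sketch of the full round trip, the code below keeps active state in a store keyed by task ID and renders it back into text after compaction. The in-memory `Map`, the simplified `ActiveState` fields, and the `saveState` and `renderActiveState` helpers are assumptions for illustration; a real harness would use durable storage and richer types.

```typescript
// Simplified stand-ins for the ActiveState fields above.
interface ActiveState {
  plan: string | null;
  pending_approvals: string[];
  todos: { title: string; done: boolean }[];
}

// In-memory map used only for illustration; it does not survive a restart.
const store = new Map<string, string>();

function saveState(taskId: string, state: ActiveState): void {
  // Serialize so later mutation of the caller's object cannot change the stored copy.
  store.set(taskId, JSON.stringify(state));
}

function loadState(taskId: string): ActiveState {
  const raw = store.get(taskId);
  if (raw === undefined) {
    return { plan: null, pending_approvals: [], todos: [] };
  }
  return JSON.parse(raw) as ActiveState;
}

// Renders active state as a text block that is re-inserted after every compaction.
function renderActiveState(state: ActiveState): string {
  const openTodos = state.todos.filter((t) => !t.done).map((t) => `- ${t.title}`);
  return [
    `Plan: ${state.plan ?? "(none)"}`,
    `Pending approvals: ${state.pending_approvals.length}`,
    "Open todos:",
    ...openTodos,
  ].join("\n");
}
```

Because the rendered block is rebuilt from the store each time, compaction can summarize or drop old messages without touching the plan, approvals, or todos.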

Issue: Approval gates bypassed

Symptoms: High-risk tool executed without approval record.

Diagnosis:

Check `references/tools-and-permissions.md`
Verify permission check runs before every execution

Fix:

typescript

async function executeToolWithPermission(tool: string, args: any): Promise<ToolResult> {
  const permission = getToolPermission(tool);
  
  if (permission === "approval_gate") {
    const approval = await findApproval(tool, args);
    if (!approval || approval.status !== "approved") {
      return { status: "denied", reason: "Requires human approval" };
    }
  }
  
  // Execute only after permission check passes
  return await executeTool(tool, args);
}
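
The gate above is only as strong as the approval lookup: an approval granted for one set of arguments should not unlock a call with different arguments. One way to sketch that is to key each approval record by the tool name plus a canonical serialization of the arguments. `canonicalize`, `approvalKey`, the `ApprovalRecord` shape, and this synchronous `findApproval` are hypothetical and stand in for whatever the real harness uses.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Serializes a value with object keys sorted, so equal arguments always give equal text.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return "[" + value.map((item) => canonicalize(item)).join(",") + "]";
  }
  const keys = Object.keys(value).sort();
  return "{" + keys.map((k) => JSON.stringify(k) + ":" + canonicalize(value[k])).join(",") + "}";
}

interface ApprovalRecord {
  key: string;
  status: "pending" | "approved" | "denied";
}

function approvalKey(tool: string, args: Json): string {
  return tool + ":" + canonicalize(args);
}

// Finds an approval for this exact call; a record for other arguments never matches.
function findApproval(records: ApprovalRecord[], tool: string, args: Json): ApprovalRecord | undefined {
  const key = approvalKey(tool, args);
  return records.find((record) => record.key === key);
}
```

Sorting keys before serializing means the order in which the model emits argument fields does not affect the match, while any change to a value does.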

Issue: Cost explosion

Symptoms: Task costs 10x estimate; cached_tokens = 0.

Diagnosis:

Check `references/prompt-caching-and-cost.md`
Verify stable prefix is cached

Fix:

typescript

// OpenAI: prompt caching is automatic for long prompts that share a prefix,
// so no cache_control field is needed. Keep stable content first, dynamic last.
const messages = [
  { role: "system", content: stablePrefix },
  { role: "user", content: cachedKnowledge },
  { role: "user", content: dynamicTask }
];

// Anthropic: Cache system blocks
const system = [
  { type: "text", text: stablePrefix, cache_control: { type: "ephemeral" } }
];

// Track cache hit rate
if (cacheHitRate < 0.7) {
  console.warn("Low cache hit rate; review context layout");
}
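
The `cacheHitRate` value checked above is not defined in the snippet. A minimal sketch of how it could be derived from the Anthropic usage fields logged earlier in this document follows. It assumes `input_tokens` counts only uncached input, so total input is the sum of the three fields; `cacheHitRate` and `taskCacheHitRate` are illustrative helper names.

```typescript
// Usage fields as logged from the Anthropic response earlier in this document.
interface Usage {
  input_tokens: number;
  cache_read_input_tokens?: number | null;
  cache_creation_input_tokens?: number | null;
}

// Fraction of input tokens served from cache for one request, in the range 0..1.
function cacheHitRate(usage: Usage): number {
  const read = usage.cache_read_input_tokens ?? 0;
  const created = usage.cache_creation_input_tokens ?? 0;
  const total = usage.input_tokens + read + created;
  return total === 0 ? 0 : read / total;
}

// Aggregates over a whole task, so a single cold first request does not dominate.
function taskCacheHitRate(usages: Usage[]): number {
  let read = 0;
  let total = 0;
  for (const usage of usages) {
    const r = usage.cache_read_input_tokens ?? 0;
    read += r;
    total += usage.input_tokens + r + (usage.cache_creation_input_tokens ?? 0);
  }
  return total === 0 ? 0 : read / total;
}
```

The first request of a task normally writes the cache and reads nothing from it, so comparing the task-level rate against the 0.7 threshold is likely to be less noisy than checking each request.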

Issue: Injection attack

Symptoms: Agent executes unintended tool calls from user input.

Diagnosis:

Check `references/security-evals-observability.md`

Verify input validation and trust labels

Fix:

typescript

// Label trust boundaries
const messages = [
  { role: "user", content: userInput, metadata: { source: "user", trusted: false } },
  { role: "assistant", content: modelOutput, metadata: { source: "model" } },
  { role: "tool", content: toolResult, metadata: { source: "tool", trusted: true } }
];

// Validate tool arguments against schema
// (validateSchema is assumed to return { valid, errors })
const validation = validateSchema(tool.parameters, args);
if (!validation.valid) {
  return { status: "validation_failed", errors: validation.errors };
}

// Run injection evals
await runEval("prompt_injection", testCases);
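
The argument validation step above can be sketched for the simplest case, a flat object of primitive fields. `validateArgs`, `FlatSchema`, and `ValidationResult` are hypothetical names, and a production harness would more likely use a full JSON Schema validator. The point of the sketch is the behavior: missing fields, wrong types, and unexpected fields are all rejected, so injected text cannot smuggle extra arguments into a call.

```typescript
type FieldType = "string" | "number" | "boolean";
type FieldMap = { [field: string]: FieldType };

interface FlatSchema {
  required: FieldMap;
  optional?: FieldMap;
}

interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// Own-property check, so names like "constructor" are not read from the prototype.
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function validateArgs(schema: FlatSchema, args: { [field: string]: unknown }): ValidationResult {
  const errors: string[] = [];
  const optional: FieldMap = schema.optional ?? {};

  for (const field of Object.keys(schema.required)) {
    if (!hasOwn(args, field)) {
      errors.push(`missing required field "${field}"`);
    }
  }

  for (const field of Object.keys(args)) {
    const expected: FieldType | undefined = hasOwn(schema.required, field)
      ? schema.required[field]
      : hasOwn(optional, field)
        ? optional[field]
        : undefined;
    if (expected === undefined) {
      // Fields the schema does not declare are rejected rather than ignored.
      errors.push(`unexpected field "${field}"`);
    } else if (typeof args[field] !== expected) {
      errors.push(`field "${field}" must be a ${expected}`);
    }
  }

  return { valid: errors.length === 0, errors };
}
```

Returning the error list instead of throwing lets the harness hand the failure back to the model as an observation, in line with the principle that every tool call gets a result.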

When to Use This Skill

This skill activates when conversations involve:

Agent architecture: harness, loop, runtime, control plane
Tool design: permissions, approvals, typed tools, risk classes
Planning: planning mode, goal loops, checkpoints, budgets
Context: memory, compaction, retrieval, active state
Security: injection, guardrails, evals, sandboxing
Observability: tracing, cost telemetry, launch gates
Connectors: skills, MCP, external APIs, progressive disclosure
Production readiness: checklists, incident response, audits

Key References

All detailed references live in `references/`:

MVP Blueprint: `mvp-agent-blueprint.md` (domain-specific harness generator)
Loop Design: `agentic-loop.md` (invariants, retries, budgets, stopping)
Tools: `tools-and-permissions.md` (typed tools, risk classes, approvals)
Planning: `planning-and-goals.md` (planning mode, goal loops)
Context: `context-memory-compaction.md` (context, memory, retrieval)
Caching: `prompt-caching-and-cost.md` (cache-aware layout, cost telemetry)
Connectors: `skills-and-connectors.md` (Agent Skills, MCP, progressive disclosure)
APIs: `provider-api-patterns.md` (OpenAI, Anthropic, and compatible providers)
Security: `security-evals-observability.md` (guardrails, tracing, evals)
Checklists: `checklists.md` (implementation and audit checklists)

Philosophy Summary

The harness acts, not the model — the model proposes; harness validates, authorizes, executes, records
Every tool call gets a result — denial, timeout, malformed, abort are observations too
Risk changes the loop — reads, drafts, writes, external comms, destructive, privileged need different paths
Draft and commit are separate — high-risk side effects require approval records outside prompt
Context is built, not dumped — retrieve just enough, label trust, preserve active state
Long-running work needs budgets — step, time, token, cost, tool-call budgets are product features
Skills and connectors are progressively disclosed — expose names first, load details when relevant
Repeated failures become harness features — validators, tools, docs, evals, policies beat repeating prompt advice
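
The principle that risk changes the loop can be made concrete with a small routing function over the `permissions` lists from the configuration example. This is a sketch under assumptions: `routeByRisk` and the `Route` values are hypothetical names, and denying risk classes that appear in neither list is a fail-closed choice made here, not something the skill prescribes.

```typescript
type Route = "auto" | "approval_gate" | "deny";

interface PermissionPolicy {
  auto_approve: string[];
  require_approval: string[];
}

// Maps a tool's risk class to a path through the loop.
function routeByRisk(policy: PermissionPolicy, riskClass: string): Route {
  // Gating is checked first, so a class listed in both arrays is still gated.
  if (policy.require_approval.indexOf(riskClass) !== -1) {
    return "approval_gate";
  }
  if (policy.auto_approve.indexOf(riskClass) !== -1) {
    return "auto";
  }
  // Assumption: classes the policy does not mention are denied, not silently allowed.
  return "deny";
}
```

With the example configuration, `read_private_data` would route to `"auto"`, `destructive_action` to `"approval_gate"`, and an unlisted class such as `privileged_action` to `"deny"`.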

License

MIT License — see repository for details.

Learn More

Repository: github.com/DenisSergeevitch/agents-best-practices
Agent Skills Spec: agentskills.io/specification
Official API docs: `references/source-links.md`
