prompt-engineer
Prompt Engineer
Expert prompt engineer specializing in designing, optimizing, and evaluating prompts that maximize LLM performance across diverse use cases.
When to Use This Skill
- Designing prompts for new LLM applications
- Optimizing existing prompts for better accuracy or efficiency
- Implementing chain-of-thought or few-shot learning
- Creating system prompts with personas and guardrails
- Building structured output schemas (JSON mode, function calling)
- Developing prompt evaluation and testing frameworks
- Debugging inconsistent or poor-quality LLM outputs
- Migrating prompts between different models or providers
Core Workflow
- Understand requirements — Define task, success criteria, constraints, and edge cases
- Design initial prompt — Choose pattern (zero-shot, few-shot, CoT), write clear instructions
- Test and evaluate — Run diverse test cases, measure quality metrics
  - Validation checkpoint: if accuracy is below 80% on the test set, identify failure patterns (e.g., ambiguous instructions, missing examples, edge case gaps) before iterating
- Iterate and optimize — Make one change at a time; refine based on failures, reduce tokens, improve reliability
- Document and deploy — Version prompts, document behavior, monitor production
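The test-and-evaluate step and its 80% checkpoint can be sketched as a small harness. This is a minimal illustration, not a prescribed framework: `build_prompt`, `evaluate`, `keyword_model`, and the test cases are all hypothetical names and data introduced here, and `keyword_model` is a keyword stub standing in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Sequence

INSTRUCTION = (
    "Classify the sentiment of the following review as "
    "Positive, Negative, or Neutral."
)

# Few-shot examples (text, label); pass an empty sequence for zero-shot.
EXAMPLES = [
    ("The battery life is incredible, lasts all day.", "Positive"),
    ("Stopped working after two weeks. Very disappointed.", "Negative"),
    ("It arrived on time and matches the description.", "Neutral"),
]

# Hypothetical test set of (review, expected label) pairs.
CASES = [
    ("I love it", "Positive"),
    ("It broke in a day", "Negative"),
    ("Arrived on Tuesday", "Neutral"),
    ("Not bad at all", "Positive"),
    ("Could be better", "Negative"),
]


def build_prompt(review: str, examples: Sequence[tuple[str, str]] = ()) -> str:
    """Render a zero-shot prompt, or a few-shot prompt when examples are given."""
    parts = [INSTRUCTION]
    for text, label in examples:
        parts.append(f'Review: "{text}"\nSentiment: {label}')
    parts.append(f"Review: {review}\nSentiment:")
    return "\n".join(parts)


def keyword_model(prompt: str) -> str:
    """Stub standing in for an LLM call (assumption: swap in a real client)."""
    review = prompt.rsplit("Review:", 1)[1].lower()
    if "love" in review:
        return "Positive"
    if "broke" in review:
        return "Negative"
    return "Neutral"


def evaluate(
    model: Callable[[str], str],
    cases: Sequence[tuple[str, str]],
    examples: Sequence[tuple[str, str]] = (),
    threshold: float = 0.80,
) -> dict:
    """Run every case, compute accuracy, and group failures by (expected, predicted)."""
    failures = []
    for review, expected in cases:
        predicted = model(build_prompt(review, examples)).strip()
        if predicted != expected:
            failures.append((review, expected, predicted))
    accuracy = (len(cases) - len(failures)) / len(cases)
    return {
        "accuracy": accuracy,
        "passed": accuracy >= threshold,
        "failures": failures,
        "failure_patterns": Counter((e, p) for _, e, p in failures),
    }


if __name__ == "__main__":
    report = evaluate(keyword_model, CASES, EXAMPLES)
    print(f"accuracy={report['accuracy']:.0%} passed={report['passed']}")
    for (expected, predicted), count in report["failure_patterns"].items():
        print(f"{count}x expected {expected}, got {predicted}")
```

Grouping failures by expected and predicted label is one simple way to surface a pattern before changing the prompt. Here, both misses collapse to Neutral, which points at missing examples for hedged or negated phrasing.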
Reference Guide
Load detailed guidance based on context:
| Topic | Reference | Load When |
|---|---|---|
| Prompt Patterns | | Zero-shot, few-shot, chain-of-thought, ReAct |
| Optimization | | Iterative refinement, A/B testing, token reduction |
| Evaluation | | Metrics, test suites, automated evaluation |
| Structured Outputs | | JSON mode, function calling, schema design |
| System Prompts | | Persona design, guardrails, context management |
Prompt Examples
Zero-shot vs. Few-shot
Zero-shot (baseline):
Classify the sentiment of the following review as Positive, Negative, or Neutral.
Review: {{review}}
Sentiment:

Few-shot (improved reliability):
Classify the sentiment of the following review as Positive, Negative, or Neutral.
Review: "The battery life is incredible, lasts all day."
Sentiment: Positive
Review: "Stopped working after two weeks. Very disappointed."
Sentiment: Negative
Review: "It arrived on time and matches the description."
Sentiment: Neutral
Review: {{review}}
Sentiment:

Before/After Optimization
Before (vague, inconsistent outputs):
Summarize this document.
{{document}}

After (structured, token-efficient):
Summarize the document below in exactly 3 bullet points. Each bullet must be one sentence and start with an action verb. Do not include opinions or information not present in the document.
Document:
{{document}}
Summary:

Constraints
MUST DO
- Test prompts with diverse, realistic inputs including edge cases
- Measure performance with quantitative metrics (accuracy, consistency)
- Version prompts and track changes systematically
- Document expected behavior and known limitations
- Use few-shot examples that match target distribution
- Validate structured outputs against schemas
- Consider token costs and latency in design
- Test across model versions before production deployment
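As one illustration of validating structured outputs against a schema, the sketch below checks a model's JSON reply with the standard library only. The field names (`sentiment`, `confidence`), the allowed labels, and `validate_output` are assumptions made for this example. The hand-rolled checks are not a full JSON Schema validator, so a dedicated validation library may be a better fit in production.

```python
import json

# Hypothetical schema: field name -> accepted Python type(s) after json.loads.
REQUIRED_FIELDS = {"sentiment": str, "confidence": (int, float)}
ALLOWED_SENTIMENTS = {"Positive", "Negative", "Neutral"}


def validate_output(raw: str) -> list[str]:
    """Return a list of problems found in a model reply; empty means valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]

    errors = []
    for key, expected in REQUIRED_FIELDS.items():
        if key not in data:
            errors.append(f"missing field: {key}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[key], bool) or not isinstance(data[key], expected):
            errors.append(f"wrong type for field: {key}")

    # Only check values once the structure itself is sound.
    if not errors:
        if data["sentiment"] not in ALLOWED_SENTIMENTS:
            errors.append("sentiment must be Positive, Negative, or Neutral")
        if not 0.0 <= data["confidence"] <= 1.0:
            errors.append("confidence must be between 0 and 1")
    return errors


if __name__ == "__main__":
    for reply in ('{"sentiment": "Positive", "confidence": 0.9}', "not json"):
        print(reply, "->", validate_output(reply) or "ok")
```

Returning a list of problems rather than raising on the first one makes it easier to count which schema violations recur across a test set.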
MUST NOT DO
- Deploy prompts without systematic evaluation on test cases
- Use few-shot examples that contradict instructions
- Ignore model-specific capabilities and limitations
- Skip edge case testing (empty inputs, unusual formats)
- Make multiple changes simultaneously when debugging
- Hardcode sensitive data in prompts or examples
- Assume prompts transfer perfectly between models
- Neglect monitoring for prompt degradation in production
Output Templates
When delivering prompt work, provide:
- Final prompt with clear sections (role, task, constraints, format)
- Test cases and evaluation results
- Usage instructions (temperature, max tokens, model version)
- Performance metrics and comparison with baselines
- Known limitations and edge cases
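The deliverables above could be bundled into a single versioned record. The sketch below is one possible shape, not a required format: the `PromptRelease` class, its field names, and the content-hash versioning scheme are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class PromptRelease:
    """Hypothetical delivery record for one prompt version."""

    name: str
    prompt: str                # final prompt text (role, task, constraints, format)
    model: str                 # model version the prompt was tested against
    temperature: float
    max_tokens: int
    metrics: dict              # measured results on the test set
    baseline_metrics: dict     # same metrics for the baseline prompt
    known_limitations: tuple[str, ...] = ()

    @property
    def version(self) -> str:
        """Short content hash; changes whenever the prompt or its settings change."""
        payload = json.dumps(
            {
                "prompt": self.prompt,
                "model": self.model,
                "temperature": self.temperature,
                "max_tokens": self.max_tokens,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

    def to_json(self) -> str:
        """Serialize the record, including its derived version, for storage."""
        record = asdict(self)
        record["version"] = self.version
        return json.dumps(record, indent=2, sort_keys=True)


if __name__ == "__main__":
    # All values below are placeholders, not measured results.
    release = PromptRelease(
        name="summarizer",
        prompt="Summarize the document below in exactly 3 bullet points.",
        model="example-model",
        temperature=0.0,
        max_tokens=256,
        metrics={"accuracy": None},
        baseline_metrics={"accuracy": None},
        known_limitations=("Untested on empty documents",),
    )
    revised = replace(release, prompt=release.prompt + " Use one sentence each.")
    print(release.version, "->", revised.version)
```

Deriving the version from the prompt text and sampling settings means any edit yields a new identifier, which keeps evaluation results tied to the exact prompt they were measured on.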
Coverage Note
Reference files cover major prompting techniques (zero-shot, few-shot, CoT, ReAct, tree-of-thoughts), structured output patterns (JSON mode, function calling), and model-specific guidance for GPT-4, Claude, and Gemini families. Consult the relevant reference before designing for a specific model or pattern.