llm-patterns


LLM Patterns Skill


Load with: base.md + [language].md
For AI-first applications where LLMs handle logical operations.


Core Principle


LLM for logic, code for plumbing.
Use LLMs for:
  • Classification, extraction, summarization
  • Decision-making with natural language reasoning
  • Content generation and transformation
  • Complex conditional logic that would be brittle in code
Use traditional code for:
  • Data validation (Zod/Pydantic)
  • API routing and HTTP handling
  • Database operations
  • Authentication/authorization
  • Orchestration and error handling

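The split above can be sketched as a small service function: deterministic validation stays in plain code, and only the fuzzy judgment is delegated to the model. The `routeTicket` name and the injected `classify` function are illustrative, not part of the skill itself.

```typescript
// Plumbing vs. logic in one function (illustrative sketch).
type Classifier = (text: string) => Promise<{ category: string }>;

export async function routeTicket(
  raw: unknown,
  classify: Classifier
): Promise<string> {
  // Plumbing: deterministic validation belongs in code, not in a prompt
  if (typeof raw !== 'string' || raw.trim().length === 0) {
    throw new Error('ticket text must be a non-empty string');
  }
  // Logic: the fuzzy classification decision is delegated to the LLM
  const { category } = await classify(raw.trim());
  return category;
}
```

In production, `classify` would wrap an LLM call; in tests it can be a stub, which is exactly what makes this split easy to test.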

Project Structure


project/
├── src/
│   ├── core/
│   │   ├── prompts/           # Prompt templates
│   │   │   ├── classify.ts
│   │   │   └── extract.ts
│   │   ├── llm/               # LLM client and utilities
│   │   │   ├── client.ts      # LLM client wrapper
│   │   │   ├── schemas.ts     # Response schemas (Zod)
│   │   │   └── index.ts
│   │   └── services/          # Business logic using LLM
│   ├── infra/
│   └── ...
├── tests/
│   ├── unit/
│   ├── integration/
│   └── llm/                   # LLM-specific tests
│       ├── fixtures/          # Saved responses for deterministic tests
│       ├── evals/             # Evaluation test suites
│       └── mocks/             # Mock LLM responses
└── _project_specs/
    └── prompts/               # Prompt specifications


LLM Client Pattern


Typed LLM Wrapper


```typescript
// core/llm/client.ts
import Anthropic from '@anthropic-ai/sdk';
import { z } from 'zod';

const client = new Anthropic();

interface LLMCallOptions<T> {
  prompt: string;
  schema: z.ZodSchema<T>;
  model?: string;
  maxTokens?: number;
}

export async function llmCall<T>({
  prompt,
  schema,
  model = 'claude-sonnet-4-20250514',
  maxTokens = 1024,
}: LLMCallOptions<T>): Promise<T> {
  const response = await client.messages.create({
    model,
    max_tokens: maxTokens,
    messages: [{ role: 'user', content: prompt }],
  });

  const block = response.content[0];
  if (block.type !== 'text') {
    throw new Error(`Unexpected response block type: ${block.type}`);
  }

  // Parse and validate response
  const parsed = JSON.parse(block.text);
  return schema.parse(parsed);
}
```
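Models occasionally wrap their JSON in prose or a code fence, which makes a bare `JSON.parse` brittle. A small defensive extractor (an addition for illustration, not part of the wrapper above) can sit between the response text and `schema.parse`:

```typescript
// Extract the first {...} object from free-form LLM output (illustrative helper).
export function extractJson(text: string): unknown {
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end < start) {
    throw new Error(`No JSON object found in LLM response: ${text.slice(0, 80)}`);
  }
  return JSON.parse(text.slice(start, end + 1));
}
```

Requesting structured output via tool use is the more robust option; this string-level fallback is simply the lowest-effort one.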

Structured Outputs


```typescript
// core/llm/schemas.ts
import { z } from 'zod';

export const ClassificationSchema = z.object({
  category: z.enum(['support', 'sales', 'feedback', 'other']),
  confidence: z.number().min(0).max(1),
  reasoning: z.string(),
});

export type Classification = z.infer<typeof ClassificationSchema>;
```


Prompt Patterns


Template Functions


```typescript
// core/prompts/classify.ts
export function classifyTicketPrompt(ticket: string): string {
  return `Classify this support ticket into one of these categories:
- support: Technical issues or help requests
- sales: Pricing, plans, or purchase inquiries
- feedback: Suggestions or complaints
- other: Anything else

Respond with JSON:
{
  "category": "...",
  "confidence": 0.0-1.0,
  "reasoning": "brief explanation"
}

Ticket:
${ticket}`;
}
```

Prompt Versioning


```typescript
// core/prompts/index.ts
export const PROMPTS = {
  classify: {
    v1: classifyTicketPromptV1,
    v2: classifyTicketPromptV2,  // improved accuracy
    current: classifyTicketPromptV2,
  },
} as const;
```


Testing LLM Calls


1. Unit Tests with Mocks (Fast, Deterministic)


```typescript
// tests/llm/mocks/classify.mock.ts
export const mockClassifyResponse = {
  category: 'support',
  confidence: 0.95,
  reasoning: 'User is asking for help with login',
};

// tests/unit/services/ticket.test.ts
import { describe, it, expect, vi } from 'vitest';
import { classifyTicket } from '../../../src/core/services/ticket';
import { mockClassifyResponse } from '../../llm/mocks/classify.mock';

// Mock the LLM client
vi.mock('../../../src/core/llm/client', () => ({
  llmCall: vi.fn().mockResolvedValue(mockClassifyResponse),
}));

describe('classifyTicket', () => {
  it('returns classification for ticket', async () => {
    const result = await classifyTicket('I cannot log in');

    expect(result.category).toBe('support');
    expect(result.confidence).toBeGreaterThan(0.9);
  });
});
```

2. Fixture Tests (Deterministic, Tests Parsing)


```typescript
// tests/llm/fixtures/classify.fixtures.json
{
  "support_ticket": {
    "input": "I can't reset my password",
    "expected_category": "support",
    "raw_response": "{\"category\":\"support\",\"confidence\":0.98,\"reasoning\":\"Password reset is a support issue\"}"
  }
}

// tests/llm/classify.fixture.test.ts
import { describe, it, expect } from 'vitest';
import fixtures from './fixtures/classify.fixtures.json';
import { ClassificationSchema } from '../../src/core/llm/schemas';

describe('Classification Response Parsing', () => {
  Object.entries(fixtures).forEach(([name, fixture]) => {
    it(`parses ${name} correctly`, () => {
      const parsed = JSON.parse(fixture.raw_response);
      const result = ClassificationSchema.parse(parsed);

      expect(result.category).toBe(fixture.expected_category);
    });
  });
});
```

3. Evaluation Tests (Slow, Run in CI nightly)


```typescript
// tests/llm/evals/classify.eval.test.ts
import { describe, it, expect } from 'vitest';
import { classifyTicket } from '../../../src/core/services/ticket';

const TEST_CASES = [
  { input: 'How much does the pro plan cost?', expected: 'sales' },
  { input: 'The app crashes when I click save', expected: 'support' },
  { input: 'You should add dark mode', expected: 'feedback' },
  { input: 'What time is it in Tokyo?', expected: 'other' },
];

describe('Classification Accuracy (Eval)', () => {
  // Skip in regular CI, run nightly
  const runEvals = process.env.RUN_LLM_EVALS === 'true';

  it.skipIf(!runEvals)('achieves >90% accuracy on test set', async () => {
    let correct = 0;

    for (const testCase of TEST_CASES) {
      const result = await classifyTicket(testCase.input);
      if (result.category === testCase.expected) correct++;
    }

    const accuracy = correct / TEST_CASES.length;
    expect(accuracy).toBeGreaterThan(0.9);
  }, 60000); // 60s timeout for LLM calls
});
```

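The accuracy loop in the eval above can be factored into a reusable helper so multiple eval suites share it; `evalAccuracy` is an illustrative name, not an established utility.

```typescript
// Compute the accuracy of an async classifier over labeled cases (illustrative helper).
interface EvalCase<I> {
  input: I;
  expected: string;
}

export async function evalAccuracy<I>(
  cases: EvalCase<I>[],
  classify: (input: I) => Promise<{ category: string }>
): Promise<number> {
  let correct = 0;
  for (const c of cases) {
    const { category } = await classify(c.input);
    if (category === c.expected) correct++;
  }
  return cases.length === 0 ? 0 : correct / cases.length;
}
```

Because the classifier is passed in, the same helper works against the real `classifyTicket` in evals and against a stub in unit tests.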

GitHub Actions for LLM Tests



.github/workflows/quality.yml (add to existing)


```yaml
jobs:
  quality:
    steps:
      # ... existing steps ...
      - name: Run Tests (with LLM mocks)
        run: npm run test:coverage

  llm-evals:
    runs-on: ubuntu-latest
    # Run nightly or on-demand
    if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'

      - name: Install dependencies
        run: npm ci

      - name: Run LLM Evals
        run: npm run test:evals
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          RUN_LLM_EVALS: 'true'
```


Cost & Performance Tracking


```typescript
// core/llm/client.ts - add tracking
interface LLMMetrics {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  cost: number;
}

export async function llmCallWithMetrics<T>(
  options: LLMCallOptions<T>
): Promise<{ result: T; metrics: LLMMetrics }> {
  const { model = 'claude-sonnet-4-20250514', maxTokens = 1024 } = options;
  const start = Date.now();

  const response = await client.messages.create({
    model,
    max_tokens: maxTokens,
    messages: [{ role: 'user', content: options.prompt }],
  });

  const text = response.content[0].type === 'text' ? response.content[0].text : '';
  const result = options.schema.parse(JSON.parse(text));

  const metrics: LLMMetrics = {
    model,
    inputTokens: response.usage.input_tokens,
    outputTokens: response.usage.output_tokens,
    latencyMs: Date.now() - start,
    cost: calculateCost(response.usage, model),
  };

  // Log or send to monitoring
  console.log('[LLM]', metrics);

  return { result, metrics };
}
```

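The `calculateCost` helper referenced above is not defined in this skill; a minimal sketch could look like the following. The per-million-token rates are placeholders — real prices vary by model and change over time, so treat the table as an assumption to be maintained.

```typescript
// Illustrative cost calculator; the rates below are placeholders, not quoted prices.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

const PRICING_PER_MTOK: Record<string, { input: number; output: number }> = {
  'claude-sonnet-4-20250514': { input: 3.0, output: 15.0 },
};

export function calculateCost(usage: Usage, model: string): number {
  const rates = PRICING_PER_MTOK[model];
  if (!rates) return 0; // unknown model: report zero rather than guess
  return (
    (usage.input_tokens / 1_000_000) * rates.input +
    (usage.output_tokens / 1_000_000) * rates.output
  );
}
```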

LLM Anti-Patterns


  • ❌ Hardcoded prompts in business logic - use prompt templates
  • ❌ No schema validation on LLM responses - always use Zod
  • ❌ Testing with live LLM calls in CI - use mocks for unit tests
  • ❌ No cost tracking - monitor token usage
  • ❌ Ignoring latency - LLM calls are slow, design for async
  • ❌ No fallback for LLM failures - handle timeouts and errors
  • ❌ Prompts without version control - track prompt changes
  • ❌ No evaluation suite - measure accuracy over time
  • ❌ Using LLM for deterministic logic - use code for validation, auth, math
  • ❌ Giant monolithic prompts - compose smaller focused prompts
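The "no fallback for LLM failures" anti-pattern can be addressed with a small wrapper; the retry count, backoff, and fallback value here are illustrative defaults, not prescribed numbers.

```typescript
// Retry with exponential backoff, then degrade to a fallback value (illustrative sketch).
export async function withFallback<T>(
  call: () => Promise<T>,
  fallback: T,
  retries = 2,
  baseDelayMs = 500
): Promise<T> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await call();
    } catch {
      if (attempt === retries) break; // out of retries: fall through to fallback
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  return fallback;
}
```

For classification, the fallback might be `{ category: 'other', confidence: 0, reasoning: 'LLM unavailable' }` so downstream code always receives a valid `Classification`.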