LLM Application Development

Prompt Engineering

Structured Prompts

typescript
const systemPrompt = `You are a helpful assistant that answers questions about our product.

RULES:
- Only answer questions about our product
- If you don't know, say "I don't know"
- Keep responses concise (under 100 words)
- Never make up information

CONTEXT:
{context}`;

const userPrompt = `Question: {question}`;
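
At request time the `{context}` and `{question}` placeholders are filled in before the prompts are sent. A minimal sketch of that step (the `fillTemplate` helper and the sample values are illustrative, not part of any SDK):

typescript
// Hypothetical helper: substitute {name} placeholders with provided values,
// leaving unknown placeholders untouched.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match, key) =>
    key in values ? values[key] : match
  );
}

const system = fillTemplate(systemPrompt, { context: 'Acme CLI installs via npm...' });
const user = fillTemplate(userPrompt, { question: 'How do I install it?' });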

Few-Shot Examples

typescript
const prompt = `Classify the sentiment of customer feedback.

Examples:
Input: "Love this product!"
Output: positive

Input: "Worst purchase ever"
Output: negative

Input: "It works fine"
Output: neutral

Input: "${customerFeedback}"
Output:`;
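
Because the completion is free text, it helps to normalize and validate the label before branching on it. A small sketch under that assumption (the `parseSentiment` helper is hypothetical):

typescript
type Sentiment = 'positive' | 'negative' | 'neutral';

// Hypothetical guardrail: normalize the raw completion and reject anything
// outside the three expected labels.
function parseSentiment(raw: string): Sentiment | null {
  const label = raw.trim().toLowerCase();
  return label === 'positive' || label === 'negative' || label === 'neutral'
    ? label
    : null;  // caller decides how to handle an unexpected label
}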

Chain of Thought

typescript
const prompt = `Solve this step by step:

Question: ${question}

Let's think through this:
1. First, identify the key information
2. Then, determine the approach
3. Finally, calculate the answer

Step-by-step solution:`;
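
The reasoning and the answer come back interleaved, so a common companion pattern is to ask the model to finish with a marker line (e.g. add "End with a line starting with Answer:" to the prompt) and extract it afterwards. A hedged sketch of the extraction side:

typescript
// Pull the final answer out of a step-by-step completion. Assumes the prompt
// asked the model to end with an "Answer:" line; returns null otherwise.
function extractAnswer(completion: string): string | null {
  const match = completion.match(/^Answer:\s*(.+)$/m);
  return match ? match[1].trim() : null;
}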

API Integration

OpenAI Pattern

typescript
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Message param type shipped with the SDK (role + content pairs).
type Message = OpenAI.Chat.ChatCompletionMessageParam;

async function chat(messages: Message[]): Promise<string> {
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages,
    temperature: 0.7,
    max_tokens: 500,
  });

  return response.choices[0].message.content ?? '';
}
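
A quick usage sketch, inside an async context (the message contents are illustrative):

typescript
const reply = await chat([
  { role: 'system', content: 'You answer questions about our product.' },
  { role: 'user', content: 'What plans do you offer?' },
]);
console.log(reply);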

Anthropic Pattern

typescript
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

async function chat(prompt: string): Promise<string> {
  const response = await anthropic.messages.create({
    model: 'claude-3-opus-20240229',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });

  return response.content[0].type === 'text'
    ? response.content[0].text
    : '';
}

Streaming Responses

typescript
async function* streamChat(prompt: string) {
  const stream = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content;
    if (content) yield content;
  }
}
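
Consuming the generator is a plain for await loop, e.g. to render tokens as they arrive (the prompt string is illustrative; run inside an async context):

typescript
for await (const token of streamChat('Summarize our refund policy.')) {
  process.stdout.write(token);  // print incrementally instead of waiting for the full reply
}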

RAG (Retrieval-Augmented Generation)

Basic RAG Pipeline

typescript
// embedText is sketched after this block; vectorDb stands in for any
// vector store client with a similarity-search method.
async function ragQuery(question: string): Promise<string> {
  // 1. Embed the question
  const questionEmbedding = await embedText(question);

  // 2. Search vector database
  const relevantDocs = await vectorDb.search(questionEmbedding, { limit: 5 });

  // 3. Build context
  const context = relevantDocs.map(d => d.content).join('\n\n');

  // 4. Generate answer
  const prompt = `Answer based on this context:\n${context}\n\nQuestion: ${question}`;
  return await chat(prompt);
}
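
The pipeline assumes an `embedText` helper that the original does not show. One plausible implementation, using OpenAI's embeddings endpoint and the `openai` client from the API section (the model choice is an assumption):

typescript
// Hypothetical implementation of the embedText helper used above.
async function embedText(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',  // assumed model; any embedding model works
    input: text,
  });
  return response.data[0].embedding;
}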

Document Chunking

typescript
interface ChunkOptions {
  chunkSize?: number;  // characters per chunk
  overlap?: number;    // characters shared between consecutive chunks
}

function chunkDocument(text: string, options: ChunkOptions = {}): string[] {
  const { chunkSize = 1000, overlap = 200 } = options;
  if (overlap >= chunkSize) throw new Error('overlap must be smaller than chunkSize');

  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    const end = Math.min(start + chunkSize, text.length);
    chunks.push(text.slice(start, end));
    if (end === text.length) break;  // avoid emitting a duplicate tail chunk
    start += chunkSize - overlap;
  }

  return chunks;
}

Embedding Storage

typescript
// Using Supabase with pgvector; embedText is the helper from the RAG section.
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(
  process.env.SUPABASE_URL!,  // env var names are illustrative
  process.env.SUPABASE_KEY!
);

interface Document {
  content: string;
  metadata: Record<string, unknown>;
}

async function storeEmbeddings(docs: Document[]) {
  for (const doc of docs) {
    const embedding = await embedText(doc.content);

    await supabase.from('documents').insert({
      content: doc.content,
      metadata: doc.metadata,
      embedding: embedding,  // vector column
    });
  }
}

async function searchSimilar(query: string, limit = 5) {
  const embedding = await embedText(query);

  const { data } = await supabase.rpc('match_documents', {
    query_embedding: embedding,
    match_count: limit,
  });

  return data;
}
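
Note that `match_documents` is a Postgres function you define yourself in the database (typically a pgvector similarity query); the client only invokes it over RPC. Tying the pieces together, a hedged ingestion sketch that reuses chunkDocument and storeEmbeddings from above (the ingestDocument name is illustrative):

typescript
// Hypothetical end-to-end ingestion: split a raw document into overlapping
// chunks, then embed and store each chunk with its source in the metadata.
async function ingestDocument(text: string, source: string) {
  const chunks = chunkDocument(text, { chunkSize: 1000, overlap: 200 });
  await storeEmbeddings(chunks.map(content => ({ content, metadata: { source } })));
}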

Error Handling

typescript
// Small helper: pause between retries.
const sleep = (ms: number) => new Promise(resolve => setTimeout(resolve, ms));

async function safeLLMCall<T>(
  fn: () => Promise<T>,
  options: { retries?: number; fallback?: T }
): Promise<T> {
  const { retries = 3, fallback } = options;

  for (let i = 0; i < retries; i++) {
    try {
      return await fn();
    } catch (error: any) {
      if (error?.status === 429 && i < retries - 1) {
        // Rate limited - wait with exponential backoff, then retry
        await sleep(Math.pow(2, i) * 1000);
        continue;
      }
      // Out of retries: return the fallback if provided, otherwise rethrow
      if (i === retries - 1) {
        if (fallback !== undefined) return fallback;
        throw error;
      }
    }
  }
  throw new Error('Max retries exceeded');
}

Best Practices

  • Token Management: Track usage and set limits
  • Caching: Cache embeddings and common queries (see the sketch after this list)
  • Evaluation: Test prompts with diverse inputs
  • Guardrails: Validate outputs before using them
  • Logging: Log prompts and responses for debugging
  • Cost Control: Use cheaper models for simple tasks
  • Latency: Stream responses for better UX
  • Privacy: Don't send PII to external APIs
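
As a concrete instance of the caching point, a minimal in-memory embedding cache (a sketch; a production version would persist entries and bound the cache size):

typescript
// Cache embeddings keyed by the exact input text, so repeated queries
// skip the embedding API call entirely.
const embeddingCache = new Map<string, number[]>();

async function cachedEmbedText(text: string): Promise<number[]> {
  const hit = embeddingCache.get(text);
  if (hit) return hit;

  const embedding = await embedText(text);  // helper from the RAG section
  embeddingCache.set(text, embedding);
  return embedding;
}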