OpenRouter - Unified AI API Gateway

Overview

OpenRouter provides a single API for accessing 200+ language models from OpenAI, Anthropic, Google, Meta, Mistral, and others. It offers intelligent routing, real-time streaming, cost optimization, and a standardized, OpenAI-compatible interface.
Key Features:
  • Access 200+ models through one API
  • OpenAI-compatible interface (drop-in replacement)
  • Intelligent model routing and fallbacks
  • Real-time streaming responses
  • Cost tracking and optimization
  • Model performance analytics
  • Function calling support
  • Vision model support
Pricing Model:
  • Pay-per-token (no subscriptions)
  • Volume discounts available
  • Free tier with credits
  • Per-model pricing varies
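Because every model bills per token at its own rate, a small helper makes per-request costs concrete. A minimal sketch — the prices in the table are illustrative examples, not guaranteed current OpenRouter rates:

```typescript
// Illustrative per-1M-token USD prices; check openrouter.ai/models for current rates.
const priceTable: Record<string, { input: number; output: number }> = {
  'anthropic/claude-3-haiku': { input: 0.25, output: 1.25 },
  'google/gemini-flash-1.5': { input: 0.075, output: 0.30 },
};

// USD cost of one request, given token counts and a priced model.
function requestCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = priceTable[model];
  if (!p) throw new Error(`No pricing for ${model}`);
  return (inputTokens / 1_000_000) * p.input + (outputTokens / 1_000_000) * p.output;
}
```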
Installation:
bash
npm install openai  # Use OpenAI SDK

or

pip install openai  # Python

Quick Start

1. Get API Key

bash
export OPENROUTER_API_KEY="sk-or-v1-..."

2. Basic Chat Completion

typescript
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: process.env.OPENROUTER_API_KEY,
  defaultHeaders: {
    'HTTP-Referer': 'https://your-app.com',  // Optional
    'X-Title': 'Your App Name',              // Optional
  }
});

async function chat() {
  const completion = await client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [
      { role: 'user', content: 'Explain quantum computing in simple terms' }
    ],
  });

  console.log(completion.choices[0].message.content);
}

3. Streaming Response

typescript
async function streamChat() {
  const stream = await client.chat.completions.create({
    model: 'openai/gpt-4-turbo',
    messages: [
      { role: 'user', content: 'Write a short story about AI' }
    ],
    stream: true,
  });

  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    process.stdout.write(content);
  }
}

Model Selection Strategy

Available Model Categories

Flagship Models (Highest Quality):
typescript
const flagshipModels = {
  claude: 'anthropic/claude-3.5-sonnet',      // Best reasoning
  gpt4: 'openai/gpt-4-turbo',                 // Best general purpose
  gemini: 'google/gemini-pro-1.5',            // Best long context
  opus: 'anthropic/claude-3-opus',            // Best complex tasks
};
Fast Models (Low Latency):
typescript
const fastModels = {
  claude: 'anthropic/claude-3-haiku',         // Fastest Claude
  gpt35: 'openai/gpt-3.5-turbo',             // Fast GPT
  gemini: 'google/gemini-flash-1.5',         // Fast Gemini
  llama: 'meta-llama/llama-3.1-8b-instruct', // Fast open source
};
Cost-Optimized Models:
typescript
const budgetModels = {
  haiku: 'anthropic/claude-3-haiku',          // $0.25/$1.25 per 1M tokens
  gemini: 'google/gemini-flash-1.5',         // $0.075/$0.30 per 1M tokens
  llama: 'meta-llama/llama-3.1-8b-instruct', // $0.06/$0.06 per 1M tokens
  mixtral: 'mistralai/mixtral-8x7b-instruct', // $0.24/$0.24 per 1M tokens
};
Specialized Models:
typescript
const specializedModels = {
  vision: 'openai/gpt-4-vision-preview',     // Image understanding
  code: 'anthropic/claude-3.5-sonnet',       // Code generation
  longContext: 'google/gemini-pro-1.5',      // 2M token context
  function: 'openai/gpt-4-turbo',            // Function calling
};

Model Selection Logic

typescript
interface ModelSelector {
  task: 'chat' | 'code' | 'vision' | 'function' | 'summary';
  priority: 'quality' | 'speed' | 'cost';
  maxCost?: number;  // Max cost per 1M tokens
  contextSize?: number;
}

function selectModel(criteria: ModelSelector): string {
  if (criteria.task === 'vision') {
    return 'openai/gpt-4-vision-preview';
  }

  if (criteria.task === 'code') {
    return criteria.priority === 'quality'
      ? 'anthropic/claude-3.5-sonnet'
      : 'meta-llama/llama-3.1-70b-instruct';
  }

  if (criteria.contextSize && criteria.contextSize > 100000) {
    return 'google/gemini-pro-1.5';  // 2M context
  }

  // Default selection by priority
  switch (criteria.priority) {
    case 'quality':
      return 'anthropic/claude-3.5-sonnet';
    case 'speed':
      return 'anthropic/claude-3-haiku';
    case 'cost':
      return criteria.maxCost && criteria.maxCost < 0.5
        ? 'google/gemini-flash-1.5'
        : 'anthropic/claude-3-haiku';
    default:
      return 'openai/gpt-4-turbo';
  }
}

// Usage
const model = selectModel({
  task: 'code',
  priority: 'quality',
});

Streaming Implementation

TypeScript Streaming with Error Handling

typescript
async function robustStreamingChat(
  prompt: string,
  model: string = 'anthropic/claude-3.5-sonnet'
) {
  try {
    const stream = await client.chat.completions.create({
      model,
      messages: [{ role: 'user', content: prompt }],
      stream: true,
      max_tokens: 4000,
    });

    let fullResponse = '';

    for await (const chunk of stream) {
      const delta = chunk.choices[0]?.delta;

      if (delta?.content) {
        fullResponse += delta.content;
        process.stdout.write(delta.content);
      }

      // Handle function calls
      if (delta?.function_call) {
        console.log('\nFunction call:', delta.function_call);
      }

      // Check for finish reason
      if (chunk.choices[0]?.finish_reason) {
        console.log(`\n[Finished: ${chunk.choices[0].finish_reason}]`);
      }
    }

    return fullResponse;
  } catch (error) {
    if (error instanceof Error) {
      console.error('Streaming error:', error.message);
    }
    throw error;
  }
}

Python Streaming

python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ.get("OPENROUTER_API_KEY"),
)

def stream_chat(prompt: str, model: str = "anthropic/claude-3.5-sonnet"):
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )

    full_response = ""
    for chunk in stream:
        if chunk.choices[0].delta.content:
            content = chunk.choices[0].delta.content
            full_response += content
            print(content, end="", flush=True)

    print()  # New line
    return full_response

React Streaming Component

typescript
import { useState } from 'react';

function StreamingChat() {
  const [response, setResponse] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);

  async function handleSubmit(prompt: string) {
    setIsStreaming(true);
    setResponse('');

    try {
      const res = await fetch('https://openrouter.ai/api/v1/chat/completions', {
        method: 'POST',
        headers: {
          // WARNING: never ship an API key in browser code; in production,
          // proxy this request through your own backend instead.
          'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
          'Content-Type': 'application/json',
        },
        body: JSON.stringify({
          model: 'anthropic/claude-3.5-sonnet',
          messages: [{ role: 'user', content: prompt }],
          stream: true,
        }),
      });

      const reader = res.body?.getReader();
      const decoder = new TextDecoder();

      while (true) {
        const { done, value } = await reader!.read();
        if (done) break;

        const chunk = decoder.decode(value);
        const lines = chunk.split('\n').filter(line => line.trim());

        for (const line of lines) {
          if (line.startsWith('data: ')) {
            const data = line.slice(6);
            if (data === '[DONE]') continue;

            try {
              const parsed = JSON.parse(data);
              const content = parsed.choices[0]?.delta?.content || '';
              setResponse(prev => prev + content);
            } catch (e) {
              // Skip invalid JSON
            }
          }
        }
      }
    } catch (error) {
      console.error('Streaming error:', error);
    } finally {
      setIsStreaming(false);
    }
  }

  return (
    <div>
      <textarea
        value={response}
        readOnly
        rows={20}
        cols={80}
        placeholder="Response will appear here..."
      />
      <button onClick={() => handleSubmit('Explain AI')}>
        {isStreaming ? 'Streaming...' : 'Send'}
      </button>
    </div>
  );
}

Function Calling

Basic Function Calling

typescript
const tools = [
  {
    type: 'function',
    function: {
      name: 'get_weather',
      description: 'Get current weather for a location',
      parameters: {
        type: 'object',
        properties: {
          location: {
            type: 'string',
            description: 'City name, e.g. San Francisco',
          },
          unit: {
            type: 'string',
            enum: ['celsius', 'fahrenheit'],
          },
        },
        required: ['location'],
      },
    },
  },
];

async function chatWithFunctions() {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4-turbo',
    messages: [
      { role: 'user', content: 'What is the weather in Tokyo?' }
    ],
    tools,
    tool_choice: 'auto',
  });

  const message = completion.choices[0].message;

  if (message.tool_calls) {
    for (const toolCall of message.tool_calls) {
      console.log('Function:', toolCall.function.name);
      console.log('Arguments:', toolCall.function.arguments);

      // Execute function
      const args = JSON.parse(toolCall.function.arguments);
      const result = await getWeather(args.location, args.unit);

      // Send result back
      const followUp = await client.chat.completions.create({
        model: 'openai/gpt-4-turbo',
        messages: [
          { role: 'user', content: 'What is the weather in Tokyo?' },
          message,
          {
            role: 'tool',
            tool_call_id: toolCall.id,
            content: JSON.stringify(result),
          },
        ],
        tools,
      });

      console.log(followUp.choices[0].message.content);
    }
  }
}

Multi-Step Function Calling

typescript
async function multiStepFunctionCall(userQuery: string) {
  const messages: OpenAI.Chat.ChatCompletionMessageParam[] = [
    { role: 'user', content: userQuery },
  ];
  let iterationCount = 0;
  const maxIterations = 5;

  while (iterationCount < maxIterations) {
    const completion = await client.chat.completions.create({
      model: 'openai/gpt-4-turbo',
      messages,
      tools,
      tool_choice: 'auto',
    });

    const message = completion.choices[0].message;
    messages.push(message);

    if (!message.tool_calls) {
      // No more function calls, return final response
      return message.content;
    }

    // Execute all function calls
    for (const toolCall of message.tool_calls) {
      const functionName = toolCall.function.name;
      const args = JSON.parse(toolCall.function.arguments);

      // Execute function (implement your function registry)
      const result = await executeFunctionCall(functionName, args);

      messages.push({
        role: 'tool',
        tool_call_id: toolCall.id,
        content: JSON.stringify(result),
      });
    }

    iterationCount++;
  }

  throw new Error('Max iterations reached');
}
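`executeFunctionCall` above is left to the reader; one straightforward shape is a name-to-handler map. The registry and the stub handler here are hypothetical, for illustration only:

```typescript
// Hypothetical registry mapping tool names to async handlers.
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

const functionRegistry: Record<string, ToolHandler> = {
  // Stub for illustration; a real handler would call a weather API.
  get_weather: async (args) => ({ location: args.location, tempC: 21 }),
};

async function executeFunctionCall(
  name: string,
  args: Record<string, unknown>
): Promise<unknown> {
  const handler = functionRegistry[name];
  if (!handler) {
    // Returning an error payload lets the model recover instead of crashing the loop.
    return { error: `Unknown function: ${name}` };
  }
  return handler(args);
}
```

Returning errors as data (rather than throwing) keeps the multi-step loop alive so the model can apologize or retry with different arguments.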

Cost Optimization

Token Counting and Cost Estimation

typescript
import { encoding_for_model } from 'tiktoken';

interface CostEstimate {
  promptTokens: number;
  completionTokens: number;
  promptCost: number;
  completionCost: number;
  totalCost: number;
}

const modelPricing: Record<string, { input: number; output: number }> = {
  'anthropic/claude-3.5-sonnet': { input: 3.00, output: 15.00 },  // per 1M tokens
  'anthropic/claude-3-haiku': { input: 0.25, output: 1.25 },
  'openai/gpt-4-turbo': { input: 10.00, output: 30.00 },
  'openai/gpt-3.5-turbo': { input: 0.50, output: 1.50 },
  'google/gemini-flash-1.5': { input: 0.075, output: 0.30 },
};

function estimateCost(
  prompt: string,
  expectedCompletion: number,
  model: string
): CostEstimate {
  const encoder = encoding_for_model('gpt-4');  // Approximation; other providers use different tokenizers
  const promptTokens = encoder.encode(prompt).length;
  encoder.free();  // Release the WASM-backed encoder
  const completionTokens = expectedCompletion;

  const pricing = modelPricing[model] || { input: 0, output: 0 };

  const promptCost = (promptTokens / 1_000_000) * pricing.input;
  const completionCost = (completionTokens / 1_000_000) * pricing.output;

  return {
    promptTokens,
    completionTokens,
    promptCost,
    completionCost,
    totalCost: promptCost + completionCost,
  };
}

// Usage
const estimate = estimateCost(
  'Explain quantum computing',
  500,  // Expected response tokens
  'anthropic/claude-3.5-sonnet'
);

console.log(`Estimated cost: $${estimate.totalCost.toFixed(4)}`);

Dynamic Model Selection by Budget

typescript
async function budgetOptimizedChat(
  prompt: string,
  maxCostPerRequest: number = 0.01  // $0.01 max
) {
  // Estimate with expensive model
  const expensiveEstimate = estimateCost(
    prompt,
    1000,
    'anthropic/claude-3.5-sonnet'
  );

  let selectedModel = 'anthropic/claude-3.5-sonnet';

  if (expensiveEstimate.totalCost > maxCostPerRequest) {
    // Try cheaper models
    const cheapEstimate = estimateCost(
      prompt,
      1000,
      'anthropic/claude-3-haiku'
    );

    if (cheapEstimate.totalCost > maxCostPerRequest) {
      selectedModel = 'google/gemini-flash-1.5';
    } else {
      selectedModel = 'anthropic/claude-3-haiku';
    }
  }

  console.log(`Selected model: ${selectedModel}`);

  const completion = await client.chat.completions.create({
    model: selectedModel,
    messages: [{ role: 'user', content: prompt }],
  });

  return completion.choices[0].message.content;
}

Batching for Cost Reduction

typescript
async function batchProcess(prompts: string[], model: string) {
  // Process multiple prompts in parallel with rate limiting
  const concurrency = 5;
  const results = [];

  for (let i = 0; i < prompts.length; i += concurrency) {
    const batch = prompts.slice(i, i + concurrency);

    const batchResults = await Promise.all(
      batch.map(prompt =>
        client.chat.completions.create({
          model,
          messages: [{ role: 'user', content: prompt }],
          max_tokens: 500,  // Limit tokens to control cost
        })
      )
    );

    results.push(...batchResults);

    // Rate limiting delay
    if (i + concurrency < prompts.length) {
      await new Promise(resolve => setTimeout(resolve, 1000));
    }
  }

  return results;
}

Model Fallback and Retry Strategy

Automatic Fallback

typescript
const modelFallbackChain = [
  'anthropic/claude-3.5-sonnet',
  'openai/gpt-4-turbo',
  'anthropic/claude-3-haiku',
  'google/gemini-flash-1.5',
];

async function chatWithFallback(prompt: string): Promise<string> {
  for (const model of modelFallbackChain) {
    try {
      console.log(`Trying model: ${model}`);

      const completion = await client.chat.completions.create({
        model,
        messages: [{ role: 'user', content: prompt }],
        max_tokens: 2000,
      });

      return completion.choices[0].message.content || '';
    } catch (error) {
      console.warn(`Model ${model} failed:`, error);
      // Continue to the next model in the chain
    }
  }

  throw new Error('All models failed');
}

Exponential Backoff for Rate Limits

typescript
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxRetries: number = 5
): Promise<T> {
  let lastError: Error;

  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;

      // Retry only on rate-limit (HTTP 429) errors
      const status = (error as { status?: number }).status;
      if (status === 429) {
        const delay = Math.pow(2, i) * 1000;  // Exponential backoff: 1s, 2s, 4s, ...
        console.log(`Rate limited. Retrying in ${delay}ms...`);
        await new Promise(resolve => setTimeout(resolve, delay));
      } else {
        throw error;  // Non-retryable error
      }
    }
    }
  }

  throw lastError!;
}

// Usage
const result = await retryWithBackoff(() =>
  client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: 'Hello' }],
  })
);

Prompt Engineering Best Practices

System Prompts for Consistency

typescript
const systemPrompts = {
  concise: 'You are a helpful assistant. Be concise and direct.',
  detailed: 'You are a knowledgeable expert. Provide comprehensive answers with examples.',
  code: 'You are an expert programmer. Provide clean, well-commented code with explanations.',
  creative: 'You are a creative writing assistant. Be imaginative and engaging.',
};

async function chatWithPersonality(
  prompt: string,
  personality: keyof typeof systemPrompts
) {
  const completion = await client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [
      { role: 'system', content: systemPrompts[personality] },
      { role: 'user', content: prompt },
    ],
  });

  return completion.choices[0].message.content;
}

Few-Shot Prompting

typescript
async function fewShotClassification(text: string) {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4-turbo',
    messages: [
      {
        role: 'system',
        content: 'Classify text sentiment as positive, negative, or neutral.',
      },
      { role: 'user', content: 'I love this product!' },
      { role: 'assistant', content: 'positive' },
      { role: 'user', content: 'This is terrible.' },
      { role: 'assistant', content: 'negative' },
      { role: 'user', content: 'It works fine.' },
      { role: 'assistant', content: 'neutral' },
      { role: 'user', content: text },
    ],
  });

  return completion.choices[0].message.content;
}
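Few-shot labels come back as free text, so replies occasionally carry stray casing or punctuation. A small validator (a hypothetical helper, not part of the OpenAI SDK) keeps downstream code from silently accepting a malformed label:

```typescript
const LABELS = ['positive', 'negative', 'neutral'] as const;
type Sentiment = typeof LABELS[number];

// Normalize a raw model reply to one of the expected labels, or null.
function parseSentiment(raw: string | null): Sentiment | null {
  const cleaned = (raw ?? '').trim().toLowerCase().replace(/[.!]/g, '');
  return (LABELS as readonly string[]).includes(cleaned)
    ? (cleaned as Sentiment)
    : null;
}
```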

Chain of Thought Prompting

typescript
async function reasoningTask(problem: string) {
  const completion = await client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [
      {
        role: 'user',
        content: `${problem}\n\nLet's solve this step by step:\n1.`,
      },
    ],
    max_tokens: 3000,
  });

  return completion.choices[0].message.content;
}
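A chain-of-thought reply interleaves reasoning with the conclusion. When only the conclusion is needed, one crude heuristic (an assumption — it relies on the model putting its answer last, which is typical but not guaranteed) is to take the last non-empty line:

```typescript
// Heuristic: return the last non-empty line of a chain-of-thought reply.
function lastNonEmptyLine(text: string | null): string {
  const lines = (text ?? '')
    .split('\n')
    .map(l => l.trim())
    .filter(Boolean);
  return lines.length > 0 ? lines[lines.length - 1] : '';
}
```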

Rate Limits and Throttling

Rate Limit Handler

typescript
class RateLimitedClient {
  private requestQueue: Array<() => Promise<any>> = [];
  private processing = false;
  private requestsPerMinute = 60;
  private requestInterval = 60000 / this.requestsPerMinute;

  async enqueue<T>(request: () => Promise<T>): Promise<T> {
    return new Promise((resolve, reject) => {
      this.requestQueue.push(async () => {
        try {
          const result = await request();
          resolve(result);
        } catch (error) {
          reject(error);
        }
      });

      this.processQueue();
    });
  }

  private async processQueue() {
    if (this.processing || this.requestQueue.length === 0) return;

    this.processing = true;

    while (this.requestQueue.length > 0) {
      const request = this.requestQueue.shift()!;
      await request();
      await new Promise(resolve => setTimeout(resolve, this.requestInterval));
    }

    this.processing = false;
  }
}

// Usage
const rateLimitedClient = new RateLimitedClient();

const result = await rateLimitedClient.enqueue(() =>
  client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: 'Hello' }],
  })
);
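The queue above spaces requests evenly; a token bucket is an alternative that permits short bursts while still capping the average rate. A minimal sketch with an injectable clock so it can be tested deterministically (the capacity and refill figures are illustrative — check your account's actual limits):

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private capacity: number,      // max burst size
    private refillPerMs: number,   // tokens added per millisecond
    private now: () => number = Date.now,
  ) {
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Returns true and consumes a token if a request may proceed now.
  tryAcquire(): boolean {
    const t = this.now();
    this.tokens = Math.min(
      this.capacity,
      this.tokens + (t - this.lastRefill) * this.refillPerMs,
    );
    this.lastRefill = t;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

For roughly 60 requests/minute, `new TokenBucket(5, 60 / 60000)` would allow bursts of 5 while refilling one token per second.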

Vision Models

Image Understanding

typescript
async function analyzeImage(imageUrl: string, question: string) {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4-vision-preview',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: question },
          { type: 'image_url', image_url: { url: imageUrl } },
        ],
      },
    ],
    max_tokens: 1000,
  });

  return completion.choices[0].message.content;
}

// Usage
const result = await analyzeImage(
  'https://example.com/image.jpg',
  'What objects are in this image?'
);
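For local files there is no public URL to pass. Vision-capable chat endpoints generally also accept base64 `data:` URLs in the `image_url` field (worth confirming against OpenRouter's current docs); a sketch of the encoding, with a hypothetical file path:

```typescript
import { readFileSync } from 'node:fs';

// Encode a local image file as a data URL for the image_url field.
function toDataUrl(path: string, mime = 'image/jpeg'): string {
  const base64 = readFileSync(path).toString('base64');
  return `data:${mime};base64,${base64}`;
}

// Same encoding, from an in-memory buffer:
function bufferToDataUrl(buf: Buffer, mime = 'image/png'): string {
  return `data:${mime};base64,${buf.toString('base64')}`;
}
```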

Multi-Image Analysis

typescript
async function compareImages(imageUrls: string[]) {
  const completion = await client.chat.completions.create({
    model: 'openai/gpt-4-vision-preview',
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Compare these images and describe the differences:' },
          ...imageUrls.map(url => ({
            type: 'image_url' as const,
            image_url: { url },
          })),
        ],
      },
    ],
  });

  return completion.choices[0].message.content;
}

Error Handling and Monitoring

Comprehensive Error Handler

typescript
interface ErrorResponse {
  error: {
    message: string;
    type: string;
    code: string;
  };
}

async function robustCompletion(prompt: string) {
  try {
    const completion = await client.chat.completions.create({
      model: 'anthropic/claude-3.5-sonnet',
      messages: [{ role: 'user', content: prompt }],
    });

    return completion.choices[0].message.content;
  } catch (error: any) {
    // Rate limit errors
    if (error.status === 429) {
      console.error('Rate limit exceeded. Please wait.');
      throw new Error('RATE_LIMIT_EXCEEDED');
    }

    // Invalid API key
    if (error.status === 401) {
      console.error('Invalid API key');
      throw new Error('INVALID_API_KEY');
    }

    // Model not found
    if (error.status === 404) {
      console.error('Model not found');
      throw new Error('MODEL_NOT_FOUND');
    }

    // Server errors
    if (error.status >= 500) {
      console.error('OpenRouter server error');
      throw new Error('SERVER_ERROR');
    }

    // Unknown error
    console.error('Unknown error:', error);
    throw error;
  }
}
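The status-code branching above can be factored into a pure mapper that is easy to unit-test without a network call (the error-code strings are the ones this snippet defines, not official OpenRouter codes):

```typescript
// Map an HTTP status to the error codes used by robustCompletion.
function classifyStatus(status: number | undefined): string {
  if (status === 429) return 'RATE_LIMIT_EXCEEDED';
  if (status === 401) return 'INVALID_API_KEY';
  if (status === 404) return 'MODEL_NOT_FOUND';
  if (status !== undefined && status >= 500) return 'SERVER_ERROR';
  return 'UNKNOWN';
}
```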

Request/Response Logging

typescript
class LoggingClient {
  async chat(prompt: string, model: string) {
    const startTime = Date.now();

    console.log('[Request]', {
      timestamp: new Date().toISOString(),
      model,
      promptLength: prompt.length,
    });

    try {
      const completion = await client.chat.completions.create({
        model,
        messages: [{ role: 'user', content: prompt }],
      });

      const duration = Date.now() - startTime;

      console.log('[Response]', {
        timestamp: new Date().toISOString(),
        duration,
        usage: completion.usage,
        finishReason: completion.choices[0].finish_reason,
      });

      return completion;
    } catch (error) {
      console.error('[Error]', {
        timestamp: new Date().toISOString(),
        duration: Date.now() - startTime,
        error,
      });
      throw error;
    }
  }
}

Best Practices

  1. Model Selection:
    • Use fast models (Haiku, Flash) for simple tasks
    • Use flagship models (Sonnet, GPT-4) for complex reasoning
    • Consider context size requirements
    • Test multiple models for your use case
  2. Cost Optimization:
    • Estimate costs before requests
    • Use cheaper models when possible
    • Implement token limits
    • Cache common responses
    • Batch similar requests
  3. Streaming:
    • Always use streaming for user-facing apps
    • Handle connection interruptions
    • Show progress indicators
    • Buffer partial responses
  4. Error Handling:
    • Implement retry logic with exponential backoff
    • Use model fallbacks for reliability
    • Log all errors for debugging
    • Handle rate limits gracefully
  5. Prompt Engineering:
    • Use system prompts for consistency
    • Implement few-shot learning for specific tasks
    • Use chain-of-thought for complex reasoning
    • Keep prompts concise to reduce costs
  6. Rate Limiting:
    • Respect API rate limits
    • Implement request queuing
    • Use exponential backoff
    • Monitor usage metrics
  7. Security:
    • Never expose API keys in client code
    • Use environment variables
    • Implement server-side proxies
    • Validate user inputs
  8. Monitoring:
    • Track token usage
    • Monitor response times
    • Log errors and failures
    • Analyze model performance
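On the security point, API keys leak through logs as easily as through client bundles. A small redaction helper, sketched against the `sk-or-v1-...` key format shown earlier (real keys may use a wider character set, so treat the pattern as an assumption):

```typescript
// Replace OpenRouter-style API keys with a redacted placeholder before logging.
function redactKeys(text: string): string {
  return text.replace(/sk-or-v1-[A-Za-z0-9]+/g, 'sk-or-v1-***');
}
```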

Common Pitfalls

Exposing API keys in frontend:
typescript
// WRONG - API key exposed
const client = new OpenAI({
  baseURL: 'https://openrouter.ai/api/v1',
  apiKey: 'sk-or-v1-...',  // Exposed!
});
Correct - Server-side proxy:
typescript
// Backend proxy
app.post('/api/chat', async (req, res) => {
  const { prompt } = req.body;

  const completion = await client.chat.completions.create({
    model: 'anthropic/claude-3.5-sonnet',
    messages: [{ role: 'user', content: prompt }],
  });

  res.json(completion);
});
Not handling streaming errors:
typescript
// WRONG - no error handling
for await (const chunk of stream) {
  console.log(chunk.choices[0].delta.content);
}
Correct - with error handling:
typescript
try {
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || '';
    process.stdout.write(content);
  }
} catch (error) {
  console.error('Stream error:', error);
  // Implement retry or fallback
}
Ignoring rate limits:
typescript
// WRONG - no rate limiting
const promises = prompts.map(prompt => chat(prompt));
await Promise.all(promises);  // May hit rate limits
Correct - with rate limiting:
typescript
const results = [];
for (let i = 0; i < prompts.length; i += 5) {
  const batch = prompts.slice(i, i + 5);
  const batchResults = await Promise.all(batch.map(chat));
  results.push(...batchResults);
  await new Promise(r => setTimeout(r, 1000));  // Delay between batches
}
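The batching loop above can be generalized into a reusable helper, which keeps the batch size in one place:

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```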

Performance Optimization

Caching Responses

typescript
const responseCache = new Map<string, string>();

async function cachedChat(prompt: string, model: string) {
  const cacheKey = `${model}:${prompt}`;

  if (responseCache.has(cacheKey)) {
    console.log('Cache hit');
    return responseCache.get(cacheKey)!;
  }

  const completion = await client.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }],
  });

  const response = completion.choices[0].message.content || '';
  responseCache.set(cacheKey, response);

  return response;
}
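Raw prompts make long map keys, and an unbounded `Map` grows forever. Hashing the key keeps entries compact (SHA-256 via Node's built-in `crypto`); eviction (TTL or LRU) is left out of this sketch:

```typescript
import { createHash } from 'node:crypto';

// Derive a fixed-length cache key from model + prompt.
function cacheKey(model: string, prompt: string): string {
  return createHash('sha256').update(`${model}:${prompt}`).digest('hex');
}
```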

Parallel Processing

typescript
// Note: for large prompt lists, batch these calls (see "Ignoring rate limits" above).
async function parallelChat(prompts: string[], model: string) {
  const results = await Promise.all(
    prompts.map(prompt =>
      client.chat.completions.create({
        model,
        messages: [{ role: 'user', content: prompt }],
      })
    )
  );

  return results.map(r => r.choices[0].message.content);
}

Resources

Related Skills

  • MCP Servers: Integration with Model Context Protocol (when built)
  • TypeScript API Integration: Type-safe OpenRouter clients
  • Python API Integration: Python SDK usage patterns

Summary

  • OpenRouter provides unified access to 200+ LLMs
  • OpenAI-compatible API for easy migration
  • Cost optimization through model selection and token management
  • Streaming for responsive user experiences
  • Function calling for tool integration
  • Vision models for image understanding
  • Fallback strategies for reliability
  • Rate limiting and error handling essential
  • Perfect for multi-model apps, cost-sensitive deployments, avoiding vendor lock-in