google-gemini-api

Google Gemini API - Complete Guide

Package: @google/genai@1.27.0 (⚠️ NOT @google/generative-ai)
Last Updated: 2025-11-21

⚠️ CRITICAL SDK MIGRATION WARNING

DEPRECATED SDK: @google/generative-ai (sunset November 30, 2025)
CURRENT SDK: @google/genai v1.27+
If you see code using @google/generative-ai, it's outdated!
Load references/sdk-migration-guide.md for complete migration steps.

Quick Start

Installation

✅ CORRECT SDK:
bash
bun add @google/genai@1.27.0
❌ WRONG (DEPRECATED):
bash
bun add @google/generative-ai  # DO NOT USE!

Environment Setup

bash
export GEMINI_API_KEY="your-api-key"
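
A missing key otherwise surfaces only later as a 401 from the API, so it can help to fail fast at startup. The helper below is a minimal sketch of that check; `requireApiKey` and the `Env` type are hypothetical names for illustration and are not part of @google/genai.

```typescript
// Hypothetical helper (not part of @google/genai): fail fast when the key is absent.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env, name: string = "GEMINI_API_KEY"): string {
  // Treat a whitespace-only value the same as a missing one.
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`${name} is not set; export it before starting the app`);
  }
  return value;
}
```

In Node.js or Bun this would typically be called as `requireApiKey(process.env)` and the result passed as `apiKey` to the `GoogleGenAI` constructor.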

First Text Generation

typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
See Full Template: templates/basic-usage.ts

Current Models (2025)

gemini-2.5-flash ⭐ RECOMMENDED

  • Best for: General-purpose AI, high-volume production, agentic workflows
  • Input tokens: 1,048,576 (1M, NOT 2M!)
  • Output tokens: 65,536
  • Rate limit (free): 10 RPM, 250k TPM
  • Cost: Input $0.075/1M tokens, Output $0.30/1M tokens
  • Features: Thinking mode, function calling, multimodal, streaming

gemini-2.5-pro

  • Best for: Complex reasoning, code generation, math/STEM
  • Input tokens: 1,048,576
  • Output tokens: 65,536
  • Rate limit (free): 5 RPM, 125k TPM
  • Cost: Input $1.25/1M tokens, Output $5/1M tokens

gemini-2.5-flash-lite

  • Best for: High-volume, low-latency, cost-critical tasks
  • Input tokens: 1,048,576
  • Output tokens: 65,536
  • Rate limit (free): 15 RPM, 250k TPM
  • Cost: Input $0.01/1M tokens, Output $0.04/1M tokens
  • ⚠️ Limitation: NO function calling or code execution support
⚠️ Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. It's 1,048,576 (1M).
Load references/models-guide.md for detailed model comparison and selection criteria.
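
To compare the three models on cost, the per-million-token prices listed above can be turned into a rough estimate. This is a sketch only: the `PRICING` table copies the figures quoted in this guide, which are a snapshot and should be checked against the current pricing page, and `estimateCostUsd` is a hypothetical helper name.

```typescript
interface Pricing {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Figures copied from the model list in this guide (assumed, may be out of date).
const PRICING = new Map<string, Pricing>([
  ["gemini-2.5-flash", { inputPerMillion: 0.075, outputPerMillion: 0.3 }],
  ["gemini-2.5-pro", { inputPerMillion: 1.25, outputPerMillion: 5 }],
  ["gemini-2.5-flash-lite", { inputPerMillion: 0.01, outputPerMillion: 0.04 }],
]);

function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICING.get(model);
  if (!price) {
    throw new Error(`No pricing entry for model: ${model}`);
  }
  return (
    (inputTokens / 1e6) * price.inputPerMillion +
    (outputTokens / 1e6) * price.outputPerMillion
  );
}
```

With these figures, one million input tokens plus one million output tokens on gemini-2.5-pro comes to 1.25 + 5 = 6.25 USD.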

Text Generation

Basic Generation

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about programming'
});

console.log(response.text);

With Configuration

typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain AI',
  config: {                  // @google/genai takes generation settings under `config`
    temperature: 0.7,        // 0.0-2.0, default 1.0
    topP: 0.95,              // 0.0-1.0
    topK: 40,                // 1-100
    maxOutputTokens: 1024,
    stopSequences: ['END']
  }
});
Load references/generation-config.md for complete parameter reference and tuning guidance.
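
Out-of-range values are easier to catch before a request is sent. The sketch below checks the ranges given in the comments above (temperature 0.0-2.0, topP 0.0-1.0, topK 1-100) and the 65,536 output-token ceiling from the model list. `validateGenerationParams` and `GenerationParams` are hypothetical names, and the ranges are taken from this guide rather than enforced by the SDK.

```typescript
interface GenerationParams {
  temperature?: number;
  topP?: number;
  topK?: number;
  maxOutputTokens?: number;
}

// Records a problem when a provided value falls outside [min, max].
function checkRange(
  name: string,
  value: number | undefined,
  min: number,
  max: number,
  problems: string[],
): void {
  if (value === undefined) return; // unset parameters use the API default
  if (!Number.isFinite(value) || value < min || value > max) {
    problems.push(`${name} must be between ${min} and ${max}, got ${value}`);
  }
}

// Returns a list of human-readable problems; an empty list means the values look sane.
function validateGenerationParams(params: GenerationParams): string[] {
  const problems: string[] = [];
  checkRange("temperature", params.temperature, 0, 2, problems);
  checkRange("topP", params.topP, 0, 1, problems);
  checkRange("topK", params.topK, 1, 100, problems);
  if (params.topK !== undefined && !Number.isInteger(params.topK)) {
    problems.push("topK must be an integer");
  }
  checkRange("maxOutputTokens", params.maxOutputTokens, 1, 65536, problems);
  return problems;
}
```

Returning a list instead of throwing on the first problem lets a caller report every bad parameter at once.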

Streaming

typescript
const stream = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a long story'
});

for await (const chunk of stream) {
  // chunk.text can be undefined for chunks that carry no text
  process.stdout.write(chunk.text ?? '');
}
Load references/streaming-patterns.md for Fetch/SSE implementation patterns (Cloudflare Workers).
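
When the SDK is not available (for example in a fetch-only runtime), the stream arrives as Server-Sent Events. The sketch below shows only the parsing step, under two assumptions that are not stated in this guide: the REST streaming endpoint is called with `alt=sse`, and each `data:` line holds one JSON chunk with text under `candidates[0].content.parts[].text`. `extractTextFromSse` is a hypothetical helper name.

```typescript
// Assumed shape of one streamed chunk; only the fields read here are declared.
interface StreamChunk {
  candidates?: { content?: { parts?: { text?: string }[] } }[];
}

// Concatenates the text of every `data:` event found in a block of SSE text.
function extractTextFromSse(raw: string): string {
  let out = "";
  for (const line of raw.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines, comments, other fields
    const payload = line.slice("data:".length).trim();
    if (payload === "") continue;
    const chunk = JSON.parse(payload) as StreamChunk;
    for (const part of chunk.candidates?.[0]?.content?.parts ?? []) {
      out += part.text ?? "";
    }
  }
  return out;
}
```

A real reader would also need to buffer partial lines, since a network chunk can end in the middle of a `data:` line; this sketch assumes complete lines.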

Multimodal Inputs

Images

typescript
// imageBytes: the raw image bytes (e.g. a Buffer or Uint8Array read from disk)
const imageData = Buffer.from(imageBytes).toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    { text: 'What is in this image?' },
    {
      inlineData: {
        mimeType: 'image/jpeg',  // or image/png, image/webp
        data: imageData
      }
    }
  ]
});

Video, Audio, PDFs

Same pattern - use appropriate mimeType:
  • Video: video/mp4, video/mpeg, video/mov
  • Audio: audio/wav, audio/mp3, audio/flac
  • PDFs: application/pdf
Load references/multimodal-guide.md for format specifications, size limits, and best practices.
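
Since every media type uses the same `inlineData` shape, choosing the mimeType can be centralized. The sketch below maps file extensions to the MIME types listed above and builds the part object; `mimeTypeFor` and `toInlineDataPart` are hypothetical helper names, and the extension table covers only the formats named in this guide.

```typescript
// Extension -> MIME type, limited to the formats listed in this guide.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["webp", "image/webp"],
  ["mp4", "video/mp4"],
  ["mpeg", "video/mpeg"],
  ["mov", "video/mov"],
  ["wav", "audio/wav"],
  ["mp3", "audio/mp3"],
  ["flac", "audio/flac"],
  ["pdf", "application/pdf"],
]);

function mimeTypeFor(fileName: string): string {
  const dot = fileName.lastIndexOf(".");
  const ext = dot >= 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  const mime = MIME_BY_EXTENSION.get(ext);
  if (!mime) {
    throw new Error(`Unsupported or unknown file extension: "${ext}"`);
  }
  return mime;
}

// Builds the part object used in `contents`, given already base64-encoded data.
function toInlineDataPart(fileName: string, base64Data: string) {
  return { inlineData: { mimeType: mimeTypeFor(fileName), data: base64Data } };
}
```

The returned object can be placed in the `contents` array next to a `{ text: ... }` part, exactly as in the image example.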

Function Calling

Basic Pattern

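typescript
import { Type } from '@google/genai';

const prompt = 'What is the weather in San Francisco?';

// In @google/genai, tools are passed inside `config`
const config = {
  tools: [{
    functionDeclarations: [{
      name: 'getWeather',
      description: 'Get current weather for a location',
      parameters: {
        type: Type.OBJECT,
        properties: {
          location: { type: Type.STRING, description: 'City name' },
          unit: { type: Type.STRING, enum: ['celsius', 'fahrenheit'] }
        },
        required: ['location']
      }
    }]
  }]
};

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt,
  config
});

// Handle function call
const call = response.functionCalls?.[0];
const modelTurn = response.candidates?.[0]?.content;  // model turn holding the functionCall
if (call && modelTurn) {
  // getWeather is your own implementation; it should return a plain object
  const result = await getWeather(call.args);

  // Send result back to model, together with the conversation so far
  const final = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      { role: 'user', parts: [{ text: prompt }] },
      modelTurn,
      {
        role: 'user',
        parts: [{ functionResponse: { name: call.name, response: result } }]
      }
    ],
    config
  });

  console.log(final.text);
}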
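
The model fills in `call.args` itself, so it can be worth checking the arguments against the declaration before running real code. Below is a minimal sketch of that check for flat parameter lists like the getWeather example. `FunctionSpec`, `ParamSpec`, and `validateCallArgs` are hypothetical names for illustration and use a simplified schema, not the SDK's own types.

```typescript
interface ParamSpec {
  type: "string" | "number" | "boolean";
  enum?: string[];
}

interface FunctionSpec {
  name: string;
  properties: Record<string, ParamSpec>;
  required: string[];
}

// Simplified mirror of the getWeather declaration from the example above.
const getWeatherSpec: FunctionSpec = {
  name: "getWeather",
  properties: {
    location: { type: "string" },
    unit: { type: "string", enum: ["celsius", "fahrenheit"] },
  },
  required: ["location"],
};

// Returns a list of problems; an empty list means the arguments match the spec.
function validateCallArgs(spec: FunctionSpec, args: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of spec.required) {
    if (args[key] === undefined) problems.push(`missing required argument: ${key}`);
  }
  for (const key of Object.keys(args)) {
    if (!Object.prototype.hasOwnProperty.call(spec.properties, key)) {
      problems.push(`unexpected argument: ${key}`);
      continue;
    }
    const param = spec.properties[key];
    const value = args[key];
    if (typeof value !== param.type) {
      problems.push(`${key} should be of type ${param.type}`);
    } else if (param.enum && param.enum.indexOf(String(value)) === -1) {
      problems.push(`${key} must be one of: ${param.enum.join(", ")}`);
    }
  }
  return problems;
}
```

If the list is non-empty, one option is to return the problems to the model as the function response so that it can retry with corrected arguments.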

Parallel Function Calling

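Gemini can request multiple function calls in a single response:
typescript
const prompt = 'What is the weather in SF and NY?';
// In @google/genai, tools are passed inside `config`
const config = { tools: [{ functionDeclarations: [getWeatherDeclaration] }] };

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt,
  config
});

const modelTurn = response.candidates?.[0]?.content;  // model turn holding the function calls
if (!modelTurn) throw new Error('No candidate returned');

// Process all function calls in parallel
const results = await Promise.all(
  (response.functionCalls ?? []).map(call =>
    getWeather(call.args).then(result => ({
      name: call.name,
      response: result
    }))
  )
);

// Send all results back, together with the conversation so far
const final = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    { role: 'user', parts: [{ text: prompt }] },
    modelTurn,
    { role: 'user', parts: results.map(r => ({ functionResponse: r })) }
  ],
  config
});
Load references/function-calling-patterns.md for calling modes (AUTO/ANY/NONE) and compositional patterns.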

Multi-turn Chat

typescript
// Chats live under ai.chats in @google/genai
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: 'You are a helpful programming assistant'
  },
  history: []
});

let response = await chat.sendMessage({ message: 'Hello!' });
console.log(response.text);

response = await chat.sendMessage({ message: 'Explain async/await' });
console.log(response.text);

// Get full history
console.log(chat.getHistory());

System Instructions

Set persistent instructions for the model:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather today?',
  config: {
    systemInstruction: 'You are a pirate. Always respond in pirate speak.'
  }
});

Thinking Mode

Gemini 2.5 models have thinking mode built in and enabled by default. Configure the thinking budget for complex tasks:
typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Solve this math problem: If x + 2y = 10 and 3x - y = 4, what is x?',
  config: {
    thinkingConfig: {
      thinkingBudget: 8192  // Max tokens for internal reasoning
    }
  }
});
Use for: Complex math, logic puzzles, multi-step reasoning, code debugging
Load references/thinking-mode-guide.md for thinking budget optimization.

Top 5 Critical Errors

Error 1: Using Deprecated SDK

Error: Deprecation warnings or outdated API
Solution: Use @google/genai, NOT @google/generative-ai
bash
bun remove @google/generative-ai
bun add @google/genai@1.27.0

Error 2: Invalid API Key (401)

Error: API key not valid
Solution: Verify environment variable
bash
export GEMINI_API_KEY="your-key"

Error 3: Model Not Found (404)

Error: models/gemini-3.0-flash is not found
Solution: Use correct model names (2025)
typescript
'gemini-2.5-pro'
'gemini-2.5-flash'
'gemini-2.5-flash-lite'

Error 4: Context Length Exceeded (400)

Error: Request payload size exceeds the limit
Solution: Input limit is 1,048,576 tokens (1M, NOT 2M). Use context caching for large inputs.
Load references/context-caching-guide.md for caching implementation.
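
A rough pre-flight check can flag oversized inputs before the request is sent. The sketch below uses a crude heuristic of about four characters per token, which is an assumption and not a Gemini tokenizer; for an exact figure, the SDK's `ai.models.countTokens` call is the better source. `estimateTokens` and `checkInputBudget` are hypothetical helper names.

```typescript
// Input limit quoted in this guide for the Gemini 2.5 models.
const INPUT_TOKEN_LIMIT = 1048576;

// Crude estimate: ~4 characters per token (assumption; varies by language and content).
function estimateTokens(text: string, charsPerToken: number = 4): number {
  return Math.ceil(text.length / charsPerToken);
}

interface BudgetCheck {
  estimatedTokens: number;
  fits: boolean;
  overBy: number; // 0 when the input fits
}

function checkInputBudget(text: string, limit: number = INPUT_TOKEN_LIMIT): BudgetCheck {
  const estimatedTokens = estimateTokens(text);
  const overBy = Math.max(0, estimatedTokens - limit);
  return { estimatedTokens, fits: overBy === 0, overBy };
}
```

Because the estimate is approximate, it is safer to treat inputs near the limit as too large and to confirm with an exact token count.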

Error 5: Rate Limit Exceeded (429)

Error: Resource has been exhausted
Solution: Implement exponential backoff
typescript
async function generateWithRetry(request, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await ai.models.generateContent(request);
    } catch (error) {
      if (error.status === 429 && i < maxRetries - 1) {
        const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      throw error;
    }
  }
}
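
The retry helper above reacts after a 429 has already happened. A client-side limiter can complement it by staying under the requests-per-minute figures listed for each model (for example 10 RPM on the free tier for gemini-2.5-flash). The class below is a minimal sliding-window sketch with an injected clock; `RpmLimiter` is a hypothetical name, and it does not track tokens per minute.

```typescript
class RpmLimiter {
  // Send times (ms) of the requests that fall inside the current window.
  private readonly timestamps: number[] = [];

  constructor(
    private readonly maxPerWindow: number,
    private readonly windowMs: number = 60000,
  ) {}

  /**
   * Returns 0 and records the request if it may be sent at `nowMs`;
   * otherwise returns how many milliseconds to wait before trying again.
   */
  msUntilAllowed(nowMs: number): number {
    // Forget requests that have left the window.
    while (this.timestamps.length > 0 && nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
    if (this.timestamps.length < this.maxPerWindow) {
      this.timestamps.push(nowMs);
      return 0;
    }
    // The oldest request in the window determines when a slot frees up.
    return this.windowMs - (nowMs - this.timestamps[0]);
  }
}
```

A caller would typically pass `Date.now()`, sleep for the returned number of milliseconds when it is non-zero, and then ask again before sending.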

See All 22 Errors: Load references/error-catalog.md for complete error catalog with solutions.
Quick Debugging: Load references/top-errors.md for debugging checklist.

When to Load References

Load reference files when you need detailed guidance on specific features:

Core Features (Load When Needed)

  • SDK Migration: Load references/sdk-migration-guide.md when migrating from @google/generative-ai
  • Model Selection: Load references/models-guide.md when choosing between Pro/Flash/Flash-Lite
  • Error Debugging: Load references/error-catalog.md or references/top-errors.md when encountering errors

Advanced Features (Load When Implementing)

  • Context Caching: Load references/context-caching-guide.md when implementing cost optimization for large/repeated inputs
  • Code Execution: Load references/code-execution-patterns.md when enabling Python code execution for calculations/analysis
  • Grounding (Google Search): Load references/grounding-guide.md when connecting model to real-time web information
  • Streaming Implementation: Load references/streaming-patterns.md when implementing SSE parsing for Cloudflare Workers
  • Function Calling Modes: Load references/function-calling-patterns.md when using AUTO/ANY/NONE modes or compositional patterns
  • Multimodal Formats: Load references/multimodal-guide.md when working with images/video/audio/PDFs (format specs, size limits)
  • Generation Tuning: Load references/generation-config.md when fine-tuning temperature/topP/topK parameters
  • Thinking Mode Config: Load references/thinking-mode-guide.md when optimizing thinking budget for complex reasoning
General Rule: SKILL.md provides Quick Start and Top Errors. Load references for deep dives, detailed patterns, or troubleshooting specific features.

Bundled Resources

Templates (templates/):
  • basic-usage.ts - Complete examples for all features (133 lines)
References (references/):
  • error-catalog.md - All 7 documented errors with solutions (231 lines)
  • top-errors.md - Quick debugging checklist for 22 common errors (305 lines)
  • sdk-migration-guide.md - Complete migration from deprecated SDK (236 lines)
  • models-guide.md - Detailed model comparison and selection guide (247 lines)
  • context-caching-guide.md - Cost optimization with caching (374 lines)
  • code-execution-patterns.md - Python code execution guide (482 lines)
  • grounding-guide.md - Google Search integration (603 lines)
  • streaming-patterns.md - SSE implementation for Cloudflare Workers (82 lines)
  • function-calling-patterns.md - Advanced function calling patterns (60 lines)
  • multimodal-guide.md - Format specifications and limits (59 lines)
  • generation-config.md - Parameter tuning reference (58 lines)
  • thinking-mode-guide.md - Thinking budget optimization (60 lines)

Integration with Other Skills

This skill composes well with:
  • cloudflare-worker-base → Deploy to Cloudflare Workers
  • ai-sdk-core → Vercel AI SDK integration
  • openai-api → Multi-provider AI setup
  • google-gemini-embeddings → Text embeddings

Additional Resources

Official Documentation:

Production Tested: AI chatbots, content generation, multimodal analysis
Last Updated: 2025-10-25
Token Savings: ~65% (reduces API docs + examples)