Google Gemini API - Complete Guide
Version: 3.0.0 (14 Known Issues Added)
Package: @google/genai@1.35.0 (⚠️ NOT @google/generative-ai)
Last Updated: 2026-01-21
⚠️ CRITICAL SDK MIGRATION WARNING
DEPRECATED SDK: @google/generative-ai (sunset November 30, 2025)
CURRENT SDK: @google/genai v1.27+

If you see code using @google/generative-ai, it's outdated!
This skill uses the correct current SDK and provides a complete migration guide.
Status
✅ Phase 1 Complete:
- ✅ Text Generation (basic + streaming)
- ✅ Multimodal Inputs (images, video, audio, PDFs)
- ✅ Function Calling (basic + parallel execution)
- ✅ System Instructions & Multi-turn Chat
- ✅ Thinking Mode Configuration
- ✅ Generation Parameters (temperature, top-p, top-k, stop sequences)
- ✅ Both Node.js SDK (@google/genai) and fetch approaches
✅ Phase 2 Complete:
- ✅ Context Caching (cost optimization with TTL-based caching)
- ✅ Code Execution (built-in Python interpreter and sandbox)
- ✅ Grounding with Google Search (real-time web information + citations)
📦 Separate Skills:
- Embeddings: See the google-gemini-embeddings skill for text-embedding-004
Table of Contents
Phase 1 - Core Features:
1. Quick Start
2. Current Models (2025)
3. SDK vs Fetch Approaches
4. Text Generation
5. Streaming
6. Multimodal Inputs
7. Function Calling
8. System Instructions
9. Multi-turn Chat
10. Thinking Mode
11. Generation Configuration
Phase 2 - Advanced Features:
12. Context Caching
13. Code Execution
14. Grounding with Google Search
Common Reference:
15. Known Issues Prevention
16. Error Handling
17. Rate Limits
18. SDK Migration Guide
19. Production Best Practices
Quick Start
Installation
CORRECT SDK:

```bash
npm install @google/genai@1.35.0
```

❌ WRONG (DEPRECATED):

```bash
npm install @google/generative-ai  # DO NOT USE!
```

Environment Setup
```bash
export GEMINI_API_KEY="..."
```

Or create a `.env` file:

```
GEMINI_API_KEY=...
```

First Text Generation (Node.js SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
```

First Text Generation (Fetch - Cloudflare Workers)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Current Models (2025)
Gemini 3 Series (December 2025)
gemini-3-flash
- Context: 1,048,576 input tokens / 65,536 output tokens
- Status: 🆕 Generally Available (December 2025)
- Description: Google's fastest and most efficient Gemini 3 model for production workloads
- Best for: High-throughput applications, low-latency responses, cost-sensitive production
- Features: Enhanced multimodal, function calling, streaming, thinking mode
- Benchmark Performance: Matches gemini-2.5-pro quality at gemini-2.5-flash speed/cost
- Recommended for: Production use cases requiring speed + quality balance
gemini-3-pro-preview
- Context: TBD (documentation pending)
- Status: Preview release (November 18, 2025)
- Description: Google's newest and most intelligent AI model with state-of-the-art reasoning
- Best for: Most complex reasoning tasks, advanced multimodal understanding, benchmark-critical applications
- Features: Enhanced multimodal (text, image, video, audio, PDF), function calling, streaming
- Benchmark Performance: Outperforms Gemini 2.5 Pro on every major AI benchmark
- ⚠️ Preview Models Warning: Preview models have NO SLAs and can change or be deprecated with little notice. Use GA (generally available) models for production. See Issue #13
Gemini 2.5 Series (General Availability - Stable)
gemini-2.5-pro
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: State-of-the-art thinking model for complex reasoning
- Best for: Code, math, STEM, complex problem-solving
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
gemini-2.5-flash
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: Best price-performance workhorse model
- Best for: Large-scale processing, low-latency, high-volume, agentic use cases
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
gemini-2.5-flash-lite
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: Cost-optimized, fastest 2.5 model
- Best for: High throughput, cost-sensitive applications
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
Model Feature Matrix
| Feature | 3-Flash | 3-Pro (Preview) | 2.5-Pro | 2.5-Flash | 2.5-Flash-Lite |
|---|---|---|---|---|---|
| Thinking Mode | ✅ Default ON | TBD | ✅ Default ON | ✅ Default ON | ✅ Default ON |
| Function Calling | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multimodal | ✅ Enhanced | ✅ Enhanced | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ | ✅ | ✅ |
| Context Window | 1,048,576 in | TBD | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | 65,536 max | TBD | 65,536 max | 65,536 max | 65,536 max |
| Status | GA | Preview | Stable | Stable | Stable |
⚠️ Context Window Correction
ACCURATE (Gemini 2.5): Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (previous generation) had 2M token context window
GEMINI 3: Context window specifications pending official documentation
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.
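One way to avoid the 2M-token mistake in code is to validate prompt size against the documented limit before sending. A minimal sketch, with an illustrative guard function; the actual token count would come from the SDK's `countTokens` call, which you should verify against the current `@google/genai` reference:

```typescript
// Documented Gemini 2.5 input limit (1,048,576 tokens, NOT 2M).
export const GEMINI_25_INPUT_LIMIT = 1_048_576;

// Pure guard: does a prompt of `totalTokens` fit the input window,
// leaving `reserve` tokens of headroom?
export function fitsContext(
  totalTokens: number,
  reserve = 0,
  limit = GEMINI_25_INPUT_LIMIT
): boolean {
  return totalTokens + reserve <= limit;
}

// Usage sketch: obtain totalTokens via the SDK, e.g.
//   const { totalTokens } = await ai.models.countTokens({
//     model: 'gemini-2.5-flash', contents: prompt });
// then: if (!fitsContext(totalTokens)) { /* truncate or chunk */ }
```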
SDK vs Fetch Approaches
Node.js SDK (@google/genai)
Pros:
- Type-safe with TypeScript
- Easier API (simpler syntax)
- Built-in chat helpers
- Automatic SSE parsing for streaming
- Better error handling
Cons:
- Requires Node.js or compatible runtime
- Larger bundle size
- May not work in all edge runtimes
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility
Fetch-based (Direct REST API)
Pros:
- Works in any JavaScript environment (Cloudflare Workers, Deno, Bun, browsers)
- Minimal dependencies
- Smaller bundle size
- Full control over requests
Cons:
- More verbose syntax
- Manual SSE parsing for streaming
- No built-in chat helpers
- Manual error handling
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes
Text Generation
Basic Text Generation (SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about artificial intelligence'
});

console.log(response.text);
```

Basic Text Generation (Fetch)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'Write a haiku about artificial intelligence' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Response Structure
```typescript
{
  text: string,  // Convenience accessor for text content
  candidates: [
    {
      content: {
        parts: [
          { text: string }  // Generated text
        ],
        role: string  // "model"
      },
      finishReason: string,  // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
      index: number
    }
  ],
  usageMetadata: {
    promptTokenCount: number,
    candidatesTokenCount: number,
    totalTokenCount: number
  }
}
```
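When consuming the raw REST response, guard against missing candidates and non-STOP finish reasons rather than indexing blindly. A minimal sketch; the interface and helper name are illustrative, not part of the API:

```typescript
// Shape of the fields we read from a generateContent REST response.
interface GenerateContentResponse {
  candidates?: {
    content?: { parts?: { text?: string }[] };
    finishReason?: string;
  }[];
}

// Returns the generated text, or throws with a useful message.
export function extractText(data: GenerateContentResponse): string {
  const candidate = data.candidates?.[0];
  if (!candidate) throw new Error('No candidates in response (possibly blocked)');
  if (candidate.finishReason && candidate.finishReason !== 'STOP') {
    console.warn(`Generation finished with reason: ${candidate.finishReason}`);
  }
  // Join all text parts; some responses split text across parts.
  return candidate.content?.parts?.map(p => p.text ?? '').join('') ?? '';
}
```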
Streaming
Streaming with SDK (Async Iteration)
```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a 200-word story about time travel'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

Streaming with Fetch (SSE Parsing)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
    }),
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';
  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;
    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates[0]?.content?.parts[0]?.text;
      if (text) {
        process.stdout.write(text);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
```

Key Points:
- Use the streamGenerateContent endpoint (not generateContent)
- Request SSE framing with ?alt=sse (without it, the REST API streams a JSON array instead of `data:` lines)
- Parse Server-Sent Events (SSE) format: data: {json}\n\n
- Handle incomplete chunks in the buffer
- Skip empty lines and [DONE] markers
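The per-line parsing logic above can be factored into a small pure function, which makes the SSE handling easy to unit-test. A sketch; the function name is illustrative:

```typescript
// Extracts the text payload from one SSE line, or null if the line
// should be skipped (empty, [DONE], not a data line, or partial JSON).
export function textFromSseLine(line: string): string | null {
  if (line.trim() === '' || line.startsWith('data: [DONE]')) return null;
  if (!line.startsWith('data: ')) return null;
  try {
    const data = JSON.parse(line.slice(6));
    return data.candidates?.[0]?.content?.parts?.[0]?.text ?? null;
  } catch {
    return null; // Invalid or incomplete JSON: wait for more data
  }
}
```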
Multimodal Inputs
Gemini 2.5 models support text + images + video + audio + PDFs in the same request.
Images (Vision)
SDK Approach
```typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'What is in this image?' },
        {
          inlineData: {
            data: base64Image,
            mimeType: 'image/jpeg'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Fetch Approach
```typescript
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'What is in this image?' },
            {
              inlineData: {
                data: base64Image,
                mimeType: 'image/jpeg'
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Supported Image Formats:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- WebP (.webp)
- HEIC (.heic)
- HEIF (.heif)

Max Image Size: 20MB per image
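A small helper can build the `inlineData` part and enforce the format and size limits above before you pay for a failed request. A sketch; the helper name and the validation approach are illustrative:

```typescript
// Supported image MIME types and the 20MB inline limit listed above.
const IMAGE_MIME_TYPES = new Set([
  'image/jpeg', 'image/png', 'image/webp', 'image/heic', 'image/heif',
]);
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

interface InlineDataPart {
  inlineData: { data: string; mimeType: string };
}

// Builds an inlineData part from a raw buffer, validating type and size.
export function imagePart(buf: Buffer, mimeType: string): InlineDataPart {
  if (!IMAGE_MIME_TYPES.has(mimeType)) {
    throw new Error(`Unsupported image MIME type: ${mimeType}`);
  }
  if (buf.byteLength > MAX_IMAGE_BYTES) {
    throw new Error(`Image is ${buf.byteLength} bytes; inline limit is ${MAX_IMAGE_BYTES}`);
  }
  return { inlineData: { data: buf.toString('base64'), mimeType } };
}
```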
Video
```typescript
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe what happens in this video' },
        {
          inlineData: {
            data: base64Video,
            mimeType: 'video/mp4'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Supported Video Formats:
- MP4 (.mp4)
- MPEG (.mpeg)
- MOV (.mov)
- AVI (.avi)
- FLV (.flv)
- MPG (.mpg)
- WebM (.webm)
- WMV (.wmv)

Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use File API for larger files - Phase 2)
Audio
```typescript
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Transcribe and summarize this audio' },
        {
          inlineData: {
            data: base64Audio,
            mimeType: 'audio/mp3'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Supported Audio Formats:
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- AAC (.aac)
- OGG (.ogg)
- OPUS (.opus)

Max Audio Size: 20MB
PDFs
```typescript
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize the key points in this PDF' },
        {
          inlineData: {
            data: base64Pdf,
            mimeType: 'application/pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy
Multiple Inputs
You can combine multiple modalities in one request:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Compare these two images and describe the differences:' },
        { inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
        { inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});
```

Function Calling
Gemini supports function calling (tool use) to connect models with external APIs and systems.
Basic Function Calling (SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Define function declarations
const getCurrentWeather = {
  name: 'get_current_weather',
  description: 'Get the current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name, e.g. San Francisco'
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit']
      }
    },
    required: ['location']
  }
};

// Make request with tools
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather in Tokyo?',
  config: {
    tools: [
      { functionDeclarations: [getCurrentWeather] }
    ]
  }
});

// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;

if (functionCall) {
  console.log('Function to call:', functionCall.name);
  console.log('Arguments:', functionCall.args);

  // Execute the function (your implementation)
  const weatherData = await fetchWeather(functionCall.args.location);

  // Send function result back to model
  const finalResponse = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      'What\'s the weather in Tokyo?',
      response.candidates[0].content, // Original assistant response with function call
      {
        parts: [
          {
            functionResponse: {
              name: functionCall.name,
              response: weatherData
            }
          }
        ]
      }
    ],
    config: {
      tools: [
        { functionDeclarations: [getCurrentWeather] }
      ]
    }
  });

  console.log(finalResponse.text);
}
```

Function Calling (Fetch)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'What\'s the weather in Tokyo?' }] }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_current_weather',
              description: 'Get the current weather for a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'City name'
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;

if (functionCall) {
  // Execute function and send result back (same flow as SDK)
}
```

Parallel Function Calling
Gemini can call multiple independent functions simultaneously:
```typescript
const tools = [
  {
    functionDeclarations: [
      {
        name: 'get_weather',
        description: 'Get weather for a location',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      },
      {
        name: 'get_population',
        description: 'Get population of a city',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            city: { type: 'string' }
          },
          required: ['city']
        }
      }
    ]
  }
];

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather and population of Tokyo?',
  config: { tools }
});

// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
  part => part.functionCall
);

console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
```
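Once you have the list of parallel calls, you can dispatch them concurrently and build the `functionResponse` parts to send back in the next turn. A sketch; the registry and its stub implementations are hypothetical:

```typescript
type FunctionCall = { name: string; args: Record<string, unknown> };

// Hypothetical registry mapping declared function names to implementations.
const registry: Record<string, (args: Record<string, unknown>) => Promise<unknown>> = {
  get_weather: async (args) => ({ tempC: 21, location: args.location }),
  get_population: async (args) => ({ population: 37_000_000, city: args.city }),
};

// Executes all calls concurrently and returns functionResponse parts.
export async function runCalls(calls: FunctionCall[]) {
  return Promise.all(
    calls.map(async (call) => {
      const fn = registry[call.name];
      if (!fn) throw new Error(`No implementation for ${call.name}`);
      return {
        functionResponse: { name: call.name, response: await fn(call.args) },
      };
    })
  );
}
```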
Function Calling Modes
```typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // Force function call
        // mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
        // mode: FunctionCallingConfigMode.NONE, // Never call functions
        allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
      }
    }
  }
});
```

Modes:
- AUTO (default): Model decides whether to call functions
- ANY: Force model to call at least one function
- NONE: Disable function calling for this request
System Instructions
System instructions guide the model's behavior and set context. They are separate from the conversation messages.
SDK Approach
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain what a database is',
  config: {
    systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
  }
});

console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
```

Note: In @google/genai, systemInstruction is passed inside config, not at the top level of the request object.

Fetch Approach
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
        ]
      },
      contents: [
        { parts: [{ text: 'Explain what a database is' }] }
      ]
    }),
  }
);
```

Key Points:
- System instructions are NOT part of the contents array
- They are set once at the top level of the request
- They persist for the entire conversation (when using multi-turn chat)
- They don't count as user or model messages
Multi-turn Chat
For conversations with history, use the SDK's chat helpers or manually manage conversation state.
SDK Chat Helpers (Recommended)
typescript
const chat = await ai.models.createChat({
model: 'gemini-2.5-flash',
systemInstruction: 'You are a helpful coding assistant.',
history: [] // Start empty or with previous messages
});
// Send first message
const response1 = await chat.sendMessage('What is TypeScript?');
console.log('Assistant:', response1.text);
// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage('How do I install it?');
console.log('Assistant:', response2.text);
// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的编程助手。'
  },
  history: [] // 从空对话开始,或传入历史消息
});
// 发送第一条消息
const response1 = await chat.sendMessage({ message: '什么是TypeScript?' });
console.log('助手:', response1.text);
// 发送跟进消息(上下文会自动维护)
const response2 = await chat.sendMessage({ message: '如何安装它?' });
console.log('助手:', response2.text);
// 获取完整对话历史
const history = chat.getHistory();
console.log('完整对话:', history);
Manual Chat Management (Fetch)
手动管理对话(Fetch)
typescript
const conversationHistory = [];
// First turn
const response1 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
role: 'user',
parts: [{ text: 'What is TypeScript?' }]
}
]
}),
}
);
const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;
// Add to history
conversationHistory.push(
{ role: 'user', parts: [{ text: 'What is TypeScript?' }] },
{ role: 'model', parts: [{ text: assistantReply1 }] }
);
// Second turn (include full history)
const response2 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
...conversationHistory,
{ role: 'user', parts: [{ text: 'How do I install it?' }] }
]
}),
}
);
Message Roles:
- user: User messages
- model: Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.
typescript
const conversationHistory = [];
// 第一轮对话
const response1 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
role: 'user',
parts: [{ text: '什么是TypeScript?' }]
}
]
}),
}
);
const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;
// 添加到历史记录
conversationHistory.push(
{ role: 'user', parts: [{ text: '什么是TypeScript?' }] },
{ role: 'model', parts: [{ text: assistantReply1 }] }
);
// 第二轮对话(包含完整历史记录)
const response2 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
...conversationHistory,
{ role: 'user', parts: [{ text: '如何安装它?' }] }
]
}),
}
);
消息角色:
- user: 用户消息
- model: 助手响应
⚠️ 重要提示: 对话助手是SDK专属功能。使用Fetch时,必须手动管理对话历史。
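The manual bookkeeping above can be factored into two small pure helpers. A hedged sketch: the names `appendTurn` and `buildContents` are ours; only the `{ role, parts }` shapes come from the API.

```typescript
// Hypothetical helpers for fetch-based multi-turn chat. The only state is the
// history array of { role, parts } entries, exactly as sent in the requests above.
type Turn = { role: 'user' | 'model'; parts: { text: string }[] };

// Record one completed exchange: one 'user' entry plus one 'model' entry.
function appendTurn(history: Turn[], userText: string, modelText: string): Turn[] {
  return [
    ...history,
    { role: 'user', parts: [{ text: userText }] },
    { role: 'model', parts: [{ text: modelText }] },
  ];
}

// The next request's `contents` is the full history plus the new user turn.
function buildContents(history: Turn[], nextUserText: string): Turn[] {
  return [...history, { role: 'user', parts: [{ text: nextUserText }] }];
}
```

Keeping these pure makes the history easy to persist (e.g. to a database) between requests.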
Thinking Mode
思考模式
Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
Gemini 2.5模型默认开启思考模式以提升质量。您可以配置思考预算。
Configure Thinking Budget (SDK)
配置思考预算(SDK)
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex math problem: ...',
config: {
thinkingConfig: {
thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
}
}
});
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
thinkingConfig: {
thinkingBudget: 8192 // 最大思考token数(默认值取决于模型)
}
}
});
Configure Thinking Budget (Fetch)
配置思考预算(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
generationConfig: {
thinkingConfig: {
thinkingBudget: 8192
}
}
}),
}
);
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: '解决这个复杂的数学问题: ...' }] }],
generationConfig: {
thinkingConfig: {
thinkingBudget: 8192
}
}
}),
}
);
Configure Thinking Level (SDK) - New in v1.30.0
配置思考级别(SDK)- v1.30.0新增
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex problem: ...',
config: {
thinkingConfig: {
thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
}
}
});
Thinking Levels:
- LOW: Minimal internal reasoning (faster, lower quality)
- MEDIUM: Balanced reasoning (default)
- HIGH: Maximum reasoning depth (slower, higher quality)
Key Points:
- Thinking mode is always enabled on Gemini 2.5 models (cannot be disabled)
- Higher thinking budgets allow more internal reasoning (may increase latency)
- thinkingLevel (new in v1.30.0) provides simpler control than thinkingBudget
- Default budget varies by model (usually sufficient for most tasks)
- Only increase budget/level for very complex reasoning tasks
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '解决这个复杂的问题: ...',
config: {
thinkingConfig: {
thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
}
}
});
思考级别说明:
- LOW: 最小内部推理(速度快,质量较低)
- MEDIUM: 平衡的推理(默认)
- HIGH: 最大推理深度(速度慢,质量较高)
关键点:
- Gemini 2.5模型始终开启思考模式(无法关闭)
- 更高的思考预算允许更多内部推理(可能增加延迟)
- thinkingLevel(v1.30.0新增)比thinkingBudget提供更简单的控制
- 默认预算因模型而异(通常足以应对大多数任务)
- 仅在处理非常复杂的推理任务时才增加预算/级别
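Since `thinkingBudget` and `thinkingLevel` are alternatives, a small chooser can make the trade-off explicit. A sketch under the assumption that you set exactly one of the two fields; the helper name is ours, not the SDK's:

```typescript
// Hypothetical helper: produce a thinkingConfig using either the coarse
// thinkingLevel knob (v1.30.0+) or an explicit token budget, never both.
type ThinkingLevel = 'LOW' | 'MEDIUM' | 'HIGH';
type ThinkingConfig = { thinkingLevel: ThinkingLevel } | { thinkingBudget: number };

function thinkingConfigFor(opts: { level?: ThinkingLevel; budgetTokens?: number }): ThinkingConfig {
  if (opts.budgetTokens !== undefined) {
    return { thinkingBudget: opts.budgetTokens }; // fine-grained control
  }
  return { thinkingLevel: opts.level ?? 'MEDIUM' }; // MEDIUM is the default level
}
```

The result slots directly into `config: { thinkingConfig: ... }` in the calls above.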
Generation Configuration
生成配置
Customize model behavior with generation parameters.
使用生成参数自定义模型行为。
All Configuration Options (SDK)
所有配置选项(SDK)
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Write a creative story',
config: {
temperature: 0.9, // Randomness (0.0-2.0, default: 1.0)
topP: 0.95, // Nucleus sampling (0.0-1.0)
topK: 40, // Top-k sampling
maxOutputTokens: 2048, // Max tokens to generate
stopSequences: ['END'], // Stop generation if these appear
responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
candidateCount: 1 // Number of response candidates (usually 1)
}
});
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '写一个创意故事',
config: {
temperature: 0.9, // 随机性(0.0-2.0,默认值:1.0)
topP: 0.95, // 核采样(0.0-1.0)
topK: 40, // Top-k采样
maxOutputTokens: 2048, // 最大生成token数
stopSequences: ['END'], // 如果出现这些序列则停止生成
responseMimeType: 'text/plain', // 或使用'application/json'开启JSON模式
candidateCount: 1 // 响应候选数(通常为1)
}
});
All Configuration Options (Fetch)
所有配置选项(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Write a creative story' }] }],
generationConfig: {
temperature: 0.9,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
stopSequences: ['END'],
responseMimeType: 'text/plain',
candidateCount: 1
}
}),
}
);
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: '写一个创意故事' }] }],
generationConfig: {
temperature: 0.9,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
stopSequences: ['END'],
responseMimeType: 'text/plain',
candidateCount: 1
}
}),
}
);
Parameter Guidelines
参数指南
| Parameter | Range | Default | Use Case |
|---|---|---|---|
| temperature | 0.0-2.0 | 1.0 | Lower = more focused, higher = more creative |
| topP | 0.0-1.0 | 0.95 | Nucleus sampling threshold |
| topK | 1-100+ | 40 | Limit to top K tokens |
| maxOutputTokens | 1-65536 | Model max | Control response length |
| stopSequences | Array | None | Stop generation at specific strings |
Tips:
- For factual tasks: Use low temperature (0.0-0.3)
- For creative tasks: Use high temperature (0.7-1.5)
- topP and topK both control randomness; use one or the other (not both)
- Always set maxOutputTokens to prevent excessive generation
| 参数 | 范围 | 默认值 | 适用场景 |
|---|---|---|---|
| temperature | 0.0-2.0 | 1.0 | 值越低越聚焦,值越高越有创意 |
| topP | 0.0-1.0 | 0.95 | 核采样阈值 |
| topK | 1-100+ | 40 | 限制仅考虑前K个token |
| maxOutputTokens | 1-65536 | 模型最大值 | 控制响应长度 |
| stopSequences | 数组 | 无 | 当出现指定序列时停止生成 |
提示:
- 对于事实性任务: 使用低temperature(0.0-0.3)
- 对于创意任务: 使用高temperature(0.7-1.5)
- topP和topK都用于控制随机性,使用其中一个即可(不要同时使用)
- 始终设置maxOutputTokens以避免过度生成
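The tips above can be pinned down as two presets. A sketch: the values follow the ranges in the table, and the preset names are ours, not part of the API.

```typescript
// Hypothetical generationConfig presets reflecting the guidelines above.
const generationPresets = {
  factual: { temperature: 0.2, topP: 0.95, maxOutputTokens: 1024 },  // low temp: focused
  creative: { temperature: 1.1, topP: 0.95, maxOutputTokens: 2048 }, // high temp: varied
};

function generationConfigFor(task: keyof typeof generationPresets) {
  // Both presets cap maxOutputTokens, and each sets only topP (not topK).
  return generationPresets[task];
}
```

Spread the result into `config` (SDK) or `generationConfig` (fetch) as shown above.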
Context Caching
上下文缓存
Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and reduce latency.
上下文缓存允许您缓存频繁使用的内容(如系统指令、大型文档或视频文件),可降低高达90%的成本并降低延迟。
How It Works
工作原理
- Create a cache with your repeated content
- Reference the cache in subsequent requests
- Save tokens - cached tokens cost significantly less
- TTL management - caches expire after specified time
- 创建缓存:将重复使用的内容存入缓存
- 引用缓存:在后续请求中引用该缓存
- 节省token:缓存的token成本远低于普通token
- TTL管理:缓存会在指定时间后过期
Benefits
优势
- Cost savings: Up to 90% reduction on cached tokens
- Reduced latency: Faster responses by reusing processed content
- Consistent context: Same large context across multiple requests
- 成本节约:缓存的输入token比普通token便宜约90%
- 延迟降低:通过复用已处理内容提升响应速度
- 上下文一致:在多个请求中保持相同的大型上下文
Cache Creation (SDK)
创建缓存(SDK)
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'large-doc-cache', // Identifier for the cache
systemInstruction: 'You are an expert at analyzing legal documents.',
contents: documentText,
ttl: '3600s', // Cache for 1 hour
}
});
console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// 为大型文档创建缓存
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'large-doc-cache', // 缓存标识
systemInstruction: '你是一名法律文档分析专家。',
contents: documentText,
ttl: '3600s', // 缓存1小时
}
});
console.log('缓存已创建:', cache.name);
console.log('过期时间:', cache.expireTime);
Cache Creation (Fetch)
创建缓存(Fetch)
typescript
const response = await fetch(
'https://generativelanguage.googleapis.com/v1beta/cachedContents',
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
model: 'models/gemini-2.5-flash',
displayName: 'large-doc-cache',
systemInstruction: {
parts: [{ text: 'You are an expert at analyzing legal documents.' }]
},
contents: [
{ parts: [{ text: documentText }] }
],
ttl: '3600s'
}),
}
);
const cache = await response.json();
console.log('Cache created:', cache.name);
typescript
const response = await fetch(
'https://generativelanguage.googleapis.com/v1beta/cachedContents',
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
model: 'models/gemini-2.5-flash',
displayName: 'large-doc-cache',
systemInstruction: {
parts: [{ text: '你是一名法律文档分析专家。' }]
},
contents: [
{ parts: [{ text: documentText }] }
],
ttl: '3600s'
}),
}
);
const cache = await response.json();
console.log('缓存已创建:', cache.name);
Using a Cache (SDK)
使用缓存(SDK)
typescript
// Generate content using the cache
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Summarize the key points in the document',
  config: {
    cachedContent: cache.name // Reference the cache by its resource name
  }
});
console.log(response.text);
typescript
// 使用缓存生成内容
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '总结文档的关键点',
  config: {
    cachedContent: cache.name // 通过缓存资源名称引用缓存
  }
});
console.log(response.text);
Using a Cache (Fetch)
使用缓存(Fetch)
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // e.g. "cachedContents/..."
      contents: [
        { parts: [{ text: 'Summarize the key points in the document' }] }
      ]
    }),
  }
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // 例如:"cachedContents/..."
      contents: [
        { parts: [{ text: '总结文档的关键点' }] }
      ]
    }),
  }
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
Update Cache TTL (SDK)
更新缓存TTL(SDK)
typescript
import { UpdateCachedContentConfig } from '@google/genai';
await ai.caches.update({
name: cache.name,
config: {
ttl: '7200s' // Extend to 2 hours
}
});
typescript
import { UpdateCachedContentConfig } from '@google/genai';
await ai.caches.update({
name: cache.name,
config: {
ttl: '7200s' // 延长至2小时
}
});
Update Cache with Expiration Time (SDK)
使用过期时间更新缓存(SDK)
typescript
// Set a specific expiration time (RFC 3339 timestamp)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);
await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});
typescript
// 设置具体的过期时间(RFC 3339时间戳)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);
await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});
List and Delete Caches (SDK)
列出和删除缓存(SDK)
typescript
// List all caches
const caches = await ai.caches.list();
for (const cache of caches) {
console.log(cache.name, cache.displayName);
}
// Delete a specific cache
await ai.caches.delete({ name: cache.name });
typescript
// 列出所有缓存
const caches = await ai.caches.list();
for (const cache of caches) {
console.log(cache.name, cache.displayName);
}
// 删除指定缓存
await ai.caches.delete({ name: cache.name });
Caching with Video Files
视频文件缓存
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Upload video file
let videoFile = await ai.files.upload({
  file: fs.createReadStream('./video.mp4')
});
// Wait for processing (the file object must be re-fetched, so use let, not const)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}
// Create cache with video
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'video-analysis-cache',
systemInstruction: 'You are an expert video analyzer.',
contents: [videoFile],
ttl: '300s' // 5 minutes
}
});
// Use cache for multiple queries
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What happens in the first minute?',
  config: { cachedContent: cache.name }
});
const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Describe the main characters',
  config: { cachedContent: cache.name }
});
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// 上传视频文件
let videoFile = await ai.files.upload({
  file: fs.createReadStream('./video.mp4')
});
// 等待处理完成(文件对象需要重新获取,因此使用let而非const)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}
// 创建包含视频的缓存
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'video-analysis-cache',
systemInstruction: '你是一名专业的视频分析师。',
contents: [videoFile],
ttl: '300s' // 缓存5分钟
}
});
// 使用缓存进行多次查询
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '视频第一分钟发生了什么?',
  config: { cachedContent: cache.name }
});
const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '描述主要角色',
  config: { cachedContent: cache.name }
});
Key Points
关键点
When to Use Caching:
- Large system instructions used repeatedly
- Long documents analyzed multiple times
- Video/audio files queried with different prompts
- Consistent context across conversation sessions
TTL Guidelines:
- Short sessions: 300s (5 min) to 3600s (1 hour)
- Long sessions: 3600s (1 hour) to 86400s (24 hours)
- Maximum: 7 days
Cost Savings:
- Cached input tokens: ~90% cheaper than regular tokens
- Output tokens: Same price (not cached)
Important:
- You must use explicit model version suffixes (e.g., gemini-2.5-flash-001, NOT just gemini-2.5-flash)
- Caches are automatically deleted after TTL expires
- Update TTL before expiration to extend cache lifetime
何时使用缓存:
- 重复使用的大型系统指令
- 需要多次分析的长文档
- 需要用不同查询提问的视频/音频文件
- 跨对话会话的一致上下文
TTL指南:
- 短会话:300s(5分钟)至3600s(1小时)
- 长会话:3600s(1小时)至86400s(24小时)
- 最大值:7天
成本节约:
- 缓存的输入token:比普通token便宜约90%
- 输出token:价格不变(不缓存)
重要提示:
- 必须使用明确的模型版本后缀(例如:gemini-2.5-flash-001,不能仅使用gemini-2.5-flash)
- 缓存会在TTL过期后自动删除
- 在过期前更新TTL以延长缓存生命周期
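Because caches are deleted once their TTL elapses, it helps to check remaining lifetime before reusing one. A minimal sketch; the helper name and margin are ours, while `expireTime` is the RFC 3339 timestamp returned at cache creation:

```typescript
// Hypothetical helper: true if the cache expires within `marginSeconds`,
// i.e. its TTL should be extended (or the cache recreated) before use.
function needsRefresh(expireTime: string, marginSeconds: number, now: Date = new Date()): boolean {
  const msLeft = new Date(expireTime).getTime() - now.getTime();
  return msLeft < marginSeconds * 1000;
}
```

Call it before each request against a long-lived cache; if it returns true, extend the TTL via ai.caches.update as shown above.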
Code Execution
代码执行
Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Gemini模型可以生成并执行Python代码,解决需要计算、数据分析或可视化的问题。
How It Works
工作原理
- Model generates executable Python code
- Code runs in secure sandbox
- Results are returned to the model
- Model incorporates results into response
- 模型生成可执行的Python代码
- 代码在安全沙箱中运行
- 结果返回给模型
- 模型将结果整合到响应中
Supported Operations
支持的操作
- Mathematical calculations
- Data analysis and statistics
- File processing (CSV, JSON, etc.)
- Chart and graph generation
- Algorithm implementation
- Data transformations
- 数学计算
- 数据分析和统计
- 文件处理(CSV、JSON等)
- 图表和图形生成
- 算法实现
- 数据转换
Available Python Packages
可用的Python包
Standard Library:
- math, statistics, random, datetime, json, csv, re
- collections, itertools, functools
Data Science:
- numpy, pandas, scipy
Visualization:
- matplotlib, seaborn
Note: Limited package availability compared to a full Python environment
标准库:
- math, statistics, random, datetime, json, csv, re
- collections, itertools, functools
数据科学:
- numpy, pandas, scipy
可视化:
- matplotlib, seaborn
注意: 与完整Python环境相比,可用包有限
Basic Code Execution (SDK)
基础代码执行(SDK)
typescript
import { GoogleGenAI, Tool, ToolCodeExecution } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Parse response parts
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Generated Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Execution Output:', part.codeExecutionResult.output);
}
}
typescript
import { GoogleGenAI, Tool, ToolCodeExecution } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '前50个质数的和是多少?生成并运行计算代码。',
config: {
tools: [{ codeExecution: {} }]
}
});
// 解析响应部分
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log('文本:', part.text);
}
if (part.executableCode) {
console.log('生成的代码:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('执行输出:', part.codeExecutionResult.output);
}
}
Basic Code Execution (Fetch)
基础代码执行(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
tools: [{ code_execution: {} }],
contents: [
{
parts: [
{ text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
]
}
]
}),
}
);
const data = await response.json();
for (const part of data.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Result:', part.codeExecutionResult.output);
}
}
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
tools: [{ code_execution: {} }],
contents: [
{
parts: [
{ text: '前50个质数的和是多少?生成并运行计算代码。' }
]
}
]
}),
}
);
const data = await response.json();
for (const part of data.candidates[0].content.parts) {
if (part.text) {
console.log('文本:', part.text);
}
if (part.executableCode) {
console.log('代码:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('结果:', part.codeExecutionResult.output);
}
}
Chat with Code Execution (SDK)
带代码执行的对话(SDK)
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ codeExecution: {} }]
}
});
let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);
response = await chat.sendMessage({
  message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});
// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ codeExecution: {} }]
}
});
let response = await chat.sendMessage({ message: '我有一个数学问题想请教你。' });
console.log(response.text);
response = await chat.sendMessage({
  message: '计算斐波那契数列的前20项并求和。'
});
// 模型会生成并执行代码,然后给出答案
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('代码:', part.executableCode.code);
if (part.codeExecutionResult) console.log('输出:', part.codeExecutionResult.output);
}
Data Analysis Example
数据分析示例
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `
Analyze this sales data and calculate:
1. Total revenue
2. Average sale price
3. Best-selling month
Data (CSV format):
month,sales,revenue
Jan,150,45000
Feb,200,62000
Mar,175,53000
Apr,220,68000
`,
config: {
tools: [{ codeExecution: {} }]
}
});
// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `
分析这份销售数据并计算:
1. 总营收
2. 平均售价
3. 最畅销的月份
数据(CSV格式):
month,sales,revenue
Jan,150,45000
Feb,200,62000
Mar,175,53000
Apr,220,68000
`,
config: {
tools: [{ codeExecution: {} }]
}
});
// 模型会生成pandas/numpy代码来分析数据
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('分析代码:', part.executableCode.code);
if (part.codeExecutionResult) console.log('结果:', part.codeExecutionResult.output);
}
Visualization Example
可视化示例
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
if (part.codeExecutionResult) {
// Note: Chart image data would be in output
console.log('Execution completed');
}
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '创建一个柱状图,展示100以内质数的末位数字分布。生成图表并描述模式。',
config: {
tools: [{ codeExecution: {} }]
}
});
// 模型生成matplotlib代码,执行后描述结果
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('图表代码:', part.executableCode.code);
if (part.codeExecutionResult) {
// 注意:图表图片数据会在输出中
console.log('执行完成');
}
}
Response Structure
响应结构
typescript
{
candidates: [
{
content: {
parts: [
{ text: "I'll calculate that for you." },
{
executableCode: {
language: "PYTHON",
code: "def is_prime(n):\n if n <= 1:\n return False\n ..."
}
},
{
codeExecutionResult: {
outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
output: "5117\n"
}
},
{ text: "The sum of the first 50 prime numbers is 5117." }
]
}
}
]
}
typescript
{
candidates: [
{
content: {
parts: [
{ text: "我来帮你计算。" },
{
executableCode: {
language: "PYTHON",
code: "def is_prime(n):\n if n <= 1:\n return False\n ..."
}
},
{
codeExecutionResult: {
outcome: "OUTCOME_OK", // 或"OUTCOME_FAILED"
output: "5117\n"
}
},
{ text: "前50个质数的和是5117。" }
]
}
}
]
}
Error Handling
错误处理
typescript
for (const part of response.candidates[0].content.parts) {
if (part.codeExecutionResult) {
if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
console.error('Code execution failed:', part.codeExecutionResult.output);
} else {
console.log('Success:', part.codeExecutionResult.output);
}
}
}
typescript
for (const part of response.candidates[0].content.parts) {
if (part.codeExecutionResult) {
if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
console.error('代码执行失败:', part.codeExecutionResult.output);
} else {
console.log('成功:', part.codeExecutionResult.output);
}
}
}
Key Points
关键点
When to Use Code Execution:
- Complex mathematical calculations
- Data analysis and statistics
- Algorithm implementations
- File parsing and processing
- Chart generation
- Computational problems
Limitations:
- Sandbox environment (limited file system access)
- Limited Python package availability
- Execution timeout limits
- No network access from code
- No persistent state between executions
Best Practices:
- Specify what calculation or analysis you need clearly
- Request code generation explicitly ("Generate and run code...")
- Check the outcome field for errors
- Use for deterministic computations, not for general programming
Important:
- Available on all Gemini 2.5 models (Pro, Flash, Flash-Lite)
- Code runs in isolated sandbox for security
- Supports Python with standard library and common data science packages
何时使用代码执行:
- 复杂数学计算
- 数据分析和统计
- 算法实现
- 文件解析和处理
- 图表生成
- 计算类问题
限制:
- 沙箱环境(文件系统访问受限)
- Python包可用性有限
- 执行超时限制
- 代码无法访问网络
- 执行之间无持久化状态
最佳实践:
- 明确说明您需要的计算或分析
- 明确要求生成代码("生成并运行代码...")
- 检查outcome字段是否有错误
- 用于确定性计算,而非通用编程
重要提示:
- 所有Gemini 2.5模型(Pro、Flash、Flash-Lite)均支持
- 代码在隔离沙箱中运行,保障安全
- 支持Python标准库和常见数据科学包
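Every example above walks the same three part kinds, so the parsing can be centralized. A sketch that collapses a parts array into one summary object; the shapes mirror the Response Structure section, and the function name is ours:

```typescript
// Hypothetical part shapes, as returned in candidates[0].content.parts.
interface ExecPart {
  text?: string;
  executableCode?: { language: string; code: string };
  codeExecutionResult?: { outcome: 'OUTCOME_OK' | 'OUTCOME_FAILED'; output: string };
}

// Collapse the parts of a code-execution response into one summary object.
function summarizeParts(parts: ExecPart[]) {
  const texts: string[] = [];
  let code = '';
  let output = '';
  let ok = true;
  for (const p of parts) {
    if (p.text) texts.push(p.text);
    if (p.executableCode) code = p.executableCode.code;
    if (p.codeExecutionResult) {
      output = p.codeExecutionResult.output;
      ok = p.codeExecutionResult.outcome === 'OUTCOME_OK';
    }
  }
  return { text: texts.join('\n'), code, output, ok };
}
```

The same walker works for both the SDK and fetch responses, since the part shapes are identical.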
Grounding with Google Search
基于Google搜索的事实校验
Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.
事实校验将模型与实时网络信息连接,减少幻觉并提供最新、经过事实核查的响应,同时附带引用。
How It Works
工作原理
- Model determines if it needs current information
- Automatically performs Google Search
- Processes search results
- Incorporates findings into response
- Provides citations and source URLs
- 模型判断是否需要当前信息
- 自动执行Google搜索
- 处理搜索结果
- 将发现整合到响应中
- 提供引用和来源URL
Benefits
优势
- Real-time information: Access to current events and data
- Reduced hallucinations: Answers grounded in web sources
- Verifiable: Citations allow fact-checking
- Up-to-date: Not limited to model's training cutoff
- 实时信息: 访问当前事件和数据
- 减少幻觉: 答案基于网络来源
- 可验证: 引用允许事实核查
- 内容更新: 不受模型训练截止日期限制
Grounding Options
事实校验选项
1. Google Search (googleSearch) - Recommended for Gemini 2.5
1. Google搜索(googleSearch)- Gemini 2.5推荐
typescript
const groundingTool = {
googleSearch: {}
};
Features:
- Simple configuration
- Automatic search when needed
- Available on all Gemini 2.5 models
typescript
const groundingTool = {
googleSearch: {}
};
特性:
- 配置简单
- 需要时自动搜索
- 所有Gemini 2.5模型均支持
2. FileSearch - New in v1.29.0 (Preview)
2. 文件搜索 - v1.29.0新增(预览版)
typescript
const fileSearchTool = {
fileSearch: {
fileSearchStoreId: 'store-id-here' // Created via FileSearchStore APIs
}
};
Features:
- Search through your own document collections
- Upload and index custom knowledge bases
- Alternative to web search for proprietary data
- Preview feature (requires FileSearchStore setup)
Note: See FileSearch documentation for store creation and management.
typescript
const fileSearchTool = {
fileSearch: {
fileSearchStoreId: 'store-id-here' // 通过FileSearchStore API创建
}
};
特性:
- 搜索您自己的文档集合
- 上传并索引自定义知识库
- 专有数据的网络搜索替代方案
- 预览功能(需要设置FileSearchStore)
注意: 请查看FileSearch文档了解存储创建和管理方法。
3. Google Search Retrieval (googleSearchRetrieval) - Legacy (Gemini 1.5)
3. Google搜索检索(googleSearchRetrieval)- 旧版(Gemini 1.5)
typescript
const retrievalTool = {
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: 'MODE_DYNAMIC',
dynamicThreshold: 0.7 // Only search if confidence < 70%
}
}
};
Features:
- Dynamic threshold control
- Used with Gemini 1.5 models
- More configuration options
typescript
const retrievalTool = {
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: 'MODE_DYNAMIC',
dynamicThreshold: 0.7 // 仅当置信度<70%时搜索
}
}
};
特性:
- 动态阈值控制
- 用于Gemini 1.5模型
- 更多配置选项
Basic Grounding (SDK) - Gemini 2.5
基础事实校验(SDK)- Gemini 2.5
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [{ googleSearch: {} }]
}
});
console.log(response.text);
// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
console.log('Search was performed!');
console.log('Sources:', response.candidates[0].groundingMetadata);
}
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2024年欧洲杯冠军是谁?',
config: {
tools: [{ googleSearch: {} }]
}
});
console.log(response.text);
// 检查是否使用了事实校验
if (response.candidates[0].groundingMetadata) {
console.log('已执行搜索!');
console.log('来源:', response.candidates[0].groundingMetadata);
}
Basic Grounding (Fetch) - Gemini 2.5
基础事实校验(Fetch)- Gemini 2.5
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: 'Who won the euro 2024?' }] }
],
tools: [
{ google_search: {} }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
if (data.candidates[0].groundingMetadata) {
console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: '2024年欧洲杯冠军是谁?' }] }
],
tools: [
{ google_search: {} }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
if (data.candidates[0].groundingMetadata) {
console.log('事实校验元数据:', data.candidates[0].groundingMetadata);
}
Dynamic Retrieval (SDK) - Gemini 1.5
动态检索(SDK)- Gemini 1.5
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-1.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [
{
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
dynamicThreshold: 0.7 // Search only if confidence < 70%
}
}
}
]
}
});
console.log(response.text);
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (high confidence)');
}
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-1.5-flash',
contents: '2024年欧洲杯冠军是谁?',
config: {
tools: [
{
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
dynamicThreshold: 0.7 // 仅当置信度<70%时搜索
}
}
}
]
}
});
console.log(response.text);
if (!response.candidates[0].groundingMetadata) {
console.log('模型使用自身知识回答(高置信度)');
}
Grounding Metadata Structure
事实校验元数据结构
typescript
{
groundingMetadata: {
searchQueries: [
{ text: "euro 2024 winner" }
],
webPages: [
{
url: "https://example.com/euro-2024-results",
title: "UEFA Euro 2024 Final Results",
snippet: "Spain won UEFA Euro 2024..."
}
],
citations: [
{
startIndex: 42,
endIndex: 47,
uri: "https://example.com/euro-2024-results"
}
],
retrievalQueries: [
{
query: "who won euro 2024 final"
}
]
}
}
typescript
{
groundingMetadata: {
searchQueries: [
{ text: "euro 2024 winner" }
],
webPages: [
{
url: "https://example.com/euro-2024-results",
title: "UEFA Euro 2024 Final Results",
snippet: "Spain won UEFA Euro 2024..."
}
],
citations: [
{
startIndex: 42,
endIndex: 47,
uri: "https://example.com/euro-2024-results"
}
],
retrievalQueries: [
{
query: "who won euro 2024 final"
}
]
}
}
Chat with Grounding (SDK)
带事实校验的对话(SDK)
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ googleSearch: {} }]
}
});
let response = await chat.sendMessage('What are the latest developments in quantum computing?');
console.log(response.text);
// Check grounding sources
if (response.candidates[0].groundingMetadata) {
const sources = response.candidates[0].groundingMetadata.webPages || [];
console.log(`Sources used: ${sources.length}`);
sources.forEach(source => {
console.log(`- ${source.title}: ${source.url}`);
});
}
// Follow-up still has grounding enabled
response = await chat.sendMessage('Which company made the biggest breakthrough?');
console.log(response.text);
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ googleSearch: {} }]
}
});
let response = await chat.sendMessage('量子计算的最新进展是什么?');
console.log(response.text);
// 检查事实校验来源
if (response.candidates[0].groundingMetadata) {
const sources = response.candidates[0].groundingMetadata.webPages || [];
console.log(`使用的来源: ${sources.length}`);
sources.forEach(source => {
console.log(`- ${source.title}: ${source.url}`);
});
}
// 跟进消息仍会启用事实校验
response = await chat.sendMessage('哪家公司取得了最大突破?');
console.log(response.text);
Combining Grounding with Function Calling
事实校验与函数调用结合
typescript
const weatherFunction = {
name: 'get_current_weather',
description: 'Get current weather for a location',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' }
},
required: ['location']
}
};
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the weather like in the city that won Euro 2024?',
config: {
tools: [
{ googleSearch: {} },
{ functionDeclarations: [weatherFunction] }
]
}
});
// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in the response
typescript
const weatherFunction = {
name: 'get_current_weather',
description: '获取指定地点的当前天气',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string', description: '城市名称' }
},
required: ['location']
}
};
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2024年欧洲杯冠军城市的天气怎么样?',
config: {
tools: [
{ googleSearch: {} },
{ functionDeclarations: [weatherFunction] }
]
}
});
// 模型会:
// 1. 使用Google搜索找到2024年欧洲杯冠军
// 2. 调用get_current_weather函数获取该城市的天气
// 3. 将两个结果整合到响应中
Checking if Grounding was Used
检查是否使用了事实校验
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is 2+2?', // Model knows this without search
config: {
tools: [{ googleSearch: {} }]
}
});
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (no search needed)');
} else {
console.log('Search was performed');
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2+2等于多少?', // 模型无需搜索即可回答
config: {
tools: [{ googleSearch: {} }]
}
});
if (!response.candidates[0].groundingMetadata) {
console.log('模型使用自身知识回答(无需搜索)');
} else {
console.log('已执行搜索');
}
Key Points
关键点
When to Use Grounding:
- Current events and news
- Real-time data (stock prices, sports scores, weather)
- Fact-checking and verification
- Questions about recent developments
- Information beyond model's training cutoff
When NOT to Use:
- General knowledge questions
- Mathematical calculations
- Code generation
- Creative writing
- Tasks requiring internal reasoning only
Cost Considerations:
- Grounding adds latency (search takes time)
- Additional token costs for retrieved content
- Use dynamicThreshold to control when searches happen (Gemini 1.5)
Important Notes:
- Grounding requires Google Cloud project (not just API key)
- Search results quality depends on query phrasing
- Citations may not cover all facts in response
- Search is performed automatically based on confidence
Gemini 2.5 vs 1.5:
- Gemini 2.5: Use googleSearch (simple, recommended)
- Gemini 1.5: Use googleSearchRetrieval with dynamicThreshold
Best Practices:
- Always check groundingMetadata to see if search was used
- Display citations to users for transparency
- Use specific, well-phrased questions for better search results
- Combine with function calling for hybrid workflows
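The best practices above suggest displaying citations to users. A minimal sketch of such a display helper follows; the webPages, title, and url field names are assumptions taken from the metadata structure shown earlier, so verify them against the actual response shape in your SDK version:

```typescript
// Hypothetical types mirroring the groundingMetadata structure shown above.
interface WebPage { url: string; title: string; snippet?: string; }
interface GroundingMetadata { webPages?: WebPage[]; }

// Render a numbered source list for display alongside the model's answer.
function formatSources(metadata?: GroundingMetadata): string {
  const pages = metadata?.webPages ?? [];
  if (pages.length === 0) {
    return 'No web sources (answered from model knowledge).';
  }
  return pages
    .map((p, i) => `[${i + 1}] ${p.title} - ${p.url}`)
    .join('\n');
}

// Example: pass response.candidates[0].groundingMetadata here.
console.log(formatSources({
  webPages: [{ url: 'https://example.com', title: 'Example' }]
}));
```

Because the helper is a pure function over the metadata object, it also covers the ungrounded case (no webPages) without extra branching at the call site.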
何时使用事实校验:
- 当前事件和新闻
- 实时数据(股票价格、体育比分、天气)
- 事实核查和验证
- 关于最新进展的问题
- 超出模型训练截止日期的信息
何时不使用:
- 常识性问题
- 数学计算
- 代码生成
- 创意写作
- 仅需内部推理的任务
成本考虑:
- 事实校验会增加延迟(搜索需要时间)
- 检索内容会产生额外token成本
- 使用dynamicThreshold控制搜索时机(Gemini 1.5)
重要提示:
- 事实校验需要Google Cloud项目(不仅仅是API密钥)
- 搜索结果质量取决于查询措辞
- 引用可能无法覆盖响应中的所有事实
- 搜索会根据置信度自动执行
Gemini 2.5 vs 1.5:
- Gemini 2.5: 使用googleSearch(简单,推荐)
- Gemini 1.5: 使用googleSearchRetrieval并设置dynamicThreshold
最佳实践:
- 始终检查groundingMetadata以确认是否执行了搜索
- 向用户展示引用以保证透明度
- 使用具体、措辞清晰的问题以获得更好的搜索结果
- 与函数调用结合实现混合工作流
Known Issues Prevention
已知问题预防
This skill prevents 14 documented issues:
本指南可预防14个已记录的问题:
Issue #1: Multi-byte Character Corruption in Streaming
问题#1: 流式输出中的多字节字符损坏
Error: Garbled text or � symbols when streaming responses with non-English text
Source: GitHub Issue #764
Why It Happens: The TextDecoder converts chunks to strings without the {stream: true} option. Multi-byte UTF-8 characters (Chinese, Japanese, Korean, emoji) split across chunks create invalid strings.
Prevention:
typescript
// The SDK already fixes this, but if implementing custom streaming:
const decoder = new TextDecoder();
const { value } = await reader.read();
const text = decoder.decode(value, { stream: true }); // ← stream: true required
Affected: All non-English languages using multi-byte characters
Status: Fixed in SDK, but documented for custom implementations
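The failure mode is easy to reproduce without any network code. This self-contained sketch splits a multi-byte character across two chunks, the way a real stream boundary can:

```typescript
// '你好' is 6 bytes in UTF-8 (3 per character); split mid-character,
// as can happen at any chunk boundary in a streamed response.
const bytes = new TextEncoder().encode('你好');
const chunk1 = bytes.slice(0, 2); // incomplete first character
const chunk2 = bytes.slice(2);

// ✅ stream: true buffers the incomplete byte sequence until the next chunk
const streaming = new TextDecoder();
const good =
  streaming.decode(chunk1, { stream: true }) +
  streaming.decode(chunk2, { stream: true });

// ❌ without stream: true, each call flushes and emits � replacement chars
const naive =
  new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2);

console.log(good);  // 你好
console.log(naive); // garbled output containing � characters
```

The same two-line difference is all that separates a correct custom streaming loop from one that corrupts every non-ASCII response.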
错误: 流式输出非英文文本时出现乱码或�符号
来源: GitHub Issue #764
原因: TextDecoder转换块为字符串时未使用{stream: true}选项。多字节UTF-8字符(中文、日文、韩文、表情符号)被拆分到不同块中,导致无效字符串。
预防措施:
typescript
// SDK已修复此问题,但如果是自定义流式实现:
const decoder = new TextDecoder();
const { value } = await reader.read();
const text = decoder.decode(value, { stream: true }); // ← 必须设置stream: true
影响范围: 所有使用多字节字符的非英语语言
状态: SDK中已修复,为自定义实现提供文档说明
Issue #2: Safety Settings Method Parameter Not Supported
问题#2: 安全设置的method参数不被支持
Error: "method parameter is not supported in Gemini API"
Source: GitHub Issue #810
Why It Happens: The method parameter in safetySettings only works with the Vertex AI Gemini API, not the Gemini Developer API or Google AI Studio. The SDK allows passing it without validation.
Prevention:
typescript
// ❌ WRONG - Fails with Gemini Developer API:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
method: HarmBlockMethod.SEVERITY // Not supported!
}]
}
// ✅ CORRECT - Omit 'method' for Gemini Developer API:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
// No 'method' field
}]
}
Affected: Gemini Developer API and Google AI Studio users
Status: Known limitation, use Vertex AI if you need the method parameter
错误: "method parameter is not supported in Gemini API"
来源: GitHub Issue #810
原因: safetySettings中的method参数仅适用于Vertex AI Gemini API,不适用于Gemini开发者API或Google AI Studio。SDK允许传递该参数但未进行验证。
预防措施:
typescript
// ❌ 错误 - 使用Gemini开发者API会失败:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
method: HarmBlockMethod.SEVERITY // 不被支持!
}]
}
// ✅ 正确 - 针对Gemini开发者API省略'method':
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
// 无'method'字段
}]
}
影响范围: Gemini开发者API和Google AI Studio用户
状态: 已知限制,如果需要method参数请使用Vertex AI
Issue #3: Safety Settings Have Model-Specific Thresholds
问题#3: 安全设置具有模型特定阈值
Error: Content passes through despite strict safety settings, or safetyRatings shows NEGLIGIBLE with empty output
Source: GitHub Issue #872
Why It Happens: Different models have different blocking thresholds. gemini-2.5-flash blocks more strictly than gemini-2.0-flash. Additionally, promptFeedback only appears when INPUT is blocked; if the model generates a refusal message, safetyRatings may show NEGLIGIBLE.
Prevention:
typescript
// Check BOTH promptFeedback AND empty response:
if (response.candidates[0].finishReason === 'SAFETY' ||
!response.text || response.text.trim() === '') {
console.log('Content blocked or refused');
}
// Be aware: Different models have different thresholds
// gemini-2.5-flash: Lower threshold (stricter blocking)
// gemini-2.0-flash: Higher threshold (more permissive)
Affected: All models when using safety settings
Status: Known behavior, model-specific thresholds are by design
错误: 尽管设置了严格的安全设置,内容仍通过;或safetyRatings显示NEGLIGIBLE但输出为空
来源: GitHub Issue #872
原因: 不同模型具有不同的拦截阈值。gemini-2.5-flash比gemini-2.0-flash拦截更严格。此外,promptFeedback只有当输入被拦截时才会出现;如果模型生成拒绝消息,safetyRatings可能显示NEGLIGIBLE。
预防措施:
typescript
// 同时检查promptFeedback和空响应:
if (response.candidates[0].finishReason === 'SAFETY' ||
!response.text || response.text.trim() === '') {
console.log('内容被拦截或拒绝');
}
// 注意: 不同模型具有不同阈值
// gemini-2.5-flash: 阈值更低(拦截更严格)
// gemini-2.0-flash: 阈值更高(更宽松)
影响范围: 使用安全设置的所有模型
状态: 已知行为,模型特定阈值为设计如此
Issue #4: FunctionCallingConfigMode.ANY Causes Infinite Loop
问题#4: FunctionCallingConfigMode.ANY导致无限循环
Error: Model loops forever calling tools, never returns text response
Source: GitHub Issue #908
Why It Happens: When FunctionCallingConfigMode.ANY is set with automatic function calling (CallableTool), the model is forced to call at least one tool on every turn and physically cannot stop, looping until the max invocations limit.
Prevention:
typescript
// ❌ WRONG - Loops forever:
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.ANY // Forces tool calls forever
}
}
}
// ✅ CORRECT - Use AUTO mode (model decides):
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.AUTO // Model can choose to answer directly
}
}
}
// Or use manual function calling (check for functionCall, execute, send back)
Affected: Automatic function calling with CallableTool
Status: Known limitation, use AUTO mode or manual function calling
错误: 模型无限循环调用工具,从不返回文本响应
来源: GitHub Issue #908
原因: 当使用自动函数调用(CallableTool)并设置FunctionCallingConfigMode.ANY时,模型被强制在每一轮至少调用一个工具,无法停止,直到达到最大调用次数限制。
预防措施:
typescript
// ❌ 错误 - 无限循环:
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.ANY // 强制永远调用工具
}
}
}
// ✅ 正确 - 使用AUTO模式(模型自主决定):
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.AUTO // 模型可以选择直接回答
}
}
}
// 或使用手动函数调用(检查functionCall,执行后返回结果)
影响范围: 使用CallableTool的自动函数调用
状态: 已知限制,使用AUTO模式或手动函数调用
Issue #5: Structured Output Doesn't Preserve Escaped Backslashes (Gemini 3)
问题#5: 结构化输出无法保留转义反斜杠(Gemini 3)
Error: JSON.parse fails on structured output, or keys with backslashes are incorrect
Source: GitHub Issue #1226
Why It Happens: When using responseMimeType: "application/json" with schema keys containing escaped backslashes (e.g., \\a for key \a), the model output doesn't preserve JSON escaping. It emits a single backslash, causing invalid JSON.
Prevention:
typescript
// Avoid using backslashes in JSON schema keys
// Or manually post-process if required:
let jsonText = response.text;
// Add custom escaping logic if needed
Affected: Gemini 3 models with structured output using backslashes in keys
Status: Known issue, workaround required
错误: JSON.parse解析结构化输出时失败,或包含反斜杠的键不正确
来源: GitHub Issue #1226
原因: 当使用responseMimeType: "application/json"且模式键包含转义反斜杠(例如:\\a表示键\a)时,模型输出无法保留JSON转义,会输出单个反斜杠,导致无效JSON。
预防措施:
typescript
// 避免在JSON模式键中使用反斜杠
// 或根据需要手动后处理:
let jsonText = response.text;
// 添加自定义转义逻辑(如果需要)
影响范围: 使用包含反斜杠键的结构化输出的Gemini 3模型
状态: 已知问题,需要使用解决方法
Issue #6: Large PDFs from S3 Signed URLs Fail with "Document has no pages"
问题#6: 来自S3签名URL的大型PDF失败,提示"Document has no pages"
Error: ApiError: {"error":{"code":400,"message":"The document has no pages.","status":"INVALID_ARGUMENT"}}
Source: GitHub Issue #1259
Why It Happens: Larger PDFs (e.g., 20MB) from AWS S3 signed URLs fail when passed via fileData.fileUri. The API cannot fetch or process the PDF from signed URLs.
Prevention:
typescript
// ❌ WRONG - Fails with large PDFs from S3:
contents: [{
parts: [{
fileData: {
fileUri: 'https://bucket.s3.region.amazonaws.com/file.pdf?X-Amz-Algorithm=...'
}
}]
}]
// ✅ CORRECT - Fetch and encode to base64:
const pdfResponse = await fetch(signedUrl);
const pdfBuffer = await pdfResponse.arrayBuffer();
const base64Pdf = Buffer.from(pdfBuffer).toString('base64');
contents: [{
parts: [{
inlineData: {
data: base64Pdf,
mimeType: 'application/pdf'
}
}]
}]
Affected: PDF files from external signed URLs
Status: Known limitation, use base64 inline data instead
错误: ApiError: {"error":{"code":400,"message":"The document has no pages.","status":"INVALID_ARGUMENT"}}
来源: GitHub Issue #1259
原因: 来自AWS S3签名URL的大型PDF(例如20MB)通过fileData.fileUri传递时失败。API无法从签名URL获取或处理PDF。
预防措施:
typescript
// ❌ 错误 - 来自S3的大型PDF会失败:
contents: [{
parts: [{
fileData: {
fileUri: 'https://bucket.s3.region.amazonaws.com/file.pdf?X-Amz-Algorithm=...'
}
}]
}]
// ✅ 正确 - 获取并编码为base64:
const pdfResponse = await fetch(signedUrl);
const pdfBuffer = await pdfResponse.arrayBuffer();
const base64Pdf = Buffer.from(pdfBuffer).toString('base64');
contents: [{
parts: [{
inlineData: {
data: base64Pdf,
mimeType: 'application/pdf'
}
}]
}]
影响范围: 来自外部签名URL的PDF文件
状态: 已知限制,使用base64内联数据替代
Issue #7: 404 NOT_FOUND with Uploaded Video on Gemini 3 Models
问题#7: 在Gemini 3模型上使用上传的视频返回404 NOT_FOUND
Error: 404 NOT_FOUND when using uploaded video files with Gemini 3 models
Source: GitHub Issue #1220
Why It Happens: Some Gemini 3 models (gemini-3-flash-preview, gemini-3-pro-preview) are not available in the free tier or have limited access even with paid accounts. Video file uploads fail with 404.
Prevention:
typescript
// ❌ WRONG - 404 error with Gemini 3:
const response = await ai.models.generateContent({
model: 'gemini-3-pro-preview', // 404 error
contents: [{
parts: [
{ text: 'Describe this video' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
// ✅ CORRECT - Use Gemini 2.5 for video understanding:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // Works
contents: [{
parts: [
{ text: 'Describe this video' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
Affected: Gemini 3 preview models with video uploads
Status: Known limitation, use Gemini 2.5 models for video
错误: 在Gemini 3模型上使用上传的视频时返回404 NOT_FOUND
来源: GitHub Issue #1220
原因: 部分Gemini 3模型(gemini-3-flash-preview, gemini-3-pro-preview)在免费层不可用,即使是付费账户也可能访问受限。视频文件上传会返回404。
预防措施:
typescript
// ❌ 错误 - Gemini 3会返回404:
const response = await ai.models.generateContent({
model: 'gemini-3-pro-preview', // 404错误
contents: [{
parts: [
{ text: '描述这个视频' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
// ✅ 正确 - 使用Gemini 2.5进行视频理解:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // 可用
contents: [{
parts: [
{ text: '描述这个视频' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
影响范围: 使用视频上传的Gemini 3预览模型
状态: 已知限制,使用Gemini 2.5模型进行视频处理
Issue #8: Batch API Returns 429 Despite Being Under Quota
问题#8: 批量API在未超出配额时返回429
Error: 429 RESOURCE_EXHAUSTED when using Batch API, even when under documented quota
Source: GitHub Issue #1264
Why It Happens: The Batch API may have dynamic rate limiting based on server load or undocumented limits beyond static quotas.
Prevention:
typescript
// Implement exponential backoff for Batch API:
async function batchWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.batches.create(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000;
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
Affected: Batch API users on paid tier
Status: Under investigation, use retry logic
错误: 使用批量API时返回429 RESOURCE_EXHAUSTED,即使未超出文档记录的配额
来源: GitHub Issue #1264
原因: 批量API可能基于服务器负载或文档未记录的限制实施动态速率限制。
预防措施:
typescript
// 为批量API实现指数退避:
async function batchWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.batches.create(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000;
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
影响范围: 付费层的批量API用户
状态: 正在调查中,使用重试逻辑
Issue #9: Context Caching Only Works with Gemini 1.5 Models
问题#9: 上下文缓存仅适用于Gemini 1.5模型
Error: 404 NOT FOUND when creating caches with Gemini 2.0, 2.5, or 3.0 models
Source: GitHub Issue #339
Why It Happens: Context caching only supports Gemini 1.5 Pro and Gemini 1.5 Flash models. Documentation examples incorrectly show Gemini 2.0+ models.
Prevention:
typescript
// ❌ WRONG - 404 error:
const cache = await ai.caches.create({
model: 'gemini-2.5-flash', // Not supported
config: { /* ... */ }
});
// ✅ CORRECT - Use Gemini 1.5 with explicit version:
const cache = await ai.caches.create({
model: 'gemini-1.5-flash-001', // Explicit version required
config: { /* ... */ }
});
Affected: All Gemini 2.x and 3.x users trying to use context caching
Status: Known limitation, only Gemini 1.5 models support caching
错误: 使用Gemini 2.0、2.5或3.0模型创建缓存时返回404 NOT FOUND
来源: GitHub Issue #339
原因: 上下文缓存仅支持Gemini 1.5 Pro和Gemini 1.5 Flash模型。文档示例错误地展示了Gemini 2.0+模型。
预防措施:
typescript
// ❌ 错误 - 返回404:
const cache = await ai.caches.create({
model: 'gemini-2.5-flash', // 不支持
config: { /* ... */ }
});
// ✅ 正确 - 使用Gemini 1.5并指定明确版本:
const cache = await ai.caches.create({
model: 'gemini-1.5-flash-001', // 需要明确版本
config: { /* ... */ }
});
影响范围: 尝试使用上下文缓存的所有Gemini 2.x和3.x用户
状态: 已知限制,仅Gemini 1.5模型支持缓存
Issue #10: Structured Output Occasionally Returns Backticks Causing JSON.parse Error
问题#10: 结构化输出偶尔返回反引号,导致JSON.parse错误
Error: SyntaxError: Unexpected token '`' when parsing JSON responses
Source: GitHub Issue #976
Why It Happens: When using responseMimeType: "application/json", the response occasionally includes markdown code fence backticks wrapping the JSON (```json\n{...}\n```), breaking JSON.parse().
Prevention:
typescript
// Strip markdown code fences before parsing:
let jsonText = response.text.trim();
if (jsonText.startsWith('```json')) {
jsonText = jsonText.replace(/^```json\n/, '').replace(/\n```$/, '');
} else if (jsonText.startsWith('```')) {
jsonText = jsonText.replace(/^```\n/, '').replace(/\n```$/, '');
}
const data = JSON.parse(jsonText);
Affected: All models when using structured output with responseMimeType: "application/json"
Status: Known intermittent issue, workaround required
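The inline workaround above assumes an exact trailing newline before the closing fence. A slightly more defensive variant, handling surrounding whitespace and a fence without a language tag, could look like this (stripCodeFence is an illustrative helper name, not an SDK API):

```typescript
// Strip an optional markdown code fence (with or without a "json" tag)
// from a model response before JSON.parse.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  // Lazily capture everything between the opening and closing fences.
  const match = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  return match ? match[1] : trimmed;
}

// Usage: const data = JSON.parse(stripCodeFence(response.text));
console.log(stripCodeFence('```json\n{"a":1}\n```')); // {"a":1}
```

Returning the trimmed input when no fence matches means the helper is safe to apply unconditionally, covering both the clean and the fenced cases.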
responseMimeType: "application/json"错误: 解析JSON响应时出现'responseMimeType: "application/json"JSON.parse()`失败。
SyntaxError: Unexpected token ' **来源**: [GitHub Issue #976](https://github.com/googleapis/js-genai/issues/976) **原因**: 当使用时,响应偶尔会包含包裹JSON的Markdown代码块反引号(`` ```json\n{...}\n``` ``),导致预防措施:
typescript
// 解析前去除Markdown代码块:
let jsonText = response.text.trim();
if (jsonText.startsWith('```json')) {
jsonText = jsonText.replace(/^```json\n/, '').replace(/\n```$/, '');
} else if (jsonText.startsWith('```')) {
jsonText = jsonText.replace(/^```\n/, '').replace(/\n```$/, '');
}
const data = JSON.parse(jsonText);
影响范围: 使用responseMimeType: "application/json"结构化输出的所有模型
状态: 已知间歇性问题,需要使用解决方法
Issue #11: Gemini 3 Temperature Below 1.0 Causes Looping/Degraded Reasoning
问题#11: Gemini 3的temperature低于1.0导致循环/推理质量下降
Error: Infinite loops or degraded reasoning quality on complex tasks
Source: Official Troubleshooting Docs
Why It Happens: Gemini 3 models are optimized for temperature 1.0. Lowering temperature below 1.0 may cause looping behavior or degraded performance on complex mathematical/reasoning tasks.
Prevention:
typescript
// ❌ WRONG - May cause issues with Gemini 3:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: 'Solve this complex math problem: ...',
config: {
temperature: 0.3 // May cause looping/degradation
}
});
// ✅ CORRECT - Keep default temperature:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: 'Solve this complex math problem: ...',
config: {
temperature: 1.0 // Recommended for Gemini 3
}
});
// Or omit temperature config entirely (uses default 1.0)
Affected: Gemini 3 series models
Status: Official recommendation, keep temperature at 1.0
错误: 复杂任务出现无限循环或推理质量下降
来源: 官方故障排查文档
原因: Gemini 3模型针对temperature 1.0进行优化。将temperature设置为1.0以下可能导致循环行为或复杂数学/推理任务的性能下降。
预防措施:
typescript
// ❌ 错误 - Gemini 3可能出现问题:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
temperature: 0.3 // 可能导致循环/质量下降
}
});
// ✅ 正确 - 保持默认temperature:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
temperature: 1.0 // Gemini 3推荐值
}
});
// 或完全省略temperature配置(使用默认值1.0)
影响范围: Gemini 3系列模型
状态: 官方建议,保持temperature为1.0
Issue #12: Massive Rate Limit Reductions in December 2025 (Free Tier)
问题#12: 2025年12月免费层速率限制大幅降低
Error: Sudden 429 RESOURCE_EXHAUSTED errors after December 6, 2025
Source: LaoZhang AI Blog | HowToGeek
Why It Happens: Google reduced free tier rate limits by 80-90% without wide announcement, catching developers off guard.
Changes:
- Gemini 2.5 Pro: 80% reduction in daily requests (100 RPD, was ~250)
- Gemini 2.5 Flash: ~20 requests per day (was ~250) - 90% reduction
- Free tier now impractical for production
Prevention:
typescript
// For production, upgrade to paid tier:
// https://ai.google.dev/pricing
// For free tier, implement aggressive rate limiting:
const rateLimiter = {
requests: 0,
resetTime: Date.now() + 24 * 60 * 60 * 1000,
async checkLimit() {
if (Date.now() > this.resetTime) {
this.requests = 0;
this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
}
if (this.requests >= 20) {
throw new Error('Daily limit reached');
}
this.requests++;
}
};
await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
Affected: Free tier users (December 6, 2025 onwards)
Status: Permanent change, upgrade to paid tier for production
错误: 2025年12月6日后突然出现429 RESOURCE_EXHAUSTED错误
来源: LaoZhang AI Blog | HowToGeek
原因: Google在未广泛通知的情况下将免费层速率限制降低了80-90%,让开发者措手不及。
变化:
- Gemini 2.5 Pro: 每日请求数减少80%(从约250降至100 RPD)
- Gemini 2.5 Flash: 每日请求数减少90%(从约250降至约20 RPD)
- 免费层现在不适用于生产环境
预防措施:
typescript
// 生产环境请升级到付费层:
// https://ai.google.dev/pricing
// 免费层请实施严格的速率限制:
const rateLimiter = {
requests: 0,
resetTime: Date.now() + 24 * 60 * 60 * 1000,
async checkLimit() {
if (Date.now() > this.resetTime) {
this.requests = 0;
this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
}
if (this.requests >= 20) {
throw new Error('已达到每日限制');
}
this.requests++;
}
};
await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
影响范围: 2025年12月6日之后的免费层用户
状态: 永久变更,生产环境请升级到付费层
Issue #13: Preview Models Have No SLAs and Can Change Without Warning
问题#13: 预览模型无SLA,可能随时变更
Error: Unexpected behavior changes, deprecation, or service interruptions
Source: Arsturn Blog | Official docs
Why It Happens: Preview and experimental models (e.g., gemini-2.5-flash-preview, gemini-3-pro-preview) have no service level agreements (SLAs) and are inherently unstable. Google can change or deprecate them with little notice.
Prevention:
typescript
// ❌ WRONG - Using preview models in production:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-preview', // No SLA!
contents: 'Production traffic'
});
// ✅ CORRECT - Use GA (generally available) models:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // Stable, with SLA
contents: 'Production traffic'
});
// Or use specific version numbers for stability:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-001', // Pinned version
contents: 'Production traffic'
});
Affected: Users of preview/experimental models in production
Status: Known limitation, use GA models for production
错误: 意外的行为变更、弃用或服务中断
来源: Arsturn Blog | 官方文档
原因: 预览和实验模型(例如gemini-2.5-flash-preview, gemini-3-pro-preview)无服务级别协议(SLA),本质上不稳定。Google可能随时变更或弃用这些模型,且通知有限。
预防措施:
typescript
// ❌ 错误 - 生产环境使用预览模型:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-preview', // 无SLA!
contents: '生产流量'
});
// ✅ 正确 - 使用正式可用(GA)模型:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // 稳定,有SLA
contents: '生产流量'
});
// 或使用特定版本号以保证稳定性:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-001', // 固定版本
contents: '生产流量'
});
影响范围: 在生产环境中使用预览/实验模型的用户
状态: 已知限制,生产环境请使用GA模型
Issue #14: API Key Leakage Auto-Blocking (Security Enhancement)
问题#14: API密钥泄露自动拦截(安全增强)
Error: "Invalid API key" after accidentally committing key to GitHub
Source: AI Free API Blog | Official troubleshooting
Why It Happens: Google proactively scans for publicly exposed API keys (e.g., in GitHub repos) and automatically blocks them from accessing the Gemini API as a security measure.
Prevention:
typescript
// Best practices:
// 1. Use .env files (never commit)
// 2. Use environment variables in production
// 3. Rotate keys if exposed
// 4. Use .gitignore:
// .gitignore
.env
.env.local
*.key
Affected: Users who accidentally commit API keys to public repos
Status: Security feature, rotate keys if exposed
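Beyond .gitignore, a fail-fast startup check keeps a missing or placeholder key from silently reaching request code. This is a sketch; the requireEnv helper and the 'YOUR_' placeholder convention are illustrative, not an official API:

```typescript
// Read a required secret from the environment, failing fast at startup
// instead of surfacing a confusing 401 later in a request path.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value || value.startsWith('YOUR_')) {
    throw new Error(
      `Missing or placeholder value for ${name}; ` +
      `set it in your environment, never in source control.`
    );
  }
  return value;
}

// Usage at process startup:
// const ai = new GoogleGenAI({ apiKey: requireEnv('GEMINI_API_KEY') });
```

Crashing at boot with a named variable is far easier to diagnose than an "Invalid API key" error appearing mid-traffic after a key was auto-blocked.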
错误: 意外将密钥提交到GitHub后提示"Invalid API key"
来源: AI Free API Blog | 官方故障排查
原因: Google主动扫描公开暴露的API密钥(例如GitHub仓库中的密钥),并自动阻止这些密钥访问Gemini API,作为安全措施。
预防措施:
typescript
// 最佳实践:
// 1. 使用.env文件(永远不要提交到仓库)
// 2. 生产环境使用环境变量
// 3. 如果密钥泄露,立即轮换
// 4. 使用.gitignore:
// .gitignore
.env
.env.local
*.key
影响范围: 意外将API密钥提交到公共仓库的用户
状态: 安全功能,密钥泄露后请立即轮换
Error Handling
错误处理
Common Errors
常见错误
1. Invalid API Key (401)
1. 无效API密钥(401)
typescript
{
error: {
code: 401,
message: 'API key not valid. Please pass a valid API key.',
status: 'UNAUTHENTICATED'
}
}
Solution: Verify the GEMINI_API_KEY environment variable is set correctly.
typescript
{
error: {
code: 401,
message: 'API key not valid. Please pass a valid API key.',
status: 'UNAUTHENTICATED'
}
}
解决方案: 验证GEMINI_API_KEY环境变量是否正确设置。
2. Rate Limit Exceeded (429)
2. 超出速率限制(429)
typescript
{
error: {
code: 429,
message: 'Resource has been exhausted (e.g. check quota).',
status: 'RESOURCE_EXHAUSTED'
}
}
Solution: Implement exponential backoff retry strategy.
typescript
{
error: {
code: 429,
message: 'Resource has been exhausted (e.g. check quota).',
status: 'RESOURCE_EXHAUSTED'
}
}
解决方案: 实现指数退避重试策略。
3. Model Not Found (404)
3. 模型未找到(404)
typescript
{
error: {
code: 404,
message: 'models/gemini-3.0-flash is not found',
status: 'NOT_FOUND'
}
}
Solution: Use correct model names: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
typescript
{
error: {
code: 404,
message: 'models/gemini-3.0-flash is not found',
status: 'NOT_FOUND'
}
}
解决方案: 使用正确的模型名称: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
4. Context Length Exceeded (400)
4. 超出上下文长度(400)
typescript
{
error: {
code: 400,
message: 'Request payload size exceeds the limit',
status: 'INVALID_ARGUMENT'
}
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
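Oversized payloads can also be caught before the request with a rough client-side pre-check. The 4-characters-per-token figure below is a common rule-of-thumb, not an official constant; for exact numbers the SDK's countTokens method on ai.models can be used instead:

```typescript
// Rough guard against oversized prompts. Assumes ~4 characters per token,
// which is only a heuristic; use the SDK's countTokens for exact counts.
const MAX_INPUT_TOKENS = 1_048_576; // Gemini 2.5 input limit

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitsContextWindow(text: string, budget = MAX_INPUT_TOKENS): boolean {
  return estimateTokens(text) <= budget;
}

// Example: refuse locally instead of paying for a guaranteed 400.
if (!fitsContextWindow('some very long document...')) {
  throw new Error('Input too large; chunk or summarize before sending.');
}
```

Failing locally is cheaper than a round trip that is guaranteed to return INVALID_ARGUMENT, and the heuristic errs conservative enough for a first-pass gate.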
typescript
{
error: {
code: 400,
message: 'Request payload size exceeds the limit',
status: 'INVALID_ARGUMENT'
}
}
解决方案: 减小输入大小。Gemini 2.5模型最大支持1,048,576输入token。
Exponential Backoff Pattern
指数退避模式
typescript
async function generateWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.models.generateContent(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
typescript
async function generateWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.models.generateContent(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
Rate Limits
速率限制
⚠️ December 2025 Update - Major Free Tier Reductions
⚠️ 2025年12月更新 - 免费层大幅缩减
CRITICAL: Google reduced free tier limits by 80-90% on December 6-7, 2025 without wide announcement. Free tier is now primarily for prototyping only.
Sources: LaoZhang AI | HowToGeek
重要提示: Google在2025年12月6-7日未广泛通知的情况下,将免费层限制降低了80-90%。免费层现在主要用于原型开发。
来源: LaoZhang AI | HowToGeek
Free Tier (Gemini API) - Current Limits
免费层(Gemini API)- 当前限制
Rate limits vary by model:
Gemini 2.5 Pro:
- Requests per minute: 5 RPM
- Tokens per minute: 125,000 TPM
- Requests per day: 100 RPD (was ~250 before Dec 2025) - 80% reduction
Gemini 2.5 Flash:
- Requests per minute: 10 RPM
- Tokens per minute: 250,000 TPM
- Requests per day: ~20 RPD (was ~250 before Dec 2025) - 90% reduction
Gemini 2.5 Flash-Lite:
- Requests per minute: 15 RPM
- Tokens per minute: 250,000 TPM
- Requests per day: 1,000 RPD (unchanged)
速率限制因模型而异:
Gemini 2.5 Pro:
- 每分钟请求数: 5 RPM
- 每分钟token数: 125,000 TPM
- 每日请求数: 100 RPD(2025年12月前约为250)- 减少80%
Gemini 2.5 Flash:
- 每分钟请求数: 10 RPM
- 每分钟token数: 250,000 TPM
- 每日请求数: 约20 RPD(2025年12月前约为250)- 减少90%
Gemini 2.5 Flash-Lite:
- 每分钟请求数: 15 RPM
- 每分钟token数: 250,000 TPM
- 每日请求数: 1,000 RPD(无变化)
Paid Tier (Tier 1)
付费层(Tier 1)
Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
- Requests per minute: 150 RPM
- Tokens per minute: 2,000,000 TPM
- Requests per day: 10,000 RPD
Gemini 2.5 Flash:
- Requests per minute: 1,000 RPM
- Tokens per minute: 1,000,000 TPM
- Requests per day: 10,000 RPD
Gemini 2.5 Flash-Lite:
- Requests per minute: 4,000 RPM
- Tokens per minute: 4,000,000 TPM
- Requests per day: Not specified
需要将结算账户链接到您的Google Cloud项目。
Gemini 2.5 Pro:
- 每分钟请求数: 150 RPM
- 每分钟token数: 2,000,000 TPM
- 每日请求数: 10,000 RPD
Gemini 2.5 Flash:
- 每分钟请求数: 1,000 RPM
- 每分钟token数: 1,000,000 TPM
- 每日请求数: 10,000 RPD
Gemini 2.5 Flash-Lite:
- 每分钟请求数: 4,000 RPM
- 每分钟token数: 4,000,000 TPM
- 每日请求数: 未指定
Higher Tiers (Tier 2 & 3)
更高层级(Tier 2 & 3)
Tier 2 (requires $250+ spending and 30-day wait):
- Even higher limits available
Tier 3 (requires $1,000+ spending and 30-day wait):
- Maximum limits available
Tips:
- Implement rate limit handling with exponential backoff
- Use batch processing for high-volume tasks
- Monitor usage in Google AI Studio
- Choose the right model based on your rate limit needs
- Official rate limits: https://ai.google.dev/gemini-api/docs/rate-limits
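The first tip above can be sketched as follows. `withRetry` is a hypothetical wrapper around your own call to `ai.models.generateContent` (the `status: 429` error shape is an assumption about your HTTP layer, not an SDK guarantee); the backoff math is the standard base-times-2^n schedule with a cap plus jitter:

```typescript
// Delay before retry attempt `n` (0-based): baseMs * 2^n, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 32_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry wrapper for rate-limited calls. `fn` stands in for your own
// Gemini request function; adjust the error check to match how your
// client surfaces HTTP status codes.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Give up on non-429 errors or once attempts are exhausted.
      if (err?.status !== 429 || attempt + 1 >= maxAttempts) throw err;
      const jitterMs = Math.random() * 250; // desynchronize concurrent clients
      await new Promise((r) =>
        setTimeout(r, backoffDelayMs(attempt, baseMs) + jitterMs)
      );
    }
  }
}
```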
SDK Migration Guide
From @google/generative-ai to @google/genai
1. Update Package
```bash
# Remove deprecated SDK
npm uninstall @google/generative-ai

# Install current SDK
npm install @google/genai@1.35.0
```
2. Update Imports
Old (DEPRECATED):
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
```
New (CURRENT):
```typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
```
3. Update API Calls
Old:
```typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
```
New:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
```
4. Update Streaming
Old:
```typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
```
New:
```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
```
5. Update Chat
Old:
```typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
```
New:
```typescript
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available
```
Production Best Practices
1. Always Do
✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to API (save costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model based on your needs (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro/flash/flash-lite
2. Never Do
❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never use generic rate limits (each model has different limits - check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
3. Security
- API Key Storage: Use environment variables or secret managers
- Server-Side Only: Never expose API keys in browser JavaScript
- Input Validation: Sanitize all user inputs before API calls
- Rate Limiting: Implement your own rate limits to prevent abuse
- Error Messages: Don't expose API keys or sensitive data in error logs
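The input-validation point can be sketched as a small pre-flight check; `validatePrompt` and `MAX_PROMPT_CHARS` are illustrative names and limits (tune the budget per model), not SDK APIs:

```typescript
// Illustrative character budget; tune per model and use case.
const MAX_PROMPT_CHARS = 30_000;

// Reject empty or oversized prompts before they reach the API,
// so malformed input never costs tokens.
function validatePrompt(raw: string): string {
  const prompt = raw.trim();
  if (prompt.length === 0) throw new Error('Prompt must not be empty');
  if (prompt.length > MAX_PROMPT_CHARS) {
    throw new Error(`Prompt exceeds ${MAX_PROMPT_CHARS} characters`);
  }
  // Strip control characters (except tab/newline/CR) that occasionally
  // sneak in from copy-paste.
  return prompt.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, '');
}
```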
4. Cost Optimization
- Choose Right Model: Use Flash for most tasks, Pro only when needed
- Set Token Limits: Use maxOutputTokens to control costs
- Batch Requests: Process multiple items efficiently
- Cache Results: Store responses when appropriate
- Monitor Usage: Track token consumption in Google Cloud Console
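The "Cache Results" point can be sketched as a map keyed by model and prompt. `generate` below stands in for whatever wrapper you already have around `ai.models.generateContent`; nothing here is an SDK API, and a production cache would also need eviction and a size bound:

```typescript
// In-memory response cache: identical (model, prompt) pairs are served
// from memory instead of re-billing the API.
const cache = new Map<string, string>();

async function cachedGenerate(
  model: string,
  prompt: string,
  generate: (model: string, prompt: string) => Promise<string>
): Promise<string> {
  const key = `${model}\u0000${prompt}`; // NUL separator avoids collisions
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // no API call, no token cost
  const text = await generate(model, prompt);
  cache.set(key, text);
  return text;
}
```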
5. Performance
- Use Streaming: Better perceived latency for long responses
- Parallel Requests: Use Promise.all() for independent calls
- Edge Deployment: Deploy to Cloudflare Workers for low latency
- Connection Pooling: Reuse HTTP connections when possible
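One way to reconcile "Parallel Requests" with the rate limits above is a concurrency-capped map instead of a bare `Promise.all()`. `mapLimit` is a generic helper sketch, not an SDK function:

```typescript
// Run `fn` over `items` with at most `limit` calls in flight at once,
// preserving input order in the results.
async function mapLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; single-threaded JS makes next++ safe
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

With `limit` set below your model's RPM budget, independent Gemini calls overlap without bursting past the quota.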
Quick Reference
Installation
```bash
npm install @google/genai@1.35.0
```
Environment
```bash
export GEMINI_API_KEY="..."
```
Models (2025-2026)
- gemini-3-flash - (1,048,576 in / 65,536 out) - NEW: Best speed+quality balance
- gemini-2.5-pro - (1,048,576 in / 65,536 out) - Best for complex reasoning
- gemini-2.5-flash - (1,048,576 in / 65,536 out) - Proven price-performance balance
- gemini-2.5-flash-lite - (1,048,576 in / 65,536 out) - Fastest, most cost-effective
Basic Generation
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Your prompt here'
});
console.log(response.text);
```
Streaming
```typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}
```
Multimodal
```typescript
contents: [
  {
    parts: [
      { text: 'What is this?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]
```
Function Calling
```typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}
```
Last Updated: 2026-01-21
Production Validated: All features tested with @google/genai@1.35.0
Phase: 2 Complete ✅ (All Core + Advanced Features)
Known Issues: 14 documented errors prevented
Changes: Added Known Issues Prevention section with 14 community-researched findings from the post-training-cutoff period (May 2025-Jan 2026)