Google Gemini API - Complete Guide
Version: 3.0.0 (14 Known Issues Added)
Package: @google/genai@1.35.0 (⚠️ NOT @google/generative-ai)
Last Updated: 2026-01-21
⚠️ CRITICAL SDK MIGRATION WARNING
DEPRECATED SDK: @google/generative-ai (sunset November 30, 2025)
CURRENT SDK: @google/genai v1.27+

If you see code using @google/generative-ai, it's outdated!
This skill uses the correct current SDK and provides a complete migration guide.
Status
✅ Phase 1 Complete:
- ✅ Text Generation (basic + streaming)
- ✅ Multimodal Inputs (images, video, audio, PDFs)
- ✅ Function Calling (basic + parallel execution)
- ✅ System Instructions & Multi-turn Chat
- ✅ Thinking Mode Configuration
- ✅ Generation Parameters (temperature, top-p, top-k, stop sequences)
- ✅ Both Node.js SDK (@google/genai) and fetch approaches
✅ Phase 2 Complete:
- ✅ Context Caching (cost optimization with TTL-based caching)
- ✅ Code Execution (built-in Python interpreter and sandbox)
- ✅ Grounding with Google Search (real-time web information + citations)
📦 Separate Skills:
- Embeddings: See the google-gemini-embeddings skill for text-embedding-004
Table of Contents
Phase 1 - Core Features:
1. Quick Start
2. Current Models (2025)
3. SDK vs Fetch Approaches
4. Text Generation
5. Streaming
6. Multimodal Inputs
7. Function Calling
8. System Instructions
9. Multi-turn Chat
10. Thinking Mode
11. Generation Configuration
Phase 2 - Advanced Features:
12. Context Caching
13. Code Execution
14. Grounding with Google Search
Common Reference:
15. Known Issues Prevention
16. Error Handling
17. Rate Limits
18. SDK Migration Guide
19. Production Best Practices
Quick Start
Installation
CORRECT SDK:

```bash
npm install @google/genai@1.35.0
```

❌ WRONG (DEPRECATED):

```bash
npm install @google/generative-ai  # DO NOT USE!
```

Environment Setup
```bash
export GEMINI_API_KEY="..."
```

Or create a `.env` file:

```
GEMINI_API_KEY=...
```

First Text Generation (Node.js SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain quantum computing in simple terms'
});

console.log(response.text);
```

First Text Generation (Fetch - Cloudflare Workers)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Explain quantum computing in simple terms' }] }]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Current Models (2025)
Gemini 3 Series (December 2025)
gemini-3-flash
- Context: 1,048,576 input tokens / 65,536 output tokens
- Status: 🆕 Generally Available (December 2025)
- Description: Google's fastest and most efficient Gemini 3 model for production workloads
- Best for: High-throughput applications, low-latency responses, cost-sensitive production
- Features: Enhanced multimodal, function calling, streaming, thinking mode
- Benchmark Performance: Matches gemini-2.5-pro quality at gemini-2.5-flash speed/cost
- Recommended for: Production use cases requiring speed + quality balance
gemini-3-pro-preview
- Context: TBD (documentation pending)
- Status: Preview release (November 18, 2025)
- Description: Google's newest and most intelligent AI model with state-of-the-art reasoning
- Best for: Most complex reasoning tasks, advanced multimodal understanding, benchmark-critical applications
- Features: Enhanced multimodal (text, image, video, audio, PDF), function calling, streaming
- Benchmark Performance: Outperforms Gemini 2.5 Pro on every major AI benchmark
- ⚠️ Preview Models Warning: Preview models have NO SLAs and can change or be deprecated with little notice. Use GA (generally available) models for production. See Issue #13
Gemini 2.5 Series (General Availability - Stable)
gemini-2.5-pro
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: State-of-the-art thinking model for complex reasoning
- Best for: Code, math, STEM, complex problem-solving
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
gemini-2.5-flash
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: Best price-performance workhorse model
- Best for: Large-scale processing, low-latency, high-volume, agentic use cases
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
gemini-2.5-flash-lite
- Context: 1,048,576 input tokens / 65,536 output tokens
- Description: Cost-optimized, fastest 2.5 model
- Best for: High throughput, cost-sensitive applications
- Features: Thinking mode (default on), function calling, multimodal, streaming
- Knowledge cutoff: January 2025
Model Feature Matrix
| Feature | 3-Flash | 3-Pro (Preview) | 2.5-Pro | 2.5-Flash | 2.5-Flash-Lite |
|---|---|---|---|---|---|
| Thinking Mode | ✅ Default ON | TBD | ✅ Default ON | ✅ Default ON | ✅ Default ON |
| Function Calling | ✅ | ✅ | ✅ | ✅ | ✅ |
| Multimodal | ✅ Enhanced | ✅ Enhanced | ✅ | ✅ | ✅ |
| Streaming | ✅ | ✅ | ✅ | ✅ | ✅ |
| System Instructions | ✅ | ✅ | ✅ | ✅ | ✅ |
| Context Window | 1,048,576 in | TBD | 1,048,576 in | 1,048,576 in | 1,048,576 in |
| Output Tokens | 65,536 max | TBD | 65,536 max | 65,536 max | 65,536 max |
| Status | GA | Preview | Stable | Stable | Stable |
⚠️ Context Window Correction
ACCURATE (Gemini 2.5): Gemini 2.5 models support 1,048,576 input tokens (NOT 2M!)
OUTDATED: Only Gemini 1.5 Pro (previous generation) had 2M token context window
GEMINI 3: Context window specifications pending official documentation
Common mistake: Claiming Gemini 2.5 has 2M tokens. It doesn't. This skill prevents this error.
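One way to avoid the 2M-token mistake in code is to validate prompt size against the documented limit before sending. A minimal sketch, with an illustrative guard function; the actual token count would come from the SDK's `countTokens` call, which you should verify against the current `@google/genai` reference:

```typescript
// Documented Gemini 2.5 input limit (1,048,576 tokens, NOT 2M).
export const GEMINI_25_INPUT_LIMIT = 1_048_576;

// Pure guard: does a prompt of `totalTokens` fit the input window,
// leaving `reserve` tokens of headroom?
export function fitsContext(
  totalTokens: number,
  reserve = 0,
  limit = GEMINI_25_INPUT_LIMIT
): boolean {
  return totalTokens + reserve <= limit;
}

// Usage sketch: obtain totalTokens via the SDK, e.g.
//   const { totalTokens } = await ai.models.countTokens({
//     model: 'gemini-2.5-flash', contents: prompt });
// then: if (!fitsContext(totalTokens)) { /* truncate or chunk */ }
```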
SDK vs Fetch Approaches
Node.js SDK (@google/genai)
Pros:
- Type-safe with TypeScript
- Easier API (simpler syntax)
- Built-in chat helpers
- Automatic SSE parsing for streaming
- Better error handling
Cons:
- Requires Node.js or compatible runtime
- Larger bundle size
- May not work in all edge runtimes
Use when: Building Node.js apps, Next.js Server Actions/Components, or any environment with Node.js compatibility
Fetch-based (Direct REST API)
Pros:
- Works in any JavaScript environment (Cloudflare Workers, Deno, Bun, browsers)
- Minimal dependencies
- Smaller bundle size
- Full control over requests
Cons:
- More verbose syntax
- Manual SSE parsing for streaming
- No built-in chat helpers
- Manual error handling
Use when: Deploying to Cloudflare Workers, browser clients, or lightweight edge runtimes
Text Generation
Basic Text Generation (SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Write a haiku about artificial intelligence'
});

console.log(response.text);
```

Basic Text Generation (Fetch)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'Write a haiku about artificial intelligence' }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Response Structure
```typescript
{
  text: string,  // Convenience accessor for text content
  candidates: [
    {
      content: {
        parts: [
          { text: string }  // Generated text
        ],
        role: string  // "model"
      },
      finishReason: string,  // "STOP" | "MAX_TOKENS" | "SAFETY" | "OTHER"
      index: number
    }
  ],
  usageMetadata: {
    promptTokenCount: number,
    candidatesTokenCount: number,
    totalTokenCount: number
  }
}
```
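When consuming the raw REST response, guard against missing candidates and non-STOP finish reasons rather than indexing blindly. A minimal sketch; the interface and helper name are illustrative, not part of the API:

```typescript
// Shape of the fields we read from a generateContent REST response.
interface GenerateContentResponse {
  candidates?: {
    content?: { parts?: { text?: string }[] };
    finishReason?: string;
  }[];
}

// Returns the generated text, or throws with a useful message.
export function extractText(data: GenerateContentResponse): string {
  const candidate = data.candidates?.[0];
  if (!candidate) throw new Error('No candidates in response (possibly blocked)');
  if (candidate.finishReason && candidate.finishReason !== 'STOP') {
    console.warn(`Generation finished with reason: ${candidate.finishReason}`);
  }
  // Join all text parts; some responses split text across parts.
  return candidate.content?.parts?.map(p => p.text ?? '').join('') ?? '';
}
```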
Streaming
Streaming with SDK (Async Iteration)
```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: 'Write a 200-word story about time travel'
});

for await (const chunk of response) {
  process.stdout.write(chunk.text);
}
```

Streaming with Fetch (SSE Parsing)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [{ parts: [{ text: 'Write a 200-word story about time travel' }] }]
    }),
  }
);

const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = '';

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');
  buffer = lines.pop() || '';
  for (const line of lines) {
    if (line.trim() === '' || line.startsWith('data: [DONE]')) continue;
    if (!line.startsWith('data: ')) continue;
    try {
      const data = JSON.parse(line.slice(6));
      const text = data.candidates[0]?.content?.parts[0]?.text;
      if (text) {
        process.stdout.write(text);
      }
    } catch (e) {
      // Skip invalid JSON
    }
  }
}
```

Key Points:
- Use the streamGenerateContent endpoint (not generateContent)
- Request SSE framing with ?alt=sse (without it, the REST API streams a JSON array instead of `data:` lines)
- Parse Server-Sent Events (SSE) format: data: {json}\n\n
- Handle incomplete chunks in the buffer
- Skip empty lines and [DONE] markers
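The per-line parsing logic above can be factored into a small pure function, which makes the SSE handling easy to unit-test. A sketch; the function name is illustrative:

```typescript
// Extracts the text payload from one SSE line, or null if the line
// should be skipped (empty, [DONE], not a data line, or partial JSON).
export function textFromSseLine(line: string): string | null {
  if (line.trim() === '' || line.startsWith('data: [DONE]')) return null;
  if (!line.startsWith('data: ')) return null;
  try {
    const data = JSON.parse(line.slice(6));
    return data.candidates?.[0]?.content?.parts?.[0]?.text ?? null;
  } catch {
    return null; // Invalid or incomplete JSON: wait for more data
  }
}
```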
Multimodal Inputs
Gemini 2.5 models support text + images + video + audio + PDFs in the same request.
Images (Vision)
SDK Approach
```typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// From file
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'What is in this image?' },
        {
          inlineData: {
            data: base64Image,
            mimeType: 'image/jpeg'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Fetch Approach
```typescript
const imageData = fs.readFileSync('/path/to/image.jpg');
const base64Image = imageData.toString('base64');

const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: 'What is in this image?' },
            {
              inlineData: {
                data: base64Image,
                mimeType: 'image/jpeg'
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
```

Supported Image Formats:
- JPEG (.jpg, .jpeg)
- PNG (.png)
- WebP (.webp)
- HEIC (.heic)
- HEIF (.heif)

Max Image Size: 20MB per image
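A small helper can build the `inlineData` part and enforce the format and size limits above before you pay for a failed request. A sketch; the helper name and the validation approach are illustrative:

```typescript
// Supported image MIME types and the 20MB inline limit listed above.
const IMAGE_MIME_TYPES = new Set([
  'image/jpeg', 'image/png', 'image/webp', 'image/heic', 'image/heif',
]);
const MAX_IMAGE_BYTES = 20 * 1024 * 1024;

interface InlineDataPart {
  inlineData: { data: string; mimeType: string };
}

// Builds an inlineData part from a raw buffer, validating type and size.
export function imagePart(buf: Buffer, mimeType: string): InlineDataPart {
  if (!IMAGE_MIME_TYPES.has(mimeType)) {
    throw new Error(`Unsupported image MIME type: ${mimeType}`);
  }
  if (buf.byteLength > MAX_IMAGE_BYTES) {
    throw new Error(`Image is ${buf.byteLength} bytes; inline limit is ${MAX_IMAGE_BYTES}`);
  }
  return { inlineData: { data: buf.toString('base64'), mimeType } };
}
```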
Video
```typescript
// Video must be < 2 minutes for inline data
const videoData = fs.readFileSync('/path/to/video.mp4');
const base64Video = videoData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Describe what happens in this video' },
        {
          inlineData: {
            data: base64Video,
            mimeType: 'video/mp4'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Supported Video Formats:
- MP4 (.mp4)
- MPEG (.mpeg)
- MOV (.mov)
- AVI (.avi)
- FLV (.flv)
- MPG (.mpg)
- WebM (.webm)
- WMV (.wmv)

Max Video Length (inline): 2 minutes
Max Video Size: 2GB (use File API for larger files - Phase 2)
Audio
```typescript
const audioData = fs.readFileSync('/path/to/audio.mp3');
const base64Audio = audioData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Transcribe and summarize this audio' },
        {
          inlineData: {
            data: base64Audio,
            mimeType: 'audio/mp3'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Supported Audio Formats:
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- AAC (.aac)
- OGG (.ogg)
- OPUS (.opus)

Max Audio Size: 20MB
PDFs
```typescript
const pdfData = fs.readFileSync('/path/to/document.pdf');
const base64Pdf = pdfData.toString('base64');

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Summarize the key points in this PDF' },
        {
          inlineData: {
            data: base64Pdf,
            mimeType: 'application/pdf'
          }
        }
      ]
    }
  ]
});

console.log(response.text);
```

Max PDF Size: 30MB
PDF Limitations: Text-based PDFs work best; scanned images may have lower accuracy
Multiple Inputs
You can combine multiple modalities in one request:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: [
    {
      parts: [
        { text: 'Compare these two images and describe the differences:' },
        { inlineData: { data: base64Image1, mimeType: 'image/jpeg' } },
        { inlineData: { data: base64Image2, mimeType: 'image/jpeg' } }
      ]
    }
  ]
});
```

Function Calling
Gemini supports function calling (tool use) to connect models with external APIs and systems.
Basic Function Calling (SDK)
```typescript
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// Define function declarations
const getCurrentWeather = {
  name: 'get_current_weather',
  description: 'Get the current weather for a location',
  parametersJsonSchema: {
    type: 'object',
    properties: {
      location: {
        type: 'string',
        description: 'City name, e.g. San Francisco'
      },
      unit: {
        type: 'string',
        enum: ['celsius', 'fahrenheit']
      }
    },
    required: ['location']
  }
};

// Make request with tools
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather in Tokyo?',
  config: {
    tools: [
      { functionDeclarations: [getCurrentWeather] }
    ]
  }
});

// Check if model wants to call a function
const functionCall = response.candidates[0].content.parts[0].functionCall;

if (functionCall) {
  console.log('Function to call:', functionCall.name);
  console.log('Arguments:', functionCall.args);

  // Execute the function (your implementation)
  const weatherData = await fetchWeather(functionCall.args.location);

  // Send function result back to model
  const finalResponse = await ai.models.generateContent({
    model: 'gemini-2.5-flash',
    contents: [
      'What\'s the weather in Tokyo?',
      response.candidates[0].content, // Original assistant response with function call
      {
        parts: [
          {
            functionResponse: {
              name: functionCall.name,
              response: weatherData
            }
          }
        ]
      }
    ],
    config: {
      tools: [
        { functionDeclarations: [getCurrentWeather] }
      ]
    }
  });

  console.log(finalResponse.text);
}
```

Function Calling (Fetch)
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      contents: [
        { parts: [{ text: 'What\'s the weather in Tokyo?' }] }
      ],
      tools: [
        {
          functionDeclarations: [
            {
              name: 'get_current_weather',
              description: 'Get the current weather for a location',
              parameters: {
                type: 'object',
                properties: {
                  location: {
                    type: 'string',
                    description: 'City name'
                  }
                },
                required: ['location']
              }
            }
          ]
        }
      ]
    }),
  }
);

const data = await response.json();
const functionCall = data.candidates[0]?.content?.parts[0]?.functionCall;

if (functionCall) {
  // Execute function and send result back (same flow as SDK)
}
```

Parallel Function Calling
Gemini can call multiple independent functions simultaneously:
```typescript
const tools = [
  {
    functionDeclarations: [
      {
        name: 'get_weather',
        description: 'Get weather for a location',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            location: { type: 'string' }
          },
          required: ['location']
        }
      },
      {
        name: 'get_population',
        description: 'Get population of a city',
        parametersJsonSchema: {
          type: 'object',
          properties: {
            city: { type: 'string' }
          },
          required: ['city']
        }
      }
    ]
  }
];

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What is the weather and population of Tokyo?',
  config: { tools }
});

// Model may return MULTIPLE function calls in parallel
const functionCalls = response.candidates[0].content.parts.filter(
  part => part.functionCall
);

console.log(`Model wants to call ${functionCalls.length} functions in parallel`);
```
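Once you have the list of parallel calls, you can dispatch them concurrently and build the `functionResponse` parts to send back in the next turn. A sketch; the registry and its stub implementations are hypothetical:

```typescript
type FunctionCall = { name: string; args: Record<string, unknown> };

// Hypothetical registry mapping declared function names to implementations.
const registry: Record<string, (args: Record<string, unknown>) => Promise<unknown>> = {
  get_weather: async (args) => ({ tempC: 21, location: args.location }),
  get_population: async (args) => ({ population: 37_000_000, city: args.city }),
};

// Executes all calls concurrently and returns functionResponse parts.
export async function runCalls(calls: FunctionCall[]) {
  return Promise.all(
    calls.map(async (call) => {
      const fn = registry[call.name];
      if (!fn) throw new Error(`No implementation for ${call.name}`);
      return {
        functionResponse: { name: call.name, response: await fn(call.args) },
      };
    })
  );
}
```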
Function Calling Modes
```typescript
import { FunctionCallingConfigMode } from '@google/genai';

const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What\'s the weather?',
  config: {
    tools: [{ functionDeclarations: [getCurrentWeather] }],
    toolConfig: {
      functionCallingConfig: {
        mode: FunctionCallingConfigMode.ANY, // Force function call
        // mode: FunctionCallingConfigMode.AUTO, // Model decides (default)
        // mode: FunctionCallingConfigMode.NONE, // Never call functions
        allowedFunctionNames: ['get_current_weather'] // Optional: restrict to specific functions
      }
    }
  }
});
```

Modes:
- AUTO (default): Model decides whether to call functions
- ANY: Force model to call at least one function
- NONE: Disable function calling for this request
System Instructions
System instructions guide the model's behavior and set context. They are separate from the conversation messages.
SDK Approach
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Explain what a database is',
  config: {
    systemInstruction: 'You are a helpful AI assistant that always responds in the style of a pirate. Use nautical terminology and end sentences with "arrr".'
  }
});

console.log(response.text);
// Output: "Ahoy there! A database be like a treasure chest..."
```

Note: In @google/genai, systemInstruction is passed inside config, not at the top level of the request object.

Fetch Approach
```typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      systemInstruction: {
        parts: [
          { text: 'You are a helpful AI assistant that always responds in the style of a pirate.' }
        ]
      },
      contents: [
        { parts: [{ text: 'Explain what a database is' }] }
      ]
    }),
  }
);
```

Key Points:
- System instructions are NOT part of the contents array
- They are set once at the top level of the request
- They persist for the entire conversation (when using multi-turn chat)
- They don't count as user or model messages
Multi-turn Chat
For conversations with history, use the SDK's chat helpers or manually manage conversation state.
SDK Chat Helpers (Recommended)
typescript
const chat = await ai.models.createChat({
model: 'gemini-2.5-flash',
systemInstruction: 'You are a helpful coding assistant.',
history: [] // Start empty or with previous messages
});
// Send first message
const response1 = await chat.sendMessage('What is TypeScript?');
console.log('Assistant:', response1.text);
// Send follow-up (context is automatically maintained)
const response2 = await chat.sendMessage('How do I install it?');
console.log('Assistant:', response2.text);
// Get full chat history
const history = chat.getHistory();
console.log('Full conversation:', history);typescript
const chat = ai.chats.create({
  model: 'gemini-2.5-flash',
  config: {
    systemInstruction: '你是一个乐于助人的编程助手。'
  },
  history: [] // 从空对话开始,或传入历史消息
});
// 发送第一条消息
const response1 = await chat.sendMessage({ message: '什么是TypeScript?' });
console.log('助手:', response1.text);
// 发送跟进消息(上下文会自动维护)
const response2 = await chat.sendMessage({ message: '如何安装它?' });
console.log('助手:', response2.text);
// 获取完整对话历史
const history = chat.getHistory();
console.log('完整对话:', history);
Manual Chat Management (Fetch)
手动管理对话(Fetch)
typescript
const conversationHistory = [];
// First turn
const response1 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
role: 'user',
parts: [{ text: 'What is TypeScript?' }]
}
]
}),
}
);
const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;
// Add to history
conversationHistory.push(
{ role: 'user', parts: [{ text: 'What is TypeScript?' }] },
{ role: 'model', parts: [{ text: assistantReply1 }] }
);
// Second turn (include full history)
const response2 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
...conversationHistory,
{ role: 'user', parts: [{ text: 'How do I install it?' }] }
]
}),
}
);
Message Roles:
- user: User messages
- model: Assistant responses
⚠️ Important: Chat helpers are SDK-only. With fetch, you must manually manage conversation history.
typescript
const conversationHistory = [];
// 第一轮对话
const response1 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{
role: 'user',
parts: [{ text: '什么是TypeScript?' }]
}
]
}),
}
);
const data1 = await response1.json();
const assistantReply1 = data1.candidates[0].content.parts[0].text;
// 添加到历史记录
conversationHistory.push(
{ role: 'user', parts: [{ text: '什么是TypeScript?' }] },
{ role: 'model', parts: [{ text: assistantReply1 }] }
);
// 第二轮对话(包含完整历史记录)
const response2 = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
...conversationHistory,
{ role: 'user', parts: [{ text: '如何安装它?' }] }
]
}),
}
);
消息角色:
- user: 用户消息
- model: 助手响应
⚠️ 重要提示: 对话助手是SDK专属功能。使用Fetch时,必须手动管理对话历史。
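The manual bookkeeping above can be factored into two small pure helpers. A hedged sketch: the names `appendTurn` and `buildContents` are ours; only the `{ role, parts }` shapes come from the API.

```typescript
// Hypothetical helpers for fetch-based multi-turn chat. The only state is the
// history array of { role, parts } entries, exactly as sent in the requests above.
type Turn = { role: 'user' | 'model'; parts: { text: string }[] };

// Record one completed exchange: one 'user' entry plus one 'model' entry.
function appendTurn(history: Turn[], userText: string, modelText: string): Turn[] {
  return [
    ...history,
    { role: 'user', parts: [{ text: userText }] },
    { role: 'model', parts: [{ text: modelText }] },
  ];
}

// The next request's `contents` is the full history plus the new user turn.
function buildContents(history: Turn[], nextUserText: string): Turn[] {
  return [...history, { role: 'user', parts: [{ text: nextUserText }] }];
}
```

Keeping these pure makes the history easy to persist (e.g. to a database) between requests.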
Thinking Mode
思考模式
Gemini 2.5 models have thinking mode enabled by default for enhanced quality. You can configure the thinking budget.
Gemini 2.5模型默认开启思考模式以提升质量。您可以配置思考预算。
Configure Thinking Budget (SDK)
配置思考预算(SDK)
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex math problem: ...',
config: {
thinkingConfig: {
thinkingBudget: 8192 // Max tokens for thinking (default: model-dependent)
}
}
});
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
thinkingConfig: {
thinkingBudget: 8192 // 最大思考token数(默认值取决于模型)
}
}
});
Configure Thinking Budget (Fetch)
配置思考预算(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Solve this complex math problem: ...' }] }],
generationConfig: {
thinkingConfig: {
thinkingBudget: 8192
}
}
}),
}
);
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: '解决这个复杂的数学问题: ...' }] }],
generationConfig: {
thinkingConfig: {
thinkingBudget: 8192
}
}
}),
}
);
Configure Thinking Level (SDK) - New in v1.30.0
配置思考级别(SDK)- v1.30.0新增
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Solve this complex problem: ...',
config: {
thinkingConfig: {
thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
}
}
});
Thinking Levels:
- LOW: Minimal internal reasoning (faster, lower quality)
- MEDIUM: Balanced reasoning (default)
- HIGH: Maximum reasoning depth (slower, higher quality)
Key Points:
- Thinking mode is always enabled on Gemini 2.5 models (cannot be disabled)
- Higher thinking budgets allow more internal reasoning (may increase latency)
- thinkingLevel (new in v1.30.0) provides simpler control than thinkingBudget
- Default budget varies by model (usually sufficient for most tasks)
- Only increase budget/level for very complex reasoning tasks
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '解决这个复杂的问题: ...',
config: {
thinkingConfig: {
thinkingLevel: 'MEDIUM' // 'LOW' | 'MEDIUM' | 'HIGH'
}
}
});
思考级别说明:
- LOW: 最小内部推理(速度快,质量较低)
- MEDIUM: 平衡的推理(默认)
- HIGH: 最大推理深度(速度慢,质量较高)
关键点:
- Gemini 2.5模型始终开启思考模式(无法关闭)
- 更高的思考预算允许更多内部推理(可能增加延迟)
- thinkingLevel(v1.30.0新增)比thinkingBudget提供更简单的控制
- 默认预算因模型而异(通常足以应对大多数任务)
- 仅在处理非常复杂的推理任务时才增加预算/级别
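Since `thinkingBudget` and `thinkingLevel` are alternatives, a small chooser can make the trade-off explicit. A sketch under the assumption that you set exactly one of the two fields; the helper name is ours, not the SDK's:

```typescript
// Hypothetical helper: produce a thinkingConfig using either the coarse
// thinkingLevel knob (v1.30.0+) or an explicit token budget, never both.
type ThinkingLevel = 'LOW' | 'MEDIUM' | 'HIGH';
type ThinkingConfig = { thinkingLevel: ThinkingLevel } | { thinkingBudget: number };

function thinkingConfigFor(opts: { level?: ThinkingLevel; budgetTokens?: number }): ThinkingConfig {
  if (opts.budgetTokens !== undefined) {
    return { thinkingBudget: opts.budgetTokens }; // fine-grained control
  }
  return { thinkingLevel: opts.level ?? 'MEDIUM' }; // MEDIUM is the default level
}
```

The result slots directly into `config: { thinkingConfig: ... }` in the calls above.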
Generation Configuration
生成配置
Customize model behavior with generation parameters.
使用生成参数自定义模型行为。
All Configuration Options (SDK)
所有配置选项(SDK)
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Write a creative story',
config: {
temperature: 0.9, // Randomness (0.0-2.0, default: 1.0)
topP: 0.95, // Nucleus sampling (0.0-1.0)
topK: 40, // Top-k sampling
maxOutputTokens: 2048, // Max tokens to generate
stopSequences: ['END'], // Stop generation if these appear
responseMimeType: 'text/plain', // Or 'application/json' for JSON mode
candidateCount: 1 // Number of response candidates (usually 1)
}
});
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '写一个创意故事',
config: {
temperature: 0.9, // 随机性(0.0-2.0,默认值:1.0)
topP: 0.95, // 核采样(0.0-1.0)
topK: 40, // Top-k采样
maxOutputTokens: 2048, // 最大生成token数
stopSequences: ['END'], // 如果出现这些序列则停止生成
responseMimeType: 'text/plain', // 或使用'application/json'开启JSON模式
candidateCount: 1 // 响应候选数(通常为1)
}
});
All Configuration Options (Fetch)
所有配置选项(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: 'Write a creative story' }] }],
generationConfig: {
temperature: 0.9,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
stopSequences: ['END'],
responseMimeType: 'text/plain',
candidateCount: 1
}
}),
}
);
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [{ parts: [{ text: '写一个创意故事' }] }],
generationConfig: {
temperature: 0.9,
topP: 0.95,
topK: 40,
maxOutputTokens: 2048,
stopSequences: ['END'],
responseMimeType: 'text/plain',
candidateCount: 1
}
}),
}
);
Parameter Guidelines
参数指南
| Parameter | Range | Default | Use Case |
|---|---|---|---|
| temperature | 0.0-2.0 | 1.0 | Lower = more focused, higher = more creative |
| topP | 0.0-1.0 | 0.95 | Nucleus sampling threshold |
| topK | 1-100+ | 40 | Limit to top K tokens |
| maxOutputTokens | 1-65536 | Model max | Control response length |
| stopSequences | Array | None | Stop generation at specific strings |
Tips:
- For factual tasks: Use low temperature (0.0-0.3)
- For creative tasks: Use high temperature (0.7-1.5)
- topP and topK both control randomness; use one or the other (not both)
- Always set maxOutputTokens to prevent excessive generation
| 参数 | 范围 | 默认值 | 适用场景 |
|---|---|---|---|
| temperature | 0.0-2.0 | 1.0 | 值越低越聚焦,值越高越有创意 |
| topP | 0.0-1.0 | 0.95 | 核采样阈值 |
| topK | 1-100+ | 40 | 限制仅考虑前K个token |
| maxOutputTokens | 1-65536 | 模型最大值 | 控制响应长度 |
| stopSequences | 数组 | 无 | 当出现指定序列时停止生成 |
提示:
- 对于事实性任务: 使用低temperature(0.0-0.3)
- 对于创意任务: 使用高temperature(0.7-1.5)
- topP和topK都用于控制随机性,使用其中一个即可(不要同时使用)
- 始终设置maxOutputTokens以避免过度生成
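The tips above can be pinned down as two presets. A sketch: the values follow the ranges in the table, and the preset names are ours, not part of the API.

```typescript
// Hypothetical generationConfig presets reflecting the guidelines above.
const generationPresets = {
  factual: { temperature: 0.2, topP: 0.95, maxOutputTokens: 1024 },  // low temp: focused
  creative: { temperature: 1.1, topP: 0.95, maxOutputTokens: 2048 }, // high temp: varied
};

function generationConfigFor(task: keyof typeof generationPresets) {
  // Both presets cap maxOutputTokens, and each sets only topP (not topK).
  return generationPresets[task];
}
```

Spread the result into `config` (SDK) or `generationConfig` (fetch) as shown above.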
Context Caching
上下文缓存
Context caching allows you to cache frequently used content (like system instructions, large documents, or video files) to reduce costs by up to 90% and reduce latency.
上下文缓存允许您缓存频繁使用的内容(如系统指令、大型文档或视频文件),可降低高达90%的成本并降低延迟。
How It Works
工作原理
- Create a cache with your repeated content
- Reference the cache in subsequent requests
- Save tokens - cached tokens cost significantly less
- TTL management - caches expire after specified time
- 创建缓存:将重复使用的内容存入缓存
- 引用缓存:在后续请求中引用该缓存
- 节省token:缓存的token成本远低于普通token
- TTL管理:缓存会在指定时间后过期
Benefits
优势
- Cost savings: Up to 90% reduction on cached tokens
- Reduced latency: Faster responses by reusing processed content
- Consistent context: Same large context across multiple requests
- 成本节约:缓存的输入token比普通token便宜约90%
- 延迟降低:通过复用已处理内容提升响应速度
- 上下文一致:在多个请求中保持相同的大型上下文
Cache Creation (SDK)
创建缓存(SDK)
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Create a cache for a large document
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'large-doc-cache', // Identifier for the cache
systemInstruction: 'You are an expert at analyzing legal documents.',
contents: documentText,
ttl: '3600s', // Cache for 1 hour
}
});
console.log('Cache created:', cache.name);
console.log('Expires at:', cache.expireTime);
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// 为大型文档创建缓存
const documentText = fs.readFileSync('./large-document.txt', 'utf-8');
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'large-doc-cache', // 缓存标识
systemInstruction: '你是一名法律文档分析专家。',
contents: documentText,
ttl: '3600s', // 缓存1小时
}
});
console.log('缓存已创建:', cache.name);
console.log('过期时间:', cache.expireTime);
Cache Creation (Fetch)
创建缓存(Fetch)
typescript
const response = await fetch(
'https://generativelanguage.googleapis.com/v1beta/cachedContents',
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
model: 'models/gemini-2.5-flash',
displayName: 'large-doc-cache',
systemInstruction: {
parts: [{ text: 'You are an expert at analyzing legal documents.' }]
},
contents: [
{ parts: [{ text: documentText }] }
],
ttl: '3600s'
}),
}
);
const cache = await response.json();
console.log('Cache created:', cache.name);
typescript
const response = await fetch(
'https://generativelanguage.googleapis.com/v1beta/cachedContents',
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
model: 'models/gemini-2.5-flash',
displayName: 'large-doc-cache',
systemInstruction: {
parts: [{ text: '你是一名法律文档分析专家。' }]
},
contents: [
{ parts: [{ text: documentText }] }
],
ttl: '3600s'
}),
}
);
const cache = await response.json();
console.log('缓存已创建:', cache.name);
Using a Cache (SDK)
使用缓存(SDK)
typescript
// Generate content using the cache
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Summarize the key points in the document',
  config: {
    cachedContent: cache.name // Reference the cache by its resource name
  }
});
console.log(response.text);
typescript
// 使用缓存生成内容
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '总结文档的关键点',
  config: {
    cachedContent: cache.name // 通过缓存资源名称引用缓存
  }
});
console.log(response.text);
Using a Cache (Fetch)
使用缓存(Fetch)
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // e.g. "cachedContents/..."
      contents: [
        { parts: [{ text: 'Summarize the key points in the document' }] }
      ]
    }),
  }
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
typescript
const response = await fetch(
  `https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
  {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'x-goog-api-key': env.GEMINI_API_KEY,
    },
    body: JSON.stringify({
      cachedContent: cache.name, // 例如:"cachedContents/..."
      contents: [
        { parts: [{ text: '总结文档的关键点' }] }
      ]
    }),
  }
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
Update Cache TTL (SDK)
更新缓存TTL(SDK)
typescript
import { UpdateCachedContentConfig } from '@google/genai';
await ai.caches.update({
name: cache.name,
config: {
ttl: '7200s' // Extend to 2 hours
}
});
typescript
import { UpdateCachedContentConfig } from '@google/genai';
await ai.caches.update({
name: cache.name,
config: {
ttl: '7200s' // 延长至2小时
}
});
Update Cache with Expiration Time (SDK)
使用过期时间更新缓存(SDK)
typescript
// Set a specific expiration time (RFC 3339 timestamp)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);
await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});
typescript
// 设置具体的过期时间(RFC 3339时间戳)
const in10Minutes = new Date(Date.now() + 10 * 60 * 1000);
await ai.caches.update({
  name: cache.name,
  config: {
    expireTime: in10Minutes.toISOString()
  }
});
List and Delete Caches (SDK)
列出和删除缓存(SDK)
typescript
// List all caches
const caches = await ai.caches.list();
for (const cache of caches) {
console.log(cache.name, cache.displayName);
}
// Delete a specific cache
await ai.caches.delete({ name: cache.name });
typescript
// 列出所有缓存
const caches = await ai.caches.list();
for (const cache of caches) {
console.log(cache.name, cache.displayName);
}
// 删除指定缓存
await ai.caches.delete({ name: cache.name });
Caching with Video Files
视频文件缓存
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Upload video file
let videoFile = await ai.files.upload({
  file: fs.createReadStream('./video.mp4')
});
// Wait for processing (the file object must be re-fetched, so use let, not const)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}
// Create cache with video
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'video-analysis-cache',
systemInstruction: 'You are an expert video analyzer.',
contents: [videoFile],
ttl: '300s' // 5 minutes
}
});
// Use cache for multiple queries
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'What happens in the first minute?',
  config: { cachedContent: cache.name }
});
const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Describe the main characters',
  config: { cachedContent: cache.name }
});
typescript
import { GoogleGenAI } from '@google/genai';
import fs from 'fs';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// 上传视频文件
let videoFile = await ai.files.upload({
  file: fs.createReadStream('./video.mp4')
});
// 等待处理完成(文件对象需要重新获取,因此使用let而非const)
while (videoFile.state === 'PROCESSING') {
  await new Promise(resolve => setTimeout(resolve, 2000));
  videoFile = await ai.files.get({ name: videoFile.name });
}
// 创建包含视频的缓存
const cache = await ai.caches.create({
model: 'gemini-2.5-flash',
config: {
displayName: 'video-analysis-cache',
systemInstruction: '你是一名专业的视频分析师。',
contents: [videoFile],
ttl: '300s' // 缓存5分钟
}
});
// 使用缓存进行多次查询
const response1 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '视频第一分钟发生了什么?',
  config: { cachedContent: cache.name }
});
const response2 = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: '描述主要角色',
  config: { cachedContent: cache.name }
});
Key Points
关键点
When to Use Caching:
- Large system instructions used repeatedly
- Long documents analyzed multiple times
- Video/audio files queried with different prompts
- Consistent context across conversation sessions
TTL Guidelines:
- Short sessions: 300s (5 min) to 3600s (1 hour)
- Long sessions: 3600s (1 hour) to 86400s (24 hours)
- Maximum: 7 days
Cost Savings:
- Cached input tokens: ~90% cheaper than regular tokens
- Output tokens: Same price (not cached)
Important:
- You must use explicit model version suffixes (e.g., gemini-2.5-flash-001, NOT just gemini-2.5-flash)
- Caches are automatically deleted after TTL expires
- Update TTL before expiration to extend cache lifetime
何时使用缓存:
- 重复使用的大型系统指令
- 需要多次分析的长文档
- 需要用不同查询提问的视频/音频文件
- 跨对话会话的一致上下文
TTL指南:
- 短会话:300s(5分钟)至3600s(1小时)
- 长会话:3600s(1小时)至86400s(24小时)
- 最大值:7天
成本节约:
- 缓存的输入token:比普通token便宜约90%
- 输出token:价格不变(不缓存)
重要提示:
- 必须使用明确的模型版本后缀(例如:gemini-2.5-flash-001,不能仅使用gemini-2.5-flash)
- 缓存会在TTL过期后自动删除
- 在过期前更新TTL以延长缓存生命周期
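Because caches are deleted once their TTL elapses, it helps to check remaining lifetime before reusing one. A minimal sketch; the helper name and margin are ours, while `expireTime` is the RFC 3339 timestamp returned at cache creation:

```typescript
// Hypothetical helper: true if the cache expires within `marginSeconds`,
// i.e. its TTL should be extended (or the cache recreated) before use.
function needsRefresh(expireTime: string, marginSeconds: number, now: Date = new Date()): boolean {
  const msLeft = new Date(expireTime).getTime() - now.getTime();
  return msLeft < marginSeconds * 1000;
}
```

Call it before each request against a long-lived cache; if it returns true, extend the TTL via ai.caches.update as shown above.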
Code Execution
代码执行
Gemini models can generate and execute Python code to solve problems requiring computation, data analysis, or visualization.
Gemini模型可以生成并执行Python代码,解决需要计算、数据分析或可视化的问题。
How It Works
工作原理
- Model generates executable Python code
- Code runs in secure sandbox
- Results are returned to the model
- Model incorporates results into response
- 模型生成可执行的Python代码
- 代码在安全沙箱中运行
- 结果返回给模型
- 模型将结果整合到响应中
Supported Operations
支持的操作
- Mathematical calculations
- Data analysis and statistics
- File processing (CSV, JSON, etc.)
- Chart and graph generation
- Algorithm implementation
- Data transformations
- 数学计算
- 数据分析和统计
- 文件处理(CSV、JSON等)
- 图表和图形生成
- 算法实现
- 数据转换
Available Python Packages
可用的Python包
Standard Library:
- math, statistics, random, datetime, json, csv, re
- collections, itertools, functools
Data Science:
- numpy, pandas, scipy
Visualization:
- matplotlib, seaborn
Note: Limited package availability compared to a full Python environment
标准库:
- math, statistics, random, datetime, json, csv, re
- collections, itertools, functools
数据科学:
- numpy, pandas, scipy
可视化:
- matplotlib, seaborn
注意: 与完整Python环境相比,可用包有限
Basic Code Execution (SDK)
基础代码执行(SDK)
typescript
import { GoogleGenAI, Tool, ToolCodeExecution } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the sum of the first 50 prime numbers? Generate and run code for the calculation.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Parse response parts
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Generated Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Execution Output:', part.codeExecutionResult.output);
}
}
typescript
import { GoogleGenAI, Tool, ToolCodeExecution } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '前50个质数的和是多少?生成并运行计算代码。',
config: {
tools: [{ codeExecution: {} }]
}
});
// 解析响应部分
for (const part of response.candidates[0].content.parts) {
if (part.text) {
console.log('文本:', part.text);
}
if (part.executableCode) {
console.log('生成的代码:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('执行输出:', part.codeExecutionResult.output);
}
}
Basic Code Execution (Fetch)
基础代码执行(Fetch)
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
tools: [{ code_execution: {} }],
contents: [
{
parts: [
{ text: 'What is the sum of the first 50 prime numbers? Generate and run code.' }
]
}
]
}),
}
);
const data = await response.json();
for (const part of data.candidates[0].content.parts) {
if (part.text) {
console.log('Text:', part.text);
}
if (part.executableCode) {
console.log('Code:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('Result:', part.codeExecutionResult.output);
}
}
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
tools: [{ code_execution: {} }],
contents: [
{
parts: [
{ text: '前50个质数的和是多少?生成并运行计算代码。' }
]
}
]
}),
}
);
const data = await response.json();
for (const part of data.candidates[0].content.parts) {
if (part.text) {
console.log('文本:', part.text);
}
if (part.executableCode) {
console.log('代码:', part.executableCode.code);
}
if (part.codeExecutionResult) {
console.log('结果:', part.codeExecutionResult.output);
}
}
Chat with Code Execution (SDK)
带代码执行的对话(SDK)
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ codeExecution: {} }]
}
});
let response = await chat.sendMessage({ message: 'I have a math question for you.' });
console.log(response.text);
response = await chat.sendMessage({
  message: 'Calculate the Fibonacci sequence up to the 20th number and sum them.'
});
// Model will generate and execute code, then provide answer
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Output:', part.codeExecutionResult.output);
}
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ codeExecution: {} }]
}
});
let response = await chat.sendMessage({ message: '我有一个数学问题想请教你。' });
console.log(response.text);
response = await chat.sendMessage({
  message: '计算斐波那契数列的前20项并求和。'
});
// 模型会生成并执行代码,然后给出答案
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('代码:', part.executableCode.code);
if (part.codeExecutionResult) console.log('输出:', part.codeExecutionResult.output);
}
Data Analysis Example
数据分析示例
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `
Analyze this sales data and calculate:
1. Total revenue
2. Average sale price
3. Best-selling month
Data (CSV format):
month,sales,revenue
Jan,150,45000
Feb,200,62000
Mar,175,53000
Apr,220,68000
`,
config: {
tools: [{ codeExecution: {} }]
}
});
// Model will generate pandas/numpy code to analyze data
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Analysis Code:', part.executableCode.code);
if (part.codeExecutionResult) console.log('Results:', part.codeExecutionResult.output);
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: `
分析这份销售数据并计算:
1. 总营收
2. 平均售价
3. 最畅销的月份
数据(CSV格式):
month,sales,revenue
Jan,150,45000
Feb,200,62000
Mar,175,53000
Apr,220,68000
`,
config: {
tools: [{ codeExecution: {} }]
}
});
// 模型会生成pandas/numpy代码来分析数据
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('分析代码:', part.executableCode.code);
if (part.codeExecutionResult) console.log('结果:', part.codeExecutionResult.output);
}
Visualization Example
可视化示例
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Create a bar chart showing the distribution of prime numbers under 100 by their last digit. Generate the chart and describe the pattern.',
config: {
tools: [{ codeExecution: {} }]
}
});
// Model generates matplotlib code, executes it, and describes results
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('Chart Code:', part.executableCode.code);
if (part.codeExecutionResult) {
// Note: Chart image data would be in output
console.log('Execution completed');
}
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '创建一个柱状图,展示100以内质数的末位数字分布。生成图表并描述模式。',
config: {
tools: [{ codeExecution: {} }]
}
});
// 模型生成matplotlib代码,执行后描述结果
for (const part of response.candidates[0].content.parts) {
if (part.text) console.log(part.text);
if (part.executableCode) console.log('图表代码:', part.executableCode.code);
if (part.codeExecutionResult) {
// 注意:图表图片数据会在输出中
console.log('执行完成');
}
}
Response Structure
响应结构
typescript
{
candidates: [
{
content: {
parts: [
{ text: "I'll calculate that for you." },
{
executableCode: {
language: "PYTHON",
code: "def is_prime(n):\n if n <= 1:\n return False\n ..."
}
},
{
codeExecutionResult: {
outcome: "OUTCOME_OK", // or "OUTCOME_FAILED"
output: "5117\n"
}
},
{ text: "The sum of the first 50 prime numbers is 5117." }
]
}
}
]
}
typescript
{
candidates: [
{
content: {
parts: [
{ text: "我来帮你计算。" },
{
executableCode: {
language: "PYTHON",
code: "def is_prime(n):\n if n <= 1:\n return False\n ..."
}
},
{
codeExecutionResult: {
outcome: "OUTCOME_OK", // 或"OUTCOME_FAILED"
output: "5117\n"
}
},
{ text: "前50个质数的和是5117。" }
]
}
}
]
}
Error Handling
错误处理
typescript
for (const part of response.candidates[0].content.parts) {
if (part.codeExecutionResult) {
if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
console.error('Code execution failed:', part.codeExecutionResult.output);
} else {
console.log('Success:', part.codeExecutionResult.output);
}
}
}
typescript
for (const part of response.candidates[0].content.parts) {
if (part.codeExecutionResult) {
if (part.codeExecutionResult.outcome === 'OUTCOME_FAILED') {
console.error('代码执行失败:', part.codeExecutionResult.output);
} else {
console.log('成功:', part.codeExecutionResult.output);
}
}
}
Key Points
关键点
When to Use Code Execution:
- Complex mathematical calculations
- Data analysis and statistics
- Algorithm implementations
- File parsing and processing
- Chart generation
- Computational problems
Limitations:
- Sandbox environment (limited file system access)
- Limited Python package availability
- Execution timeout limits
- No network access from code
- No persistent state between executions
Best Practices:
- Specify what calculation or analysis you need clearly
- Request code generation explicitly ("Generate and run code...")
- Check the outcome field for errors
- Use for deterministic computations, not for general programming
Important:
- Available on all Gemini 2.5 models (Pro, Flash, Flash-Lite)
- Code runs in isolated sandbox for security
- Supports Python with standard library and common data science packages
何时使用代码执行:
- 复杂数学计算
- 数据分析和统计
- 算法实现
- 文件解析和处理
- 图表生成
- 计算类问题
限制:
- 沙箱环境(文件系统访问受限)
- Python包可用性有限
- 执行超时限制
- 代码无法访问网络
- 执行之间无持久化状态
最佳实践:
- 明确说明您需要的计算或分析
- 明确要求生成代码("生成并运行代码...")
- 检查outcome字段是否有错误
- 用于确定性计算,而非通用编程
重要提示:
- 所有Gemini 2.5模型(Pro、Flash、Flash-Lite)均支持
- 代码在隔离沙箱中运行,保障安全
- 支持Python标准库和常见数据科学包
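Every example above walks the same three part kinds, so the parsing can be centralized. A sketch that collapses a parts array into one summary object; the shapes mirror the Response Structure section, and the function name is ours:

```typescript
// Hypothetical part shapes, as returned in candidates[0].content.parts.
interface ExecPart {
  text?: string;
  executableCode?: { language: string; code: string };
  codeExecutionResult?: { outcome: 'OUTCOME_OK' | 'OUTCOME_FAILED'; output: string };
}

// Collapse the parts of a code-execution response into one summary object.
function summarizeParts(parts: ExecPart[]) {
  const texts: string[] = [];
  let code = '';
  let output = '';
  let ok = true;
  for (const p of parts) {
    if (p.text) texts.push(p.text);
    if (p.executableCode) code = p.executableCode.code;
    if (p.codeExecutionResult) {
      output = p.codeExecutionResult.output;
      ok = p.codeExecutionResult.outcome === 'OUTCOME_OK';
    }
  }
  return { text: texts.join('\n'), code, output, ok };
}
```

The same walker works for both the SDK and fetch responses, since the part shapes are identical.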
Grounding with Google Search
基于Google搜索的事实校验
Grounding connects the model to real-time web information, reducing hallucinations and providing up-to-date, fact-checked responses with citations.
事实校验将模型与实时网络信息连接,减少幻觉并提供最新、经过事实核查的响应,同时附带引用。
How It Works
工作原理
- Model determines if it needs current information
- Automatically performs Google Search
- Processes search results
- Incorporates findings into response
- Provides citations and source URLs
- 模型判断是否需要当前信息
- 自动执行Google搜索
- 处理搜索结果
- 将发现整合到响应中
- 提供引用和来源URL
Benefits
优势
- Real-time information: Access to current events and data
- Reduced hallucinations: Answers grounded in web sources
- Verifiable: Citations allow fact-checking
- Up-to-date: Not limited to model's training cutoff
- 实时信息: 访问当前事件和数据
- 减少幻觉: 答案基于网络来源
- 可验证: 引用允许事实核查
- 内容更新: 不受模型训练截止日期限制
Grounding Options
事实校验选项
1. Google Search (googleSearch) - Recommended for Gemini 2.5
1. Google搜索(googleSearch)- Gemini 2.5推荐
typescript
const groundingTool = {
googleSearch: {}
};
Features:
- Simple configuration
- Automatic search when needed
- Available on all Gemini 2.5 models
typescript
const groundingTool = {
googleSearch: {}
};
特性:
- 配置简单
- 需要时自动搜索
- 所有Gemini 2.5模型均支持
2. FileSearch - New in v1.29.0 (Preview)
2. 文件搜索 - v1.29.0新增(预览版)
typescript
const fileSearchTool = {
fileSearch: {
fileSearchStoreId: 'store-id-here' // Created via FileSearchStore APIs
}
};
Features:
- Search through your own document collections
- Upload and index custom knowledge bases
- Alternative to web search for proprietary data
- Preview feature (requires FileSearchStore setup)
Note: See FileSearch documentation for store creation and management.
typescript
const fileSearchTool = {
fileSearch: {
fileSearchStoreId: 'store-id-here' // 通过FileSearchStore API创建
}
};
特性:
- 搜索您自己的文档集合
- 上传并索引自定义知识库
- 专有数据的网络搜索替代方案
- 预览功能(需要设置FileSearchStore)
注意: 请查看FileSearch文档了解存储创建和管理方法。
3. Google Search Retrieval (googleSearchRetrieval) - Legacy (Gemini 1.5)
3. Google搜索检索(googleSearchRetrieval)- 旧版(Gemini 1.5)
typescript
const retrievalTool = {
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: 'MODE_DYNAMIC',
dynamicThreshold: 0.7 // Only search if confidence < 70%
}
}
};
Features:
- Dynamic threshold control
- Used with Gemini 1.5 models
- More configuration options
typescript
const retrievalTool = {
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: 'MODE_DYNAMIC',
dynamicThreshold: 0.7 // 仅当置信度<70%时搜索
}
}
};
特性:
- 动态阈值控制
- 用于Gemini 1.5模型
- 更多配置选项
Basic Grounding (SDK) - Gemini 2.5
基础事实校验(SDK)- Gemini 2.5
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [{ googleSearch: {} }]
}
});
console.log(response.text);
// Check if grounding was used
if (response.candidates[0].groundingMetadata) {
console.log('Search was performed!');
console.log('Sources:', response.candidates[0].groundingMetadata);
}
typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2024年欧洲杯冠军是谁?',
config: {
tools: [{ googleSearch: {} }]
}
});
console.log(response.text);
// 检查是否使用了事实校验
if (response.candidates[0].groundingMetadata) {
console.log('已执行搜索!');
console.log('来源:', response.candidates[0].groundingMetadata);
}
Basic Grounding (Fetch) - Gemini 2.5
基础事实校验(Fetch)- Gemini 2.5
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: 'Who won the euro 2024?' }] }
],
tools: [
{ google_search: {} }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
if (data.candidates[0].groundingMetadata) {
console.log('Grounding metadata:', data.candidates[0].groundingMetadata);
}
typescript
const response = await fetch(
`https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent`,
{
method: 'POST',
headers: {
'Content-Type': 'application/json',
'x-goog-api-key': env.GEMINI_API_KEY,
},
body: JSON.stringify({
contents: [
{ parts: [{ text: '2024年欧洲杯冠军是谁?' }] }
],
tools: [
{ google_search: {} }
]
}),
}
);
const data = await response.json();
console.log(data.candidates[0].content.parts[0].text);
if (data.candidates[0].groundingMetadata) {
console.log('事实校验元数据:', data.candidates[0].groundingMetadata);
}
Dynamic Retrieval (SDK) - Gemini 1.5
动态检索(SDK)- Gemini 1.5
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-1.5-flash',
contents: 'Who won the euro 2024?',
config: {
tools: [
{
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
dynamicThreshold: 0.7 // Search only if confidence < 70%
}
}
}
]
}
});
console.log(response.text);
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (high confidence)');
}
typescript
import { GoogleGenAI, DynamicRetrievalConfigMode } from '@google/genai';
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const response = await ai.models.generateContent({
model: 'gemini-1.5-flash',
contents: '2024年欧洲杯冠军是谁?',
config: {
tools: [
{
googleSearchRetrieval: {
dynamicRetrievalConfig: {
mode: DynamicRetrievalConfigMode.MODE_DYNAMIC,
dynamicThreshold: 0.7 // 仅当置信度<70%时搜索
}
}
}
]
}
});
console.log(response.text);
if (!response.candidates[0].groundingMetadata) {
console.log('模型使用自身知识回答(高置信度)');
}
Grounding Metadata Structure
事实校验元数据结构
typescript
{
groundingMetadata: {
searchQueries: [
{ text: "euro 2024 winner" }
],
webPages: [
{
url: "https://example.com/euro-2024-results",
title: "UEFA Euro 2024 Final Results",
snippet: "Spain won UEFA Euro 2024..."
}
],
citations: [
{
startIndex: 42,
endIndex: 47,
uri: "https://example.com/euro-2024-results"
}
],
retrievalQueries: [
{
query: "who won euro 2024 final"
}
]
}
}
typescript
{
groundingMetadata: {
searchQueries: [
{ text: "euro 2024 winner" }
],
webPages: [
{
url: "https://example.com/euro-2024-results",
title: "UEFA Euro 2024 Final Results",
snippet: "Spain won UEFA Euro 2024..."
}
],
citations: [
{
startIndex: 42,
endIndex: 47,
uri: "https://example.com/euro-2024-results"
}
],
retrievalQueries: [
{
query: "who won euro 2024 final"
}
]
}
}
Chat with Grounding (SDK)
带事实校验的对话(SDK)
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ googleSearch: {} }]
}
});
let response = await chat.sendMessage('What are the latest developments in quantum computing?');
console.log(response.text);
// Check grounding sources
if (response.candidates[0].groundingMetadata) {
const sources = response.candidates[0].groundingMetadata.webPages || [];
console.log(`Sources used: ${sources.length}`);
sources.forEach(source => {
console.log(`- ${source.title}: ${source.url}`);
});
}
// Follow-up still has grounding enabled
response = await chat.sendMessage('Which company made the biggest breakthrough?');
console.log(response.text);
typescript
const chat = await ai.chats.create({
model: 'gemini-2.5-flash',
config: {
tools: [{ googleSearch: {} }]
}
});
let response = await chat.sendMessage('量子计算的最新进展是什么?');
console.log(response.text);
// 检查事实校验来源
if (response.candidates[0].groundingMetadata) {
const sources = response.candidates[0].groundingMetadata.webPages || [];
console.log(`使用的来源: ${sources.length}`);
sources.forEach(source => {
console.log(`- ${source.title}: ${source.url}`);
});
}
// 跟进消息仍会启用事实校验
response = await chat.sendMessage('哪家公司取得了最大突破?');
console.log(response.text);
Combining Grounding with Function Calling
事实校验与函数调用结合
typescript
const weatherFunction = {
name: 'get_current_weather',
description: 'Get current weather for a location',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string', description: 'City name' }
},
required: ['location']
}
};
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is the weather like in the city that won Euro 2024?',
config: {
tools: [
{ googleSearch: {} },
{ functionDeclarations: [weatherFunction] }
]
}
});
// Model will:
// 1. Use Google Search to find Euro 2024 winner
// 2. Call get_current_weather function with the city
// 3. Combine both results in the response
typescript
const weatherFunction = {
name: 'get_current_weather',
description: '获取指定地点的当前天气',
parametersJsonSchema: {
type: 'object',
properties: {
location: { type: 'string', description: '城市名称' }
},
required: ['location']
}
};
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2024年欧洲杯冠军城市的天气怎么样?',
config: {
tools: [
{ googleSearch: {} },
{ functionDeclarations: [weatherFunction] }
]
}
});
// 模型会:
// 1. 使用Google搜索找到2024年欧洲杯冠军
// 2. 调用get_current_weather函数获取该城市的天气
// 3. 将两个结果整合到响应中
Checking if Grounding was Used
检查是否使用了事实校验
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: 'What is 2+2?', // Model knows this without search
config: {
tools: [{ googleSearch: {} }]
}
});
if (!response.candidates[0].groundingMetadata) {
console.log('Model answered from its own knowledge (no search needed)');
} else {
console.log('Search was performed');
}
typescript
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash',
contents: '2+2等于多少?', // 模型无需搜索即可回答
config: {
tools: [{ googleSearch: {} }]
}
});
if (!response.candidates[0].groundingMetadata) {
console.log('模型使用自身知识回答(无需搜索)');
} else {
console.log('已执行搜索');
}
Key Points
关键点
When to Use Grounding:
- Current events and news
- Real-time data (stock prices, sports scores, weather)
- Fact-checking and verification
- Questions about recent developments
- Information beyond model's training cutoff
When NOT to Use:
- General knowledge questions
- Mathematical calculations
- Code generation
- Creative writing
- Tasks requiring internal reasoning only
Cost Considerations:
- Grounding adds latency (search takes time)
- Additional token costs for retrieved content
- Use dynamicThreshold to control when searches happen (Gemini 1.5)
Important Notes:
- Grounding requires Google Cloud project (not just API key)
- Search results quality depends on query phrasing
- Citations may not cover all facts in response
- Search is performed automatically based on confidence
Gemini 2.5 vs 1.5:
- Gemini 2.5: Use googleSearch (simple, recommended)
- Gemini 1.5: Use googleSearchRetrieval with dynamicThreshold
Best Practices:
- Always check groundingMetadata to see if search was used
- Display citations to users for transparency
- Use specific, well-phrased questions for better search results
- Combine with function calling for hybrid workflows
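The best practices above suggest displaying citations to users. A minimal sketch of such a display helper follows; the webPages, title, and url field names are assumptions taken from the metadata structure shown earlier, so verify them against the actual response shape in your SDK version:

```typescript
// Hypothetical types mirroring the groundingMetadata structure shown above.
interface WebPage { url: string; title: string; snippet?: string; }
interface GroundingMetadata { webPages?: WebPage[]; }

// Render a numbered source list for display alongside the model's answer.
function formatSources(metadata?: GroundingMetadata): string {
  const pages = metadata?.webPages ?? [];
  if (pages.length === 0) {
    return 'No web sources (answered from model knowledge).';
  }
  return pages
    .map((p, i) => `[${i + 1}] ${p.title} - ${p.url}`)
    .join('\n');
}

// Example: pass response.candidates[0].groundingMetadata here.
console.log(formatSources({
  webPages: [{ url: 'https://example.com', title: 'Example' }]
}));
```

Because the helper is a pure function over the metadata object, it also covers the ungrounded case (no webPages) without extra branching at the call site.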
何时使用事实校验:
- 当前事件和新闻
- 实时数据(股票价格、体育比分、天气)
- 事实核查和验证
- 关于最新进展的问题
- 超出模型训练截止日期的信息
何时不使用:
- 常识性问题
- 数学计算
- 代码生成
- 创意写作
- 仅需内部推理的任务
成本考虑:
- 事实校验会增加延迟(搜索需要时间)
- 检索内容会产生额外token成本
- 使用dynamicThreshold控制搜索时机(Gemini 1.5)
重要提示:
- 事实校验需要Google Cloud项目(不仅仅是API密钥)
- 搜索结果质量取决于查询措辞
- 引用可能无法覆盖响应中的所有事实
- 搜索会根据置信度自动执行
Gemini 2.5 vs 1.5:
- Gemini 2.5: 使用googleSearch(简单,推荐)
- Gemini 1.5: 使用googleSearchRetrieval并设置dynamicThreshold
最佳实践:
- 始终检查groundingMetadata以确认是否执行了搜索
- 向用户展示引用以保证透明度
- 使用具体、措辞清晰的问题以获得更好的搜索结果
- 与函数调用结合实现混合工作流
Known Issues Prevention
已知问题预防
This skill prevents 14 documented issues:
本指南可预防14个已记录的问题:
Issue #1: Multi-byte Character Corruption in Streaming
问题#1: 流式输出中的多字节字符损坏
Error: Garbled text or � symbols when streaming responses with non-English text
Source: GitHub Issue #764
Why It Happens: The TextDecoder converts chunks to strings without the {stream: true} option. Multi-byte UTF-8 characters (Chinese, Japanese, Korean, emoji) split across chunks create invalid strings.
Prevention:
typescript
// The SDK already fixes this, but if implementing custom streaming:
const decoder = new TextDecoder();
const { value } = await reader.read();
const text = decoder.decode(value, { stream: true }); // ← stream: true required
Affected: All non-English languages using multi-byte characters
Status: Fixed in SDK, but documented for custom implementations
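The failure mode is easy to reproduce without any network code. This self-contained sketch splits a multi-byte character across two chunks, the way a real stream boundary can:

```typescript
// '你好' is 6 bytes in UTF-8 (3 per character); split mid-character,
// as can happen at any chunk boundary in a streamed response.
const bytes = new TextEncoder().encode('你好');
const chunk1 = bytes.slice(0, 2); // incomplete first character
const chunk2 = bytes.slice(2);

// ✅ stream: true buffers the incomplete byte sequence until the next chunk
const streaming = new TextDecoder();
const good =
  streaming.decode(chunk1, { stream: true }) +
  streaming.decode(chunk2, { stream: true });

// ❌ without stream: true, each call flushes and emits � replacement chars
const naive =
  new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2);

console.log(good);  // 你好
console.log(naive); // garbled output containing � characters
```

The same two-line difference is all that separates a correct custom streaming loop from one that corrupts every non-ASCII response.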
错误: 流式输出非英文文本时出现乱码或�符号
来源: GitHub Issue #764
原因: TextDecoder转换块为字符串时未使用{stream: true}选项。多字节UTF-8字符(中文、日文、韩文、表情符号)被拆分到不同块中,导致无效字符串。
预防措施:
typescript
// SDK已修复此问题,但如果是自定义流式实现:
const decoder = new TextDecoder();
const { value } = await reader.read();
const text = decoder.decode(value, { stream: true }); // ← 必须设置stream: true
影响范围: 所有使用多字节字符的非英语语言
状态: SDK中已修复,为自定义实现提供文档说明
Issue #2: Safety Settings Method Parameter Not Supported
问题#2: 安全设置的method参数不被支持
Error: "method parameter is not supported in Gemini API"
Source: GitHub Issue #810
Why It Happens: The method parameter in safetySettings only works with the Vertex AI Gemini API, not the Gemini Developer API or Google AI Studio. The SDK allows passing it without validation.
Prevention:
typescript
// ❌ WRONG - Fails with Gemini Developer API:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
method: HarmBlockMethod.SEVERITY // Not supported!
}]
}
// ✅ CORRECT - Omit 'method' for Gemini Developer API:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
// No 'method' field
}]
}
Affected: Gemini Developer API and Google AI Studio users
Status: Known limitation, use Vertex AI if you need the method parameter
错误: "method parameter is not supported in Gemini API"
来源: GitHub Issue #810
原因: safetySettings中的method参数仅适用于Vertex AI Gemini API,不适用于Gemini开发者API或Google AI Studio。SDK允许传递该参数但未进行验证。
预防措施:
typescript
// ❌ 错误 - 使用Gemini开发者API会失败:
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
method: HarmBlockMethod.SEVERITY // 不被支持!
}]
}
// ✅ 正确 - 针对Gemini开发者API省略'method':
config: {
safetySettings: [{
category: HarmCategory.HARM_CATEGORY_HATE_SPEECH,
threshold: HarmBlockThreshold.BLOCK_LOW_AND_ABOVE
// 无'method'字段
}]
}
影响范围: Gemini开发者API和Google AI Studio用户
状态: 已知限制,如果需要method参数请使用Vertex AI
Issue #3: Safety Settings Have Model-Specific Thresholds
问题#3: 安全设置具有模型特定阈值
Error: Content passes through despite strict safety settings, or safetyRatings shows NEGLIGIBLE with empty output
Source: GitHub Issue #872
Why It Happens: Different models have different blocking thresholds. gemini-2.5-flash blocks more strictly than gemini-2.0-flash. Additionally, promptFeedback only appears when INPUT is blocked; if the model generates a refusal message, safetyRatings may show NEGLIGIBLE.
Prevention:
typescript
// Check BOTH promptFeedback AND empty response:
if (response.candidates[0].finishReason === 'SAFETY' ||
!response.text || response.text.trim() === '') {
console.log('Content blocked or refused');
}
// Be aware: Different models have different thresholds
// gemini-2.5-flash: Lower threshold (stricter blocking)
// gemini-2.0-flash: Higher threshold (more permissive)
Affected: All models when using safety settings
Status: Known behavior, model-specific thresholds are by design
错误: 尽管设置了严格的安全设置,内容仍通过;或safetyRatings显示NEGLIGIBLE但输出为空
来源: GitHub Issue #872
原因: 不同模型具有不同的拦截阈值。gemini-2.5-flash比gemini-2.0-flash拦截更严格。此外,promptFeedback只有当输入被拦截时才会出现;如果模型生成拒绝消息,safetyRatings可能显示NEGLIGIBLE。
预防措施:
typescript
// 同时检查promptFeedback和空响应:
if (response.candidates[0].finishReason === 'SAFETY' ||
!response.text || response.text.trim() === '') {
console.log('内容被拦截或拒绝');
}
// 注意: 不同模型具有不同阈值
// gemini-2.5-flash: 阈值更低(拦截更严格)
// gemini-2.0-flash: 阈值更高(更宽松)
影响范围: 使用安全设置的所有模型
状态: 已知行为,模型特定阈值为设计如此
Issue #4: FunctionCallingConfigMode.ANY Causes Infinite Loop
问题#4: FunctionCallingConfigMode.ANY导致无限循环
Error: Model loops forever calling tools, never returns text response
Source: GitHub Issue #908
Why It Happens: When FunctionCallingConfigMode.ANY is set with automatic function calling (CallableTool), the model is forced to call at least one tool on every turn and physically cannot stop, looping until the max invocations limit.
Prevention:
typescript
// ❌ WRONG - Loops forever:
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.ANY // Forces tool calls forever
}
}
}
// ✅ CORRECT - Use AUTO mode (model decides):
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.AUTO // Model can choose to answer directly
}
}
}
// Or use manual function calling (check for functionCall, execute, send back)
Affected: Automatic function calling with CallableTool
Status: Known limitation, use AUTO mode or manual function calling
错误: 模型无限循环调用工具,从不返回文本响应
来源: GitHub Issue #908
原因: 当使用自动函数调用(CallableTool)并设置FunctionCallingConfigMode.ANY时,模型被强制在每一轮至少调用一个工具,无法停止,直到达到最大调用次数限制。
预防措施:
typescript
// ❌ 错误 - 无限循环:
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.ANY // 强制永远调用工具
}
}
}
// ✅ 正确 - 使用AUTO模式(模型自主决定):
config: {
toolConfig: {
functionCallingConfig: {
mode: FunctionCallingConfigMode.AUTO // 模型可以选择直接回答
}
}
}
// 或使用手动函数调用(检查functionCall,执行后返回结果)
影响范围: 使用CallableTool的自动函数调用
状态: 已知限制,使用AUTO模式或手动函数调用
Issue #5: Structured Output Doesn't Preserve Escaped Backslashes (Gemini 3)
问题#5: 结构化输出无法保留转义反斜杠(Gemini 3)
Error: JSON.parse fails on structured output, or keys with backslashes are incorrect
Source: GitHub Issue #1226
Why It Happens: When using responseMimeType: "application/json" with schema keys containing escaped backslashes (e.g., \\a for key \a), the model output doesn't preserve JSON escaping. It emits a single backslash, causing invalid JSON.
Prevention:
typescript
// Avoid using backslashes in JSON schema keys
// Or manually post-process if required:
let jsonText = response.text;
// Add custom escaping logic if needed
Affected: Gemini 3 models with structured output using backslashes in keys
Status: Known issue, workaround required
错误: JSON.parse解析结构化输出时失败,或包含反斜杠的键不正确
来源: GitHub Issue #1226
原因: 当使用responseMimeType: "application/json"且模式键包含转义反斜杠(例如:\\a表示键\a)时,模型输出无法保留JSON转义,会输出单个反斜杠,导致无效JSON。
预防措施:
typescript
// 避免在JSON模式键中使用反斜杠
// 或根据需要手动后处理:
let jsonText = response.text;
// 添加自定义转义逻辑(如果需要)
影响范围: 使用包含反斜杠键的结构化输出的Gemini 3模型
状态: 已知问题,需要使用解决方法
Issue #6: Large PDFs from S3 Signed URLs Fail with "Document has no pages"
问题#6: 来自S3签名URL的大型PDF失败,提示"Document has no pages"
Error: ApiError: {"error":{"code":400,"message":"The document has no pages.","status":"INVALID_ARGUMENT"}}
Source: GitHub Issue #1259
Why It Happens: Larger PDFs (e.g., 20MB) from AWS S3 signed URLs fail when passed via fileData.fileUri. The API cannot fetch or process the PDF from signed URLs.
Prevention:
typescript
// ❌ WRONG - Fails with large PDFs from S3:
contents: [{
parts: [{
fileData: {
fileUri: 'https://bucket.s3.region.amazonaws.com/file.pdf?X-Amz-Algorithm=...'
}
}]
}]
// ✅ CORRECT - Fetch and encode to base64:
const pdfResponse = await fetch(signedUrl);
const pdfBuffer = await pdfResponse.arrayBuffer();
const base64Pdf = Buffer.from(pdfBuffer).toString('base64');
contents: [{
parts: [{
inlineData: {
data: base64Pdf,
mimeType: 'application/pdf'
}
}]
}]
Affected: PDF files from external signed URLs
Status: Known limitation, use base64 inline data instead
错误: ApiError: {"error":{"code":400,"message":"The document has no pages.","status":"INVALID_ARGUMENT"}}
来源: GitHub Issue #1259
原因: 来自AWS S3签名URL的大型PDF(例如20MB)通过fileData.fileUri传递时失败。API无法从签名URL获取或处理PDF。
预防措施:
typescript
// ❌ 错误 - 来自S3的大型PDF会失败:
contents: [{
parts: [{
fileData: {
fileUri: 'https://bucket.s3.region.amazonaws.com/file.pdf?X-Amz-Algorithm=...'
}
}]
}]
// ✅ 正确 - 获取并编码为base64:
const pdfResponse = await fetch(signedUrl);
const pdfBuffer = await pdfResponse.arrayBuffer();
const base64Pdf = Buffer.from(pdfBuffer).toString('base64');
contents: [{
parts: [{
inlineData: {
data: base64Pdf,
mimeType: 'application/pdf'
}
}]
}]
影响范围: 来自外部签名URL的PDF文件
状态: 已知限制,使用base64内联数据替代
Issue #7: 404 NOT_FOUND with Uploaded Video on Gemini 3 Models
问题#7: 在Gemini 3模型上使用上传的视频返回404 NOT_FOUND
Error: 404 NOT_FOUND when using uploaded video files with Gemini 3 models
Source: GitHub Issue #1220
Why It Happens: Some Gemini 3 models (gemini-3-flash-preview, gemini-3-pro-preview) are not available in the free tier or have limited access even with paid accounts. Video file uploads fail with 404.
Prevention:
typescript
// ❌ WRONG - 404 error with Gemini 3:
const response = await ai.models.generateContent({
model: 'gemini-3-pro-preview', // 404 error
contents: [{
parts: [
{ text: 'Describe this video' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
// ✅ CORRECT - Use Gemini 2.5 for video understanding:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // Works
contents: [{
parts: [
{ text: 'Describe this video' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
Affected: Gemini 3 preview models with video uploads
Status: Known limitation, use Gemini 2.5 models for video
错误: 在Gemini 3模型上使用上传的视频时返回404 NOT_FOUND
来源: GitHub Issue #1220
原因: 部分Gemini 3模型(gemini-3-flash-preview, gemini-3-pro-preview)在免费层不可用,即使是付费账户也可能访问受限。视频文件上传会返回404。
预防措施:
typescript
// ❌ 错误 - Gemini 3会返回404:
const response = await ai.models.generateContent({
model: 'gemini-3-pro-preview', // 404错误
contents: [{
parts: [
{ text: '描述这个视频' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
// ✅ 正确 - 使用Gemini 2.5进行视频理解:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // 可用
contents: [{
parts: [
{ text: '描述这个视频' },
{ fileData: { fileUri: videoFile.uri }}
]
}]
});
影响范围: 使用视频上传的Gemini 3预览模型
状态: 已知限制,使用Gemini 2.5模型进行视频处理
Issue #8: Batch API Returns 429 Despite Being Under Quota
问题#8: 批量API在未超出配额时返回429
Error: 429 RESOURCE_EXHAUSTED when using Batch API, even when under documented quota
Source: GitHub Issue #1264
Why It Happens: The Batch API may have dynamic rate limiting based on server load or undocumented limits beyond static quotas.
Prevention:
typescript
// Implement exponential backoff for Batch API:
async function batchWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.batches.create(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000;
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
Affected: Batch API users on paid tier
Status: Under investigation, use retry logic
错误: 使用批量API时返回429 RESOURCE_EXHAUSTED,即使未超出文档记录的配额
来源: GitHub Issue #1264
原因: 批量API可能基于服务器负载或文档未记录的限制实施动态速率限制。
预防措施:
typescript
// 为批量API实现指数退避:
async function batchWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.batches.create(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000;
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
影响范围: 付费层的批量API用户
状态: 正在调查中,使用重试逻辑
Issue #9: Context Caching Only Works with Gemini 1.5 Models
问题#9: 上下文缓存仅适用于Gemini 1.5模型
Error: 404 NOT FOUND when creating caches with Gemini 2.0, 2.5, or 3.0 models
Source: GitHub Issue #339
Why It Happens: Context caching only supports Gemini 1.5 Pro and Gemini 1.5 Flash models. Documentation examples incorrectly show Gemini 2.0+ models.
Prevention:
typescript
// ❌ WRONG - 404 error:
const cache = await ai.caches.create({
model: 'gemini-2.5-flash', // Not supported
config: { /* ... */ }
});
// ✅ CORRECT - Use Gemini 1.5 with explicit version:
const cache = await ai.caches.create({
model: 'gemini-1.5-flash-001', // Explicit version required
config: { /* ... */ }
});
Affected: All Gemini 2.x and 3.x users trying to use context caching
Status: Known limitation, only Gemini 1.5 models support caching
错误: 使用Gemini 2.0、2.5或3.0模型创建缓存时返回404 NOT FOUND
来源: GitHub Issue #339
原因: 上下文缓存仅支持Gemini 1.5 Pro和Gemini 1.5 Flash模型。文档示例错误地展示了Gemini 2.0+模型。
预防措施:
typescript
// ❌ 错误 - 返回404:
const cache = await ai.caches.create({
model: 'gemini-2.5-flash', // 不支持
config: { /* ... */ }
});
// ✅ 正确 - 使用Gemini 1.5并指定明确版本:
const cache = await ai.caches.create({
model: 'gemini-1.5-flash-001', // 需要明确版本
config: { /* ... */ }
});
影响范围: 尝试使用上下文缓存的所有Gemini 2.x和3.x用户
状态: 已知限制,仅Gemini 1.5模型支持缓存
Issue #10: Structured Output Occasionally Returns Backticks Causing JSON.parse Error
问题#10: 结构化输出偶尔返回反引号,导致JSON.parse错误
Error: SyntaxError: Unexpected token '`' when parsing JSON responses
Source: GitHub Issue #976
Why It Happens: When using responseMimeType: "application/json", the response occasionally includes markdown code fence backticks wrapping the JSON (```json\n{...}\n```), breaking JSON.parse().
Prevention:
typescript
// Strip markdown code fences before parsing:
let jsonText = response.text.trim();
if (jsonText.startsWith('```json')) {
jsonText = jsonText.replace(/^```json\n/, '').replace(/\n```$/, '');
} else if (jsonText.startsWith('```')) {
jsonText = jsonText.replace(/^```\n/, '').replace(/\n```$/, '');
}
const data = JSON.parse(jsonText);
Affected: All models when using structured output with responseMimeType: "application/json"
Status: Known intermittent issue, workaround required
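The inline workaround above assumes an exact trailing newline before the closing fence. A slightly more defensive variant, handling surrounding whitespace and a fence without a language tag, could look like this (stripCodeFence is an illustrative helper name, not an SDK API):

```typescript
// Strip an optional markdown code fence (with or without a "json" tag)
// from a model response before JSON.parse.
function stripCodeFence(raw: string): string {
  const trimmed = raw.trim();
  // Lazily capture everything between the opening and closing fences.
  const match = trimmed.match(/^```(?:json)?\s*([\s\S]*?)\s*```$/);
  return match ? match[1] : trimmed;
}

// Usage: const data = JSON.parse(stripCodeFence(response.text));
console.log(stripCodeFence('```json\n{"a":1}\n```')); // {"a":1}
```

Returning the trimmed input when no fence matches means the helper is safe to apply unconditionally, covering both the clean and the fenced cases.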
responseMimeType: "application/json"错误: 解析JSON响应时出现'responseMimeType: "application/json"JSON.parse()`失败。
SyntaxError: Unexpected token ' **来源**: [GitHub Issue #976](https://github.com/googleapis/js-genai/issues/976) **原因**: 当使用时,响应偶尔会包含包裹JSON的Markdown代码块反引号(`` ```json\n{...}\n``` ``),导致预防措施:
typescript
// 解析前去除Markdown代码块:
let jsonText = response.text.trim();
if (jsonText.startsWith('```json')) {
jsonText = jsonText.replace(/^```json\n/, '').replace(/\n```$/, '');
} else if (jsonText.startsWith('```')) {
jsonText = jsonText.replace(/^```\n/, '').replace(/\n```$/, '');
}
const data = JSON.parse(jsonText);
影响范围: 使用responseMimeType: "application/json"结构化输出的所有模型
状态: 已知间歇性问题,需要使用解决方法
Issue #11: Gemini 3 Temperature Below 1.0 Causes Looping/Degraded Reasoning
问题#11: Gemini 3的temperature低于1.0导致循环/推理质量下降
Error: Infinite loops or degraded reasoning quality on complex tasks
Source: Official Troubleshooting Docs
Why It Happens: Gemini 3 models are optimized for temperature 1.0. Lowering temperature below 1.0 may cause looping behavior or degraded performance on complex mathematical/reasoning tasks.
Prevention:
typescript
// ❌ WRONG - May cause issues with Gemini 3:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: 'Solve this complex math problem: ...',
config: {
temperature: 0.3 // May cause looping/degradation
}
});
// ✅ CORRECT - Keep default temperature:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: 'Solve this complex math problem: ...',
config: {
temperature: 1.0 // Recommended for Gemini 3
}
});
// Or omit temperature config entirely (uses default 1.0)
Affected: Gemini 3 series models
Status: Official recommendation, keep temperature at 1.0
错误: 复杂任务出现无限循环或推理质量下降
来源: 官方故障排查文档
原因: Gemini 3模型针对temperature 1.0进行优化。将temperature设置为1.0以下可能导致循环行为或复杂数学/推理任务的性能下降。
预防措施:
typescript
// ❌ 错误 - Gemini 3可能出现问题:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
temperature: 0.3 // 可能导致循环/质量下降
}
});
// ✅ 正确 - 保持默认temperature:
const response = await ai.models.generateContent({
model: 'gemini-3-flash',
contents: '解决这个复杂的数学问题: ...',
config: {
temperature: 1.0 // Gemini 3推荐值
}
});
// 或完全省略temperature配置(使用默认值1.0)
影响范围: Gemini 3系列模型
状态: 官方建议,保持temperature为1.0
Issue #12: Massive Rate Limit Reductions in December 2025 (Free Tier)
问题#12: 2025年12月免费层速率限制大幅降低
Error: Sudden 429 RESOURCE_EXHAUSTED errors after December 6, 2025
Source: LaoZhang AI Blog | HowToGeek
Why It Happens: Google reduced free tier rate limits by 80-90% without wide announcement, catching developers off guard.
Changes:
- Gemini 2.5 Pro: 80% reduction in daily requests (100 RPD, was ~250)
- Gemini 2.5 Flash: ~20 requests per day (was ~250) - 90% reduction
- Free tier now impractical for production
Prevention:
typescript
// For production, upgrade to paid tier:
// https://ai.google.dev/pricing
// For free tier, implement aggressive rate limiting:
const rateLimiter = {
requests: 0,
resetTime: Date.now() + 24 * 60 * 60 * 1000,
async checkLimit() {
if (Date.now() > this.resetTime) {
this.requests = 0;
this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
}
if (this.requests >= 20) {
throw new Error('Daily limit reached');
}
this.requests++;
}
};
await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
Affected: Free tier users (December 6, 2025 onwards)
Status: Permanent change, upgrade to paid tier for production
错误: 2025年12月6日后突然出现429 RESOURCE_EXHAUSTED错误
来源: LaoZhang AI Blog | HowToGeek
原因: Google在未广泛通知的情况下将免费层速率限制降低了80-90%,让开发者措手不及。
变化:
- Gemini 2.5 Pro: 每日请求数减少80%(从约250降至100 RPD)
- Gemini 2.5 Flash: 每日请求数减少90%(从约250降至约20 RPD)
- 免费层现在不适用于生产环境
预防措施:
typescript
// 生产环境请升级到付费层:
// https://ai.google.dev/pricing
// 免费层请实施严格的速率限制:
const rateLimiter = {
requests: 0,
resetTime: Date.now() + 24 * 60 * 60 * 1000,
async checkLimit() {
if (Date.now() > this.resetTime) {
this.requests = 0;
this.resetTime = Date.now() + 24 * 60 * 60 * 1000;
}
if (this.requests >= 20) {
throw new Error('已达到每日限制');
}
this.requests++;
}
};
await rateLimiter.checkLimit();
const response = await ai.models.generateContent({/* ... */});
影响范围: 2025年12月6日之后的免费层用户
状态: 永久变更,生产环境请升级到付费层
Issue #13: Preview Models Have No SLAs and Can Change Without Warning
问题#13: 预览模型无SLA,可能随时变更
Error: Unexpected behavior changes, deprecation, or service interruptions
Source: Arsturn Blog | Official docs
Why It Happens: Preview and experimental models (e.g., gemini-2.5-flash-preview, gemini-3-pro-preview) have no service level agreements (SLAs) and are inherently unstable. Google can change or deprecate them with little notice.
Prevention:
typescript
// ❌ WRONG - Using preview models in production:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-preview', // No SLA!
contents: 'Production traffic'
});
// ✅ CORRECT - Use GA (generally available) models:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // Stable, with SLA
contents: 'Production traffic'
});
// Or use specific version numbers for stability:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-001', // Pinned version
contents: 'Production traffic'
});
Affected: Users of preview/experimental models in production
Status: Known limitation, use GA models for production
错误: 意外的行为变更、弃用或服务中断
来源: Arsturn Blog | 官方文档
原因: 预览和实验模型(例如gemini-2.5-flash-preview, gemini-3-pro-preview)无服务级别协议(SLA),本质上不稳定。Google可能随时变更或弃用这些模型,且通知有限。
预防措施:
typescript
// ❌ 错误 - 生产环境使用预览模型:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-preview', // 无SLA!
contents: '生产流量'
});
// ✅ 正确 - 使用正式可用(GA)模型:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash', // 稳定,有SLA
contents: '生产流量'
});
// 或使用特定版本号以保证稳定性:
const response = await ai.models.generateContent({
model: 'gemini-2.5-flash-001', // 固定版本
contents: '生产流量'
});
影响范围: 在生产环境中使用预览/实验模型的用户
状态: 已知限制,生产环境请使用GA模型
Issue #14: API Key Leakage Auto-Blocking (Security Enhancement)
问题#14: API密钥泄露自动拦截(安全增强)
Error: "Invalid API key" after accidentally committing key to GitHub
Source: AI Free API Blog | Official troubleshooting
Why It Happens: Google proactively scans for publicly exposed API keys (e.g., in GitHub repos) and automatically blocks them from accessing the Gemini API as a security measure.
Prevention:
typescript
// Best practices:
// 1. Use .env files (never commit)
// 2. Use environment variables in production
// 3. Rotate keys if exposed
// 4. Use .gitignore:
// .gitignore
.env
.env.local
*.key
Affected: Users who accidentally commit API keys to public repos
Status: Security feature, rotate keys if exposed
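Beyond .gitignore, a fail-fast startup check keeps a missing or placeholder key from silently reaching request code. This is a sketch; the requireEnv helper and the 'YOUR_' placeholder convention are illustrative, not an official API:

```typescript
// Read a required secret from the environment, failing fast at startup
// instead of surfacing a confusing 401 later in a request path.
function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value || value.startsWith('YOUR_')) {
    throw new Error(
      `Missing or placeholder value for ${name}; ` +
      `set it in your environment, never in source control.`
    );
  }
  return value;
}

// Usage at process startup:
// const ai = new GoogleGenAI({ apiKey: requireEnv('GEMINI_API_KEY') });
```

Crashing at boot with a named variable is far easier to diagnose than an "Invalid API key" error appearing mid-traffic after a key was auto-blocked.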
错误: 意外将密钥提交到GitHub后提示"Invalid API key"
来源: AI Free API Blog | 官方故障排查
原因: Google主动扫描公开暴露的API密钥(例如GitHub仓库中的密钥),并自动阻止这些密钥访问Gemini API,作为安全措施。
预防措施:
typescript
// 最佳实践:
// 1. 使用.env文件(永远不要提交到仓库)
// 2. 生产环境使用环境变量
// 3. 如果密钥泄露,立即轮换
// 4. 使用.gitignore:
// .gitignore
.env
.env.local
*.key
影响范围: 意外将API密钥提交到公共仓库的用户
状态: 安全功能,密钥泄露后请立即轮换
Error Handling
错误处理
Common Errors
常见错误
1. Invalid API Key (401)
1. 无效API密钥(401)
typescript
{
error: {
code: 401,
message: 'API key not valid. Please pass a valid API key.',
status: 'UNAUTHENTICATED'
}
}
Solution: Verify the GEMINI_API_KEY environment variable is set correctly.
typescript
{
error: {
code: 401,
message: 'API key not valid. Please pass a valid API key.',
status: 'UNAUTHENTICATED'
}
}
解决方案: 验证GEMINI_API_KEY环境变量是否正确设置。
2. Rate Limit Exceeded (429)
2. 超出速率限制(429)
typescript
{
error: {
code: 429,
message: 'Resource has been exhausted (e.g. check quota).',
status: 'RESOURCE_EXHAUSTED'
}
}
Solution: Implement exponential backoff retry strategy.
typescript
{
error: {
code: 429,
message: 'Resource has been exhausted (e.g. check quota).',
status: 'RESOURCE_EXHAUSTED'
}
}
解决方案: 实现指数退避重试策略。
3. Model Not Found (404)
3. 模型未找到(404)
typescript
{
error: {
code: 404,
message: 'models/gemini-3.0-flash is not found',
status: 'NOT_FOUND'
}
}
Solution: Use correct model names: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
typescript
{
error: {
code: 404,
message: 'models/gemini-3.0-flash is not found',
status: 'NOT_FOUND'
}
}
解决方案: 使用正确的模型名称: gemini-2.5-pro, gemini-2.5-flash, gemini-2.5-flash-lite
4. Context Length Exceeded (400)
4. 超出上下文长度(400)
typescript
{
error: {
code: 400,
message: 'Request payload size exceeds the limit',
status: 'INVALID_ARGUMENT'
}
}
Solution: Reduce input size. Gemini 2.5 models support 1,048,576 input tokens max.
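Oversized payloads can also be caught before the request with a rough client-side pre-check. The 4-characters-per-token figure below is a common rule-of-thumb, not an official constant; for exact numbers the SDK's countTokens method on ai.models can be used instead:

```typescript
// Rough guard against oversized prompts. Assumes ~4 characters per token,
// which is only a heuristic; use the SDK's countTokens for exact counts.
const MAX_INPUT_TOKENS = 1_048_576; // Gemini 2.5 input limit

function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function fitsContextWindow(text: string, budget = MAX_INPUT_TOKENS): boolean {
  return estimateTokens(text) <= budget;
}

// Example: refuse locally instead of paying for a guaranteed 400.
if (!fitsContextWindow('some very long document...')) {
  throw new Error('Input too large; chunk or summarize before sending.');
}
```

Failing locally is cheaper than a round trip that is guaranteed to return INVALID_ARGUMENT, and the heuristic errs conservative enough for a first-pass gate.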
typescript
{
error: {
code: 400,
message: 'Request payload size exceeds the limit',
status: 'INVALID_ARGUMENT'
}
}
解决方案: 减小输入大小。Gemini 2.5模型最大支持1,048,576输入token。
Exponential Backoff Pattern
指数退避模式
typescript
async function generateWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.models.generateContent(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
typescript
async function generateWithRetry(request, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await ai.models.generateContent(request);
} catch (error) {
if (error.status === 429 && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // 1s, 2s, 4s
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
Rate Limits
速率限制
⚠️ December 2025 Update - Major Free Tier Reductions
⚠️ 2025年12月更新 - 免费层大幅缩减
CRITICAL: Google reduced free tier limits by 80-90% on December 6-7, 2025 without wide announcement. Free tier is now primarily for prototyping only.
Sources: LaoZhang AI | HowToGeek
重要提示: Google在2025年12月6-7日未广泛通知的情况下,将免费层限制降低了80-90%。免费层现在主要用于原型开发。
来源: LaoZhang AI | HowToGeek
Free Tier (Gemini API) - Current Limits
免费层(Gemini API)- 当前限制
Rate limits vary by model:
Gemini 2.5 Pro:
- Requests per minute: 5 RPM
- Tokens per minute: 125,000 TPM
- Requests per day: 100 RPD (was ~250 before Dec 2025) - 80% reduction
Gemini 2.5 Flash:
- Requests per minute: 10 RPM
- Tokens per minute: 250,000 TPM
- Requests per day: ~20 RPD (was ~250 before Dec 2025) - 90% reduction
Gemini 2.5 Flash-Lite:
- Requests per minute: 15 RPM
- Tokens per minute: 250,000 TPM
- Requests per day: 1,000 RPD (unchanged)
速率限制因模型而异:
Gemini 2.5 Pro:
- 每分钟请求数: 5 RPM
- 每分钟token数: 125,000 TPM
- 每日请求数: 100 RPD(2025年12月前约为250)- 减少80%
Gemini 2.5 Flash:
- 每分钟请求数: 10 RPM
- 每分钟token数: 250,000 TPM
- 每日请求数: 约20 RPD(2025年12月前约为250)- 减少90%
Gemini 2.5 Flash-Lite:
- 每分钟请求数: 15 RPM
- 每分钟token数: 250,000 TPM
- 每日请求数: 1,000 RPD(无变化)
Paid Tier (Tier 1)
付费层(Tier 1)
Requires billing account linked to your Google Cloud project.
Gemini 2.5 Pro:
- Requests per minute: 150 RPM
- Tokens per minute: 2,000,000 TPM
- Requests per day: 10,000 RPD
Gemini 2.5 Flash:
- Requests per minute: 1,000 RPM
- Tokens per minute: 1,000,000 TPM
- Requests per day: 10,000 RPD
Gemini 2.5 Flash-Lite:
- Requests per minute: 4,000 RPM
- Tokens per minute: 4,000,000 TPM
- Requests per day: Not specified
需要将结算账户链接到您的Google Cloud项目。
Gemini 2.5 Pro:
- 每分钟请求数: 150 RPM
- 每分钟token数: 2,000,000 TPM
- 每日请求数: 10,000 RPD
Gemini 2.5 Flash:
- 每分钟请求数: 1,000 RPM
- 每分钟token数: 1,000,000 TPM
- 每日请求数: 10,000 RPD
Gemini 2.5 Flash-Lite:
- 每分钟请求数: 4,000 RPM
- 每分钟token数: 4,000,000 TPM
- 每日请求数: 未指定
Higher Tiers (Tier 2 & 3)
更高层级(Tier 2 & 3)
Tier 2 (requires $250+ spending and 30-day wait):
- Even higher limits available
Tier 3 (requires $1,000+ spending and 30-day wait):
- Maximum limits available
Tips:
- Implement rate limit handling with exponential backoff
- Use batch processing for high-volume tasks
- Monitor usage in Google AI Studio
- Choose the right model based on your rate limit needs
- Official rate limits: https://ai.google.dev/gemini-api/docs/rate-limits
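The first tip above can be sketched as follows. `withRetry` is a hypothetical wrapper around your own call to `ai.models.generateContent` (the `status: 429` error shape is an assumption about your HTTP layer, not an SDK guarantee); the backoff math is the standard base-times-2^n schedule with a cap plus jitter:

```typescript
// Delay before retry attempt `n` (0-based): baseMs * 2^n, capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 32_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Retry wrapper for rate-limited calls. `fn` stands in for your own
// Gemini request function; adjust the error check to match how your
// client surfaces HTTP status codes.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Give up on non-429 errors or once attempts are exhausted.
      if (err?.status !== 429 || attempt + 1 >= maxAttempts) throw err;
      const jitterMs = Math.random() * 250; // desynchronize concurrent clients
      await new Promise((r) =>
        setTimeout(r, backoffDelayMs(attempt, baseMs) + jitterMs)
      );
    }
  }
}
```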
SDK Migration Guide
From @google/generative-ai to @google/genai
1. Update Package
```bash
# Remove deprecated SDK
npm uninstall @google/generative-ai

# Install current SDK
npm install @google/genai@1.35.0
```
2. Update Imports
Old (DEPRECATED):
```typescript
import { GoogleGenerativeAI } from '@google/generative-ai';
const genAI = new GoogleGenerativeAI(apiKey);
const model = genAI.getGenerativeModel({ model: 'gemini-2.5-flash' });
```
New (CURRENT):
```typescript
import { GoogleGenAI } from '@google/genai';
const ai = new GoogleGenAI({ apiKey });
// Use ai.models.generateContent() directly
```
3. Update API Calls
Old:
```typescript
const result = await model.generateContent(prompt);
const response = await result.response;
const text = response.text();
```
New:
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: prompt
});
const text = response.text;
```
4. Update Streaming
Old:
```typescript
const result = await model.generateContentStream(prompt);
for await (const chunk of result.stream) {
  console.log(chunk.text());
}
```
New:
```typescript
const response = await ai.models.generateContentStream({
  model: 'gemini-2.5-flash',
  contents: prompt
});
for await (const chunk of response) {
  console.log(chunk.text);
}
```
5. Update Chat
Old:
```typescript
const chat = model.startChat();
const result = await chat.sendMessage(message);
const response = await result.response;
```
New:
```typescript
const chat = ai.chats.create({ model: 'gemini-2.5-flash' });
const response = await chat.sendMessage({ message });
// response.text is directly available
```
Production Best Practices
1. Always Do
✅ Use @google/genai (NOT @google/generative-ai)
✅ Set maxOutputTokens to prevent excessive generation
✅ Implement rate limit handling with exponential backoff
✅ Use environment variables for API keys (never hardcode)
✅ Validate inputs before sending to API (save costs)
✅ Use streaming for better UX on long responses
✅ Choose the right model based on your needs (Pro for complex reasoning, Flash for balance, Flash-Lite for speed)
✅ Handle errors gracefully with try-catch
✅ Monitor token usage for cost control
✅ Use correct model names: gemini-2.5-pro/flash/flash-lite
2. Never Do
❌ Never use @google/generative-ai (deprecated!)
❌ Never hardcode API keys in code
❌ Never claim 2M context for Gemini 2.5 (it's 1,048,576 input tokens)
❌ Never expose API keys in client-side code
❌ Never skip error handling (always try-catch)
❌ Never use generic rate limits (each model has different limits - check official docs)
❌ Never send PII without user consent
❌ Never trust user input without validation
❌ Never ignore rate limits (will get 429 errors)
❌ Never use old model names like gemini-1.5-pro (use 2.5 models)
3. Security
- API Key Storage: Use environment variables or secret managers
- Server-Side Only: Never expose API keys in browser JavaScript
- Input Validation: Sanitize all user inputs before API calls
- Rate Limiting: Implement your own rate limits to prevent abuse
- Error Messages: Don't expose API keys or sensitive data in error logs
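The input-validation point can be sketched as a small pre-flight check; `validatePrompt` and `MAX_PROMPT_CHARS` are illustrative names and limits (tune the budget per model), not SDK APIs:

```typescript
// Illustrative character budget; tune per model and use case.
const MAX_PROMPT_CHARS = 30_000;

// Reject empty or oversized prompts before they reach the API,
// so malformed input never costs tokens.
function validatePrompt(raw: string): string {
  const prompt = raw.trim();
  if (prompt.length === 0) throw new Error('Prompt must not be empty');
  if (prompt.length > MAX_PROMPT_CHARS) {
    throw new Error(`Prompt exceeds ${MAX_PROMPT_CHARS} characters`);
  }
  // Strip control characters (except tab/newline/CR) that occasionally
  // sneak in from copy-paste.
  return prompt.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, '');
}
```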
4. Cost Optimization
- Choose Right Model: Use Flash for most tasks, Pro only when needed
- Set Token Limits: Use maxOutputTokens to control costs
- Batch Requests: Process multiple items efficiently
- Cache Results: Store responses when appropriate
- Monitor Usage: Track token consumption in Google Cloud Console
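The "Cache Results" point can be sketched as a map keyed by model and prompt. `generate` below stands in for whatever wrapper you already have around `ai.models.generateContent`; nothing here is an SDK API, and a production cache would also need eviction and a size bound:

```typescript
// In-memory response cache: identical (model, prompt) pairs are served
// from memory instead of re-billing the API.
const cache = new Map<string, string>();

async function cachedGenerate(
  model: string,
  prompt: string,
  generate: (model: string, prompt: string) => Promise<string>
): Promise<string> {
  const key = `${model}\u0000${prompt}`; // NUL separator avoids collisions
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // no API call, no token cost
  const text = await generate(model, prompt);
  cache.set(key, text);
  return text;
}
```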
5. Performance
- Use Streaming: Better perceived latency for long responses
- Parallel Requests: Use Promise.all() for independent calls
- Edge Deployment: Deploy to Cloudflare Workers for low latency
- Connection Pooling: Reuse HTTP connections when possible
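One way to reconcile "Parallel Requests" with the rate limits above is a concurrency-capped map instead of a bare `Promise.all()`. `mapLimit` is a generic helper sketch, not an SDK function:

```typescript
// Run `fn` over `items` with at most `limit` calls in flight at once,
// preserving input order in the results.
async function mapLimit<T, R>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; single-threaded JS makes next++ safe
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, worker)
  );
  return results;
}
```

With `limit` set below your model's RPM budget, independent Gemini calls overlap without bursting past the quota.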
Quick Reference
Installation
```bash
npm install @google/genai@1.35.0
```
Environment
```bash
export GEMINI_API_KEY="..."
```
Models (2025-2026)
- gemini-3-flash - (1,048,576 in / 65,536 out) - NEW: Best speed+quality balance
- gemini-2.5-pro - (1,048,576 in / 65,536 out) - Best for complex reasoning
- gemini-2.5-flash - (1,048,576 in / 65,536 out) - Proven price-performance balance
- gemini-2.5-flash-lite - (1,048,576 in / 65,536 out) - Fastest, most cost-effective
Basic Generation
```typescript
const response = await ai.models.generateContent({
  model: 'gemini-2.5-flash',
  contents: 'Your prompt here'
});
console.log(response.text);
```
Streaming
```typescript
const response = await ai.models.generateContentStream({...});
for await (const chunk of response) {
  console.log(chunk.text);
}
```
Multimodal
```typescript
contents: [
  {
    parts: [
      { text: 'What is this?' },
      { inlineData: { data: base64Image, mimeType: 'image/jpeg' } }
    ]
  }
]
```
Function Calling
```typescript
config: {
  tools: [{ functionDeclarations: [...] }]
}
```
Last Updated: 2026-01-21
Production Validated: All features tested with @google/genai@1.35.0
Phase: 2 Complete ✅ (All Core + Advanced Features)
Known Issues: 14 documented errors prevented
Changes: Added Known Issues Prevention section with 14 community-researched findings from the post-training-cutoff period (May 2025-Jan 2026)