Gemini 3 Pro API Integration

Comprehensive guide for integrating Google's Gemini 3 Pro API/SDK into your applications. Covers setup, authentication, text generation, advanced reasoning with dynamic thinking, chat applications, streaming responses, and production deployment patterns.

Overview

Gemini 3 Pro (gemini-3-pro-preview) is Google's most intelligent model, designed for complex tasks that require advanced reasoning and broad world knowledge. This skill provides complete workflows for API integration using the Python or Node.js SDKs.

Key Capabilities

  • Massive Context: 1M token input, 64k token output
  • Dynamic Thinking: Adaptive reasoning with high/low modes
  • Streaming: Real-time token delivery
  • Chat: Multi-turn conversations with history
  • Production-Ready: Error handling, retry logic, cost optimization

When to Use This Skill

  • Setting up Gemini 3 Pro API access
  • Building text generation applications
  • Implementing chat applications with reasoning
  • Configuring advanced thinking modes
  • Deploying production Gemini applications
  • Optimizing API usage and costs

Quick Start

Prerequisites

  • A Gemini API key from Google AI Studio
  • A recent Python or Node.js runtime with the corresponding SDK installed

Python Quick Start

python
# Install SDK
pip install google-generativeai

# Basic usage
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")
model = genai.GenerativeModel("gemini-3-pro-preview")
response = model.generate_content("Explain quantum computing")
print(response.text)

Node.js Quick Start

typescript
// Install SDK
npm install @google/generative-ai

// Basic usage
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI("YOUR_API_KEY");
const model = genAI.getGenerativeModel({ model: "gemini-3-pro-preview" });

const result = await model.generateContent("Explain quantum computing");
console.log(result.response.text());

Core Workflows

Workflow 1: Quick Start Setup

Goal: Get from zero to first successful API call in < 5 minutes.
Steps:
  1. Get API Key
    • Visit Google AI Studio
    • Create or select project
    • Generate API key
    • Copy key securely
  2. Install SDK
    bash
    # Python
    pip install google-generativeai
    
    # Node.js
    npm install @google/generative-ai
  3. Configure Authentication
    python
    # Python - using environment variable (recommended)
    import os
    import google.generativeai as genai
    
    genai.configure(api_key=os.getenv("GEMINI_API_KEY"))
    typescript
    // Node.js - using environment variable (recommended)
    const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
  4. Make First API Call
    python
    # Python
    model = genai.GenerativeModel("gemini-3-pro-preview")
    response = model.generate_content("Write a haiku about coding")
    print(response.text)
  5. Verify Success
    • Check response received
    • Verify text output
    • Note token usage
    • Confirm API key working
Expected Outcome: Working API integration in under 5 minutes.
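The verification in step 5 most often fails because the key is missing from the environment. A tiny fail-fast guard catches that before the first call; the require_api_key helper below is an illustrative sketch, not part of the SDK:

```python
import os

def require_api_key(env_var: str = "GEMINI_API_KEY") -> str:
    """Fail fast with an actionable message when the key is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; create a key in Google AI Studio "
            "and export it before running."
        )
    return key

os.environ["GEMINI_API_KEY"] = "demo-key"  # simulate a configured shell
print(require_api_key())  # → demo-key
```

Calling this once at startup turns a confusing authentication error deep in the SDK into a clear configuration message.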

Workflow 2: Chat Application Development

Goal: Build a production-ready chat application with conversation history and streaming.
Steps:
  1. Initialize Chat Model
    python
    # Python
    model = genai.GenerativeModel(
        "gemini-3-pro-preview",
        generation_config={
            "thinking_level": "high",  # Dynamic reasoning
            "temperature": 1.0,  # Keep at 1.0 for best results
            "max_output_tokens": 8192
        }
    )
  2. Start Chat Session
    python
    chat = model.start_chat(history=[])
  3. Send Message with Streaming
    python
    response = chat.send_message(
        "Explain how neural networks learn",
        stream=True
    )
    
    # Stream tokens in real-time
    for chunk in response:
        print(chunk.text, end="", flush=True)
  4. Manage Conversation History
    python
    # History is automatically maintained
    # Access it anytime
    print(f"Conversation turns: {len(chat.history)}")
    
    # Continue conversation
    response = chat.send_message("Can you give an example?")
  5. Handle Thought Signatures
    • SDKs handle automatically in standard chat flows
    • No manual intervention needed for basic use
    • See references/thought-signatures.md for advanced cases
  6. Implement Error Handling
    python
    from google.api_core import retry, exceptions
    
    @retry.Retry(predicate=retry.if_exception_type(
        exceptions.ResourceExhausted,
        exceptions.ServiceUnavailable
    ))
    def send_with_retry(chat, message):
        return chat.send_message(message)
    
    try:
        response = send_with_retry(chat, user_input)
    except exceptions.GoogleAPIError as e:
        print(f"API error: {e}")
Expected Outcome: Production-ready chat application with streaming, history, and error handling.
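For long-running chats, step 4's history management usually needs a trimming policy so the context stays bounded. A minimal sketch, where the Turn type and trim_history helper are illustrative rather than SDK types:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    role: str  # "user" or "model"
    text: str

def trim_history(history: list[Turn], max_turns: int = 20) -> list[Turn]:
    """Keep only the most recent turns, preserving user/model pairing."""
    if len(history) <= max_turns:
        return history
    trimmed = history[-max_turns:]
    # Avoid starting the window on a dangling model reply
    if trimmed and trimmed[0].role == "model":
        trimmed = trimmed[1:]
    return trimmed

# 30 alternating turns: user, model, user, ...
history = [Turn("user", f"q{i}") if i % 2 == 0 else Turn("model", f"a{i}")
           for i in range(30)]
print(len(trim_history(history, max_turns=10)))  # → 10
```

In a real application the trimmed list would be passed back as the chat's starting history; older turns can be summarized instead of dropped if continuity matters.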

Workflow 3: Production Deployment

Goal: Deploy Gemini 3 Pro integration with monitoring, cost control, and reliability.
Steps:
  1. Setup Authentication (Production)
    python
    # Use environment variables (never hardcode keys)
    import os
    
    # Option 1: Environment variable
    api_key = os.getenv("GEMINI_API_KEY")
    
    # Option 2: Secrets manager (recommended for production)
    # Use Google Secret Manager, AWS Secrets Manager, etc.
  2. Configure Production Settings
    python
    model = genai.GenerativeModel(
        "gemini-3-pro-preview",
        generation_config={
            "thinking_level": "high",  # or "low" for simple tasks
            "temperature": 1.0,  # CRITICAL: Keep at 1.0
            "max_output_tokens": 4096,
            "top_p": 0.95,
            "top_k": 40
        },
        safety_settings={
            # Configure content filtering as needed
        }
    )
  3. Implement Comprehensive Error Handling
    python
    from google.api_core import exceptions, retry
    import logging
    
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger(__name__)
    
    def generate_with_fallback(prompt):
        @retry.Retry(
            predicate=retry.if_exception_type(
                exceptions.ResourceExhausted,
                exceptions.ServiceUnavailable,
                exceptions.DeadlineExceeded
            ),
            initial=1.0,
            maximum=10.0,
            multiplier=2.0,
            deadline=60.0
        )
        def _generate():
            return model.generate_content(prompt)
    
        try:
            return _generate()
        except exceptions.InvalidArgument as e:
            logger.error(f"Invalid argument: {e}")
            raise
        except exceptions.PermissionDenied as e:
            logger.error(f"Permission denied: {e}")
            raise
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
            # Fallback to simpler model or cached response
            return None
  4. Monitor Usage and Costs
    python
    def log_usage(response):
        usage = response.usage_metadata
        logger.info(f"Tokens - Input: {usage.prompt_token_count}, "
                    f"Output: {usage.candidates_token_count}, "
                    f"Total: {usage.total_token_count}")
    
        # Estimate cost (for prompts ≤200k tokens)
        input_cost = (usage.prompt_token_count / 1_000_000) * 2.00
        output_cost = (usage.candidates_token_count / 1_000_000) * 12.00
        total_cost = input_cost + output_cost
    
        logger.info(f"Estimated cost: ${total_cost:.6f}")
    
    response = model.generate_content(prompt)
    log_usage(response)
  5. Implement Rate Limiting
    python
    import time
    from collections import deque
    
    class RateLimiter:
        def __init__(self, max_requests_per_minute=60):
            self.max_rpm = max_requests_per_minute
            self.requests = deque()
    
        def wait_if_needed(self):
            now = time.time()
            # Remove requests older than 1 minute
            while self.requests and self.requests[0] < now - 60:
                self.requests.popleft()
    
            # Check if at limit
            if len(self.requests) >= self.max_rpm:
                sleep_time = 60 - (now - self.requests[0])
                if sleep_time > 0:
                    time.sleep(sleep_time)
    
            self.requests.append(now)
    
    limiter = RateLimiter(max_requests_per_minute=60)
    
    def generate_with_rate_limit(prompt):
        limiter.wait_if_needed()
        return model.generate_content(prompt)
  6. Setup Logging and Monitoring
    python
    import logging
    from datetime import datetime
    
    # Configure logging
    logging.basicConfig(
        level=logging.INFO,
        format='%(asctime)s - %(name)s - %(levelname)s - %(message)s',
        handlers=[
            logging.FileHandler('gemini_api.log'),
            logging.StreamHandler()
        ]
    )
    
    logger = logging.getLogger(__name__)
    
    def monitored_generate(prompt):
        start_time = datetime.now()
        try:
            response = model.generate_content(prompt)
            duration = (datetime.now() - start_time).total_seconds()
    
            logger.info(f"Success - Duration: {duration}s, "
                        f"Tokens: {response.usage_metadata.total_token_count}")
            return response
        except Exception as e:
            duration = (datetime.now() - start_time).total_seconds()
            logger.error(f"Failed - Duration: {duration}s, Error: {e}")
            raise
Expected Outcome: Production-ready deployment with monitoring, cost control, error handling, and rate limiting.

Thinking Levels

Dynamic Thinking System

Gemini 3 Pro introduces thinking_level to control reasoning depth:

thinking_level: "high" (default)
  • Maximum reasoning depth
  • Best quality for complex tasks
  • Slower first-token response
  • Higher cost
  • Use for: Complex reasoning, coding, analysis, research

thinking_level: "low"
  • Minimal reasoning overhead
  • Faster response
  • Lower cost
  • Simpler output
  • Use for: Simple questions, factual answers, quick queries
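When an application serves mixed traffic, the guidance above can be mechanized into a routing rule. The classification heuristic below is purely illustrative; real systems would use a better signal than prefix matching:

```python
def pick_thinking_level(task: str) -> str:
    """Route simple factual lookups to 'low' and everything else to 'high'."""
    # Hypothetical heuristic: common factual-question openers
    simple_markers = ("what is", "define", "when was", "who is")
    if task.lower().startswith(simple_markers):
        return "low"
    return "high"

print(pick_thinking_level("What is the capital of France?"))       # → low
print(pick_thinking_level("Refactor this module for testability"))  # → high
```

The returned string can be dropped straight into generation_config as the thinking_level value.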

Configuration

python
# Python
model = genai.GenerativeModel(
    "gemini-3-pro-preview",
    generation_config={
        "thinking_level": "high"  # or "low"
    }
)

typescript
// Node.js
const model = genAI.getGenerativeModel({
  model: "gemini-3-pro-preview",
  generationConfig: {
    thinking_level: "high"  // or "low"
  }
});

Critical Notes

⚠️ Temperature MUST stay at 1.0 - Changing temperature can cause looping or degraded performance on complex reasoning tasks.
⚠️ Cannot combine thinking_level with the legacy thinking_budget parameter.
See references/thinking-levels.md for a detailed guide.

Streaming Responses

Python Streaming

python
response = model.generate_content(
    "Write a long article about AI",
    stream=True
)

for chunk in response:
    print(chunk.text, end="", flush=True)

Node.js Streaming

typescript
const result = await model.generateContentStream("Write a long article about AI");

for await (const chunk of result.stream) {
    process.stdout.write(chunk.text());
}

Benefits

  • Lower perceived latency
  • Real-time user feedback
  • Better UX for long responses
  • Can process tokens as they arrive
See references/streaming.md for advanced patterns.
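The last benefit, processing tokens as they arrive, works the same way regardless of SDK. In this sketch, fake_stream is an illustrative stand-in for the chunk iterator the API returns when streaming is enabled:

```python
def fake_stream():
    # Stand-in for the chunk iterator returned with streaming enabled
    for piece in ["Gemini ", "3 ", "Pro"]:
        yield piece

def consume(stream, on_chunk=print):
    """Accumulate the full text while handling each chunk as it arrives."""
    parts = []
    for chunk in stream:
        on_chunk(chunk)   # e.g. push to the UI immediately
        parts.append(chunk)
    return "".join(parts)

full = consume(fake_stream(), on_chunk=lambda c: None)
print(full)  # → Gemini 3 Pro
```

The same consume shape applies to the real iterator: replace fake_stream() with the streaming response and read each chunk's text.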

Cost Optimization

Pricing (Gemini 3 Pro)

| Context Size | Input | Output |
|---|---|---|
| ≤ 200k tokens | $2 / 1M | $12 / 1M |
| > 200k tokens | $4 / 1M | $18 / 1M |

Optimization Strategies

  1. Keep prompts under 200k tokens (50% cheaper)
  2. Use thinking_level: "low" for simple tasks (faster, lower cost)
  3. Implement context caching for reusable contexts (see the gemini-3-advanced skill)
  4. Monitor token usage and set budgets
  5. Use Gemini 1.5 Flash for simple tasks (20x cheaper)

See references/best-practices.md for comprehensive cost optimization.
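The tiered rates from the pricing table fold into a small estimator; the estimate_cost helper is an illustrative sketch using those published per-million-token prices:

```python
def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate Gemini 3 Pro request cost in USD using the tiered rates."""
    # Prompts over 200k tokens are billed at the higher tier
    if input_tokens <= 200_000:
        input_rate, output_rate = 2.00, 12.00   # $ per 1M tokens
    else:
        input_rate, output_rate = 4.00, 18.00
    return (input_tokens / 1_000_000) * input_rate + \
           (output_tokens / 1_000_000) * output_rate

# 100k in / 2k out lands in the cheaper tier
print(round(estimate_cost(100_000, 2_000), 4))  # → 0.224
```

Wiring this into the usage-logging step of Workflow 3 gives a running cost figure per request, which makes budget alerts straightforward.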

Model Selection

Gemini 3 Pro vs Other Models

| Model | Context | Output | Input Price | Best For |
|---|---|---|---|---|
| gemini-3-pro-preview | 1M | 64k | $2-4 / 1M | Complex reasoning, coding |
| gemini-1.5-pro | 1M | 8k | $7-14 / 1M | General use, multimodal |
| gemini-1.5-flash | 1M | 8k | $0.35-0.70 / 1M | Simple tasks, cost-sensitive |

When to Use Gemini 3 Pro

✅ Complex reasoning tasks
✅ Advanced coding problems
✅ Long-context analysis (up to 1M tokens)
✅ Large output requirements (up to 64k tokens)
✅ Tasks requiring dynamic thinking

When to Use Alternatives

  • Gemini 1.5 Flash: Simple tasks, cost-sensitive applications
  • Gemini 1.5 Pro: Multimodal tasks, general use
  • Gemini 2.5 models: Experimental features, specific capabilities

Error Handling

Common Errors

| Error | Cause | Solution |
|---|---|---|
| ResourceExhausted | Rate limit exceeded | Implement retry with backoff |
| InvalidArgument | Invalid parameters | Validate input, check docs |
| PermissionDenied | Invalid API key | Check authentication |
| DeadlineExceeded | Request timeout | Reduce context, retry |
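The "retry with backoff" remedy in the table can be sketched independently of the SDK. Here TransientError and the injectable sleep are illustrative stand-ins for retryable API errors and real waiting:

```python
import random
import time

class TransientError(Exception):
    """Stand-in for retryable errors such as ResourceExhausted."""

def with_backoff(fn, retries=4, base=1.0, cap=30.0, sleep=time.sleep):
    """Call fn(), retrying transient failures with jittered exponential backoff."""
    for attempt in range(retries):
        try:
            return fn()
        except TransientError:
            if attempt == retries - 1:
                raise  # out of retries: surface the error
            delay = min(cap, base * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))  # jitter spreads out retries

calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError
    return "ok"

print(with_backoff(flaky, sleep=lambda s: None))  # → ok
```

google.api_core's retry.Retry (used below) implements the same pattern; this version is only meant to show what the decorator does under the hood.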

Production Error Handling

python
import logging

from google.api_core import exceptions, retry

logger = logging.getLogger(__name__)

@retry.Retry(
    predicate=retry.if_exception_type(
        exceptions.ResourceExhausted,
        exceptions.ServiceUnavailable
    ),
    initial=1.0,
    maximum=60.0,
    multiplier=2.0
)
def safe_generate(prompt):
    try:
        return model.generate_content(prompt)
    except exceptions.InvalidArgument as e:
        logger.error(f"Invalid argument: {e}")
        raise
    except exceptions.PermissionDenied as e:
        logger.error(f"Permission denied - check API key: {e}")
        raise
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise
See references/error-handling.md for comprehensive patterns.

References

Setup & Configuration
  • Setup Guide - Installation, authentication, configuration
  • Best Practices - Optimization, cost control, tips
Features
  • Text Generation - Detailed text generation patterns
  • Chat Patterns - Chat conversation management
  • Thinking Levels - Dynamic thinking system guide
  • Streaming - Streaming response patterns
Production
  • Error Handling - Error handling and retry strategies
Official Resources

Next Steps

After Basic Setup

  1. Explore chat applications - Build conversational interfaces
  2. Add multimodal capabilities - Use the gemini-3-multimodal skill
  3. Add image generation - Use the gemini-3-image-generation skill
  4. Add advanced features - Use the gemini-3-advanced skill (caching, tools, batch)

Common Integration Patterns

  • Simple Chatbot: This skill only
  • Multimodal Assistant: This skill + gemini-3-multimodal
  • Creative Bot: This skill + gemini-3-image-generation
  • Production App: All 4 Gemini 3 skills

Troubleshooting

Issue: API key not working

Solution: Verify the API key in Google AI Studio and check the environment variable.

Issue: Rate limit errors

Solution: Implement rate limiting, upgrade to a paid tier, or reduce request frequency.

Issue: Slow responses

Solution: Use thinking_level: "low" for simple tasks, enable streaming, and reduce context size.

Issue: High costs

Solution: Keep prompts under 200k tokens, use the appropriate thinking level, and consider Gemini 1.5 Flash for simple tasks.

Issue: Temperature warnings

Solution: Keep temperature at 1.0 (the default); do not modify it for complex reasoning tasks.

Summary

This skill provides everything needed to integrate Gemini 3 Pro API into your applications:
✅ Quick setup (< 5 minutes)
✅ Production-ready chat applications
✅ Dynamic thinking configuration
✅ Streaming responses
✅ Error handling and retry logic
✅ Cost optimization strategies
✅ Monitoring and logging patterns
For multimodal, image generation, and advanced features, see the companion skills.
Ready to build? Start with Workflow 1: Quick Start Setup above!