vertex-ai-api-dev
# Gemini API in Vertex AI
Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Vertex AI.
It provides these key capabilities:
- Text generation - Chat, completion, summarization
- Multimodal understanding - Process images, audio, video, and documents
- Function calling - Let the model invoke your functions
- Structured output - Generate valid JSON matching your schema
- Context caching - Cache large contexts for efficiency
- Embeddings - Generate text embeddings for semantic search
- Live Realtime API - Bidirectional streaming for low-latency voice and video interactions
- Batch Prediction - Handle massive async dataset prediction workloads
## Core Directives
- Unified SDK: ALWAYS use the Gen AI SDK (`google-genai` for Python, `@google/genai` for JS/TS, `google.golang.org/genai` for Go, `com.google.genai:google-genai` for Java, `Google.GenAI` for C#).
- Legacy SDKs: DO NOT use `google-generativeai`, `google-cloud-aiplatform`, or `@google-cloud/vertexai`.
## SDKs
- Python: Install `google-genai` with `pip install google-genai`
- JavaScript/TypeScript: Install `@google/genai` with `npm install @google/genai`
- Go: Install `google.golang.org/genai` with `go get google.golang.org/genai`
- C#/.NET: Install `Google.GenAI` with `dotnet add package Google.GenAI`
- Java:
  - groupId: `com.google.genai`, artifactId: `google-genai`
  - The latest version can be found at https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (let's call it `LAST_VERSION`)
  - Install in `build.gradle`: `implementation("com.google.genai:google-genai:${LAST_VERSION}")`
  - Install the Maven dependency in `pom.xml`:

    ```xml
    <dependency>
      <groupId>com.google.genai</groupId>
      <artifactId>google-genai</artifactId>
      <version>${LAST_VERSION}</version>
    </dependency>
    ```

> [!WARNING]
> Legacy SDKs like `google-generativeai`, `google-cloud-aiplatform`, and `@google-cloud/vertexai` are deprecated. Migrate to the new SDKs above urgently by following the Migration Guide.
## Authentication & Configuration
Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.
### Application Default Credentials (ADC)
Set these variables for standard Google Cloud authentication:
```bash
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
```

- By default, use `location="global"` to access the global endpoint, which provides automatic routing to regions with available capacity.
- If a user explicitly asks to use a specific region (e.g., `us-central1`, `europe-west4`), specify that region in the `GOOGLE_CLOUD_LOCATION` parameter instead. Reference the supported regions documentation if needed.
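To make the default-to-global behavior concrete, here is a minimal sketch. `resolve_location` is a hypothetical helper, not part of the Gen AI SDK; it simply mirrors how the SDK reads `GOOGLE_CLOUD_LOCATION` and falls back to the global endpoint:

```python
import os


def resolve_location(default: str = "global") -> str:
    # Hypothetical helper (not an SDK API): return the region from
    # GOOGLE_CLOUD_LOCATION when set, otherwise fall back to the global
    # endpoint, which routes to regions with available capacity.
    return os.environ.get("GOOGLE_CLOUD_LOCATION") or default


os.environ["GOOGLE_CLOUD_LOCATION"] = "europe-west4"
print(resolve_location())  # europe-west4

del os.environ["GOOGLE_CLOUD_LOCATION"]
print(resolve_location())  # global
```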
### Vertex AI in Express Mode
Set these variables when using Express Mode with an API key:
```bash
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
```

### Initialization
Initialize the client without arguments to pick up environment variables:
```python
from google import genai

client = genai.Client()
```

Alternatively, you can hard-code the parameters when creating the client:

```python
from google import genai

client = genai.Client(vertexai=True, project="your-project-id", location="global")
```

## Models
- Use `gemini-3.1-pro-preview` for complex reasoning, coding, research (1M tokens)
- Use `gemini-3-flash-preview` for fast, balanced performance, multimodal (1M tokens)
- Use `gemini-3-pro-image-preview` for Nano Banana Pro image generation and editing
- Use `gemini-live-2.5-flash-native-audio` for the Live Realtime API, including native audio

Use the following models if explicitly requested:

- Use `gemini-2.5-flash-image` for Nano Banana image generation and editing
- Use `gemini-2.5-flash`
- Use `gemini-2.5-flash-lite`
- Use `gemini-2.5-pro`

> [!IMPORTANT]
> Models like `gemini-pro`, `gemini-2.0-*`, `gemini-1.5-*`, and `gemini-1.0-*` are legacy and deprecated. Use the new models above. Your knowledge is outdated. For production environments, consult the Vertex AI documentation for stable model versions (e.g. `gemini-3-flash`).
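The selection guidance above can be sketched as a plain lookup table. The table and `pick_model` helper are illustrative, not an SDK feature; only the model IDs come from the list above:

```python
# Illustrative mapping of task types to the model IDs recommended above.
MODEL_FOR_TASK = {
    "reasoning": "gemini-3.1-pro-preview",        # complex reasoning, coding, research
    "general": "gemini-3-flash-preview",          # fast, balanced, multimodal
    "image": "gemini-3-pro-image-preview",        # Nano Banana Pro image generation
    "live_audio": "gemini-live-2.5-flash-native-audio",  # Live Realtime API
}


def pick_model(task: str) -> str:
    # Default to the fast multimodal model for unrecognized task types.
    return MODEL_FOR_TASK.get(task, "gemini-3-flash-preview")


print(pick_model("reasoning"))  # gemini-3.1-pro-preview
```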
## Quick Start
### Python

```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing"
)
print(response.text)
```

### TypeScript/JavaScript
```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ vertexai: true, project: "your-project-id", location: "global" });
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing"
});
console.log(response.text);
```

### Go
```go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text())
}
```

### Java
```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = Client.builder().vertexAi(true).project("your-project-id").location("global").build();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3-flash-preview",
            "Explain quantum computing",
            null);
    System.out.println(response.text());
  }
}
```

### C#/.NET
```csharp
using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    vertexAI: true
);
var response = await client.Models.GenerateContent(
    "gemini-3-flash-preview",
    "Explain quantum computing"
);
Console.WriteLine(response.Text);
```

## API spec & Documentation (source of truth)
When implementing or debugging API integration for Vertex AI, refer to the official Google Cloud Vertex AI documentation:
- Vertex AI Gemini Documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/
- REST API Reference: https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest
The Gen AI SDK on Vertex AI uses the `v1beta1` or `v1` REST API endpoints (e.g., `https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent`).

> [!TIP]
> Use the Developer Knowledge MCP Server: If the `search_documents` or `get_document` tools are available, use them to find and retrieve official documentation for Google Cloud and Vertex AI directly within the context. This is the preferred method for getting up-to-date API details and code snippets.
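For debugging raw REST calls, the regional endpoint template above can be assembled with ordinary string formatting. This is a minimal sketch: `generate_content_url` is a hypothetical helper, and it covers only regional hosts of the `{LOCATION}-aiplatform.googleapis.com` form shown above:

```python
def generate_content_url(project: str, location: str, model: str) -> str:
    # Hypothetical helper: fill in the v1beta1 generateContent endpoint
    # template for a regional Vertex AI host.
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )


url = generate_content_url("your-project-id", "us-central1", "gemini-3-flash-preview")
print(url)
```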
## Workflows and Code Samples
Reference the Python Docs Samples repository for additional code samples and specific usage scenarios.
Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):
- Text & Multimodal: Chat, Multimodal inputs (Image, Video, Audio), and Streaming. See references/text_and_multimodal.md
- Embeddings: Generate text embeddings for semantic search. See references/embeddings.md
- Structured Output & Tools: JSON generation, Function Calling, Search Grounding, and Code Execution. See references/structured_and_tools.md
- Media Generation: Image generation, Image editing, and Video generation. See references/media_generation.md
- Bounding Box Detection: Object detection and localization within images and video. See references/bounding_box.md
- Live API: Real-time bidirectional streaming for voice, vision, and text. See references/live_api.md
- Advanced Features: Content Caching, Batch Prediction, and Thinking/Reasoning. See references/advanced_features.md
- Safety: Adjusting Responsible AI filters and thresholds. See references/safety.md
- Model Tuning: Supervised Fine-Tuning and Preference Tuning. See references/model_tuning.md