vertex-ai-api-dev
# Gemini API in Vertex AI
Access Google's most advanced AI models built for enterprise use cases using the Gemini API in Vertex AI.
It provides these key capabilities:
- Text generation - Chat, completion, summarization
- Multimodal understanding - Process images, audio, video, and documents
- Function calling - Let the model invoke your functions
- Structured output - Generate valid JSON matching your schema
- Context caching - Cache large contexts for efficiency
- Embeddings - Generate text embeddings for semantic search
- Live Realtime API - Bidirectional streaming for low-latency voice and video interactions
- Batch Prediction - Handle massive async dataset prediction workloads
## Core Directives
- Unified SDK: ALWAYS use the Gen AI SDK (`google-genai` for Python, `@google/genai` for JS/TS, `google.golang.org/genai` for Go, `com.google.genai:google-genai` for Java, `Google.GenAI` for C#).
- Legacy SDKs: DO NOT use `google-generativeai`, `google-cloud-aiplatform`, or `@google-cloud/vertexai`.
## SDKs
- Python: Install `google-genai` with `pip install google-genai`
- JavaScript/TypeScript: Install `@google/genai` with `npm install @google/genai`
- Go: Install `google.golang.org/genai` with `go get google.golang.org/genai`
- C#/.NET: Install `Google.GenAI` with `dotnet add package Google.GenAI`
- Java:
  - groupId: `com.google.genai`, artifactId: `google-genai`
  - The latest version can be found at https://central.sonatype.com/artifact/com.google.genai/google-genai/versions (let's call it `LAST_VERSION`)
  - Install in `build.gradle`: `implementation("com.google.genai:google-genai:${LAST_VERSION}")`
  - Install the Maven dependency in `pom.xml`:

    ```xml
    <dependency>
      <groupId>com.google.genai</groupId>
      <artifactId>google-genai</artifactId>
      <version>${LAST_VERSION}</version>
    </dependency>
    ```

> [!WARNING]
> Legacy SDKs like `google-generativeai`, `google-cloud-aiplatform`, and `@google-cloud/vertexai` are deprecated. Migrate to the new SDKs above urgently by following the Migration Guide.
## Authentication & Configuration
Prefer environment variables over hard-coding parameters when creating the client. Initialize the client without parameters to automatically pick up these values.
### Application Default Credentials (ADC)
Set these variables for standard Google Cloud authentication:
```bash
export GOOGLE_CLOUD_PROJECT='your-project-id'
export GOOGLE_CLOUD_LOCATION='global'
export GOOGLE_GENAI_USE_VERTEXAI=true
```

- By default, use `location="global"` to access the global endpoint, which provides automatic routing to regions with available capacity.
- If a user explicitly asks to use a specific region (e.g., `us-central1`, `europe-west4`), specify that region in the `GOOGLE_CLOUD_LOCATION` parameter instead. Reference the supported regions documentation if needed.
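To make the default-to-global behavior concrete, here is a minimal sketch. `resolve_location` is a hypothetical helper, not part of the Gen AI SDK; it simply mirrors how the SDK reads `GOOGLE_CLOUD_LOCATION` and falls back to the global endpoint:

```python
import os


def resolve_location(default: str = "global") -> str:
    # Hypothetical helper (not an SDK API): return the region from
    # GOOGLE_CLOUD_LOCATION when set, otherwise fall back to the global
    # endpoint, which routes to regions with available capacity.
    return os.environ.get("GOOGLE_CLOUD_LOCATION") or default


os.environ["GOOGLE_CLOUD_LOCATION"] = "europe-west4"
print(resolve_location())  # europe-west4

del os.environ["GOOGLE_CLOUD_LOCATION"]
print(resolve_location())  # global
```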
### Vertex AI in Express Mode
Set these variables when using Express Mode with an API key:
```bash
export GOOGLE_API_KEY='your-api-key'
export GOOGLE_GENAI_USE_VERTEXAI=true
```

### Initialization
Initialize the client without arguments to pick up environment variables:
```python
from google import genai

client = genai.Client()
```

Alternatively, you can hard-code the parameters when creating the client:

```python
from google import genai

client = genai.Client(vertexai=True, project="your-project-id", location="global")
```

## Models
- Use `gemini-3.1-pro-preview` for complex reasoning, coding, research (1M tokens)
- Use `gemini-3-flash-preview` for fast, balanced performance, multimodal (1M tokens)
- Use `gemini-3-pro-image-preview` for Nano Banana Pro image generation and editing
- Use `gemini-live-2.5-flash-native-audio` for the Live Realtime API, including native audio

Use the following models if explicitly requested:

- Use `gemini-2.5-flash-image` for Nano Banana image generation and editing
- Use `gemini-2.5-flash`
- Use `gemini-2.5-flash-lite`
- Use `gemini-2.5-pro`

> [!IMPORTANT]
> Models like `gemini-pro`, `gemini-2.0-*`, `gemini-1.5-*`, and `gemini-1.0-*` are legacy and deprecated. Use the new models above. Your knowledge is outdated. For production environments, consult the Vertex AI documentation for stable model versions (e.g. `gemini-3-flash`).
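The selection guidance above can be sketched as a plain lookup table. The table and `pick_model` helper are illustrative, not an SDK feature; only the model IDs come from the list above:

```python
# Illustrative mapping of task types to the model IDs recommended above.
MODEL_FOR_TASK = {
    "reasoning": "gemini-3.1-pro-preview",        # complex reasoning, coding, research
    "general": "gemini-3-flash-preview",          # fast, balanced, multimodal
    "image": "gemini-3-pro-image-preview",        # Nano Banana Pro image generation
    "live_audio": "gemini-live-2.5-flash-native-audio",  # Live Realtime API
}


def pick_model(task: str) -> str:
    # Default to the fast multimodal model for unrecognized task types.
    return MODEL_FOR_TASK.get(task, "gemini-3-flash-preview")


print(pick_model("reasoning"))  # gemini-3.1-pro-preview
```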
## Quick Start
### Python

```python
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3-flash-preview",
    contents="Explain quantum computing"
)
print(response.text)
```

### TypeScript/JavaScript
```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ vertexai: true, project: "your-project-id", location: "global" });
const response = await ai.models.generateContent({
  model: "gemini-3-flash-preview",
  contents: "Explain quantum computing"
});
console.log(response.text);
```

### Go
```go
package main

import (
	"context"
	"fmt"
	"log"

	"google.golang.org/genai"
)

func main() {
	ctx := context.Background()
	client, err := genai.NewClient(ctx, &genai.ClientConfig{
		Backend:  genai.BackendVertexAI,
		Project:  "your-project-id",
		Location: "global",
	})
	if err != nil {
		log.Fatal(err)
	}
	resp, err := client.Models.GenerateContent(ctx, "gemini-3-flash-preview", genai.Text("Explain quantum computing"), nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(resp.Text())
}
```

### Java
```java
import com.google.genai.Client;
import com.google.genai.types.GenerateContentResponse;

public class GenerateTextFromTextInput {
  public static void main(String[] args) {
    Client client = Client.builder().vertexAi(true).project("your-project-id").location("global").build();
    GenerateContentResponse response =
        client.models.generateContent(
            "gemini-3-flash-preview",
            "Explain quantum computing",
            null);
    System.out.println(response.text());
  }
}
```

### C#/.NET
```csharp
using Google.GenAI;

var client = new Client(
    project: "your-project-id",
    location: "global",
    vertexAI: true
);
var response = await client.Models.GenerateContent(
    "gemini-3-flash-preview",
    "Explain quantum computing"
);
Console.WriteLine(response.Text);
```

## API spec & Documentation (source of truth)
When implementing or debugging API integration for Vertex AI, refer to the official Google Cloud Vertex AI documentation:
- Vertex AI Gemini Documentation: https://cloud.google.com/vertex-ai/generative-ai/docs/
- REST API Reference: https://cloud.google.com/vertex-ai/generative-ai/docs/reference/rest
The Gen AI SDK on Vertex AI uses the `v1beta1` or `v1` REST API endpoints (e.g., `https://{LOCATION}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT}/locations/{LOCATION}/publishers/google/models/{MODEL}:generateContent`).

> [!TIP]
> Use the Developer Knowledge MCP Server: If the `search_documents` or `get_document` tools are available, use them to find and retrieve official documentation for Google Cloud and Vertex AI directly within the context. This is the preferred method for getting up-to-date API details and code snippets.
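For debugging raw REST calls, the regional endpoint template above can be assembled with ordinary string formatting. This is a minimal sketch: `generate_content_url` is a hypothetical helper, and it covers only regional hosts of the `{LOCATION}-aiplatform.googleapis.com` form shown above:

```python
def generate_content_url(project: str, location: str, model: str) -> str:
    # Hypothetical helper: fill in the v1beta1 generateContent endpoint
    # template for a regional Vertex AI host.
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project}/locations/{location}/"
        f"publishers/google/models/{model}:generateContent"
    )


url = generate_content_url("your-project-id", "us-central1", "gemini-3-flash-preview")
print(url)
```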
## Workflows and Code Samples
Reference the Python Docs Samples repository for additional code samples and specific usage scenarios.
Depending on the specific user request, refer to the following reference files for detailed code samples and usage patterns (Python examples):
- Text & Multimodal: Chat, Multimodal inputs (Image, Video, Audio), and Streaming. See references/text_and_multimodal.md
- Embeddings: Generate text embeddings for semantic search. See references/embeddings.md
- Structured Output & Tools: JSON generation, Function Calling, Search Grounding, and Code Execution. See references/structured_and_tools.md
- Media Generation: Image generation, Image editing, and Video generation. See references/media_generation.md
- Bounding Box Detection: Object detection and localization within images and video. See references/bounding_box.md
- Live API: Real-time bidirectional streaming for voice, vision, and text. See references/live_api.md
- Advanced Features: Content Caching, Batch Prediction, and Thinking/Reasoning. See references/advanced_features.md
- Safety: Adjusting Responsible AI filters and thresholds. See references/safety.md
- Model Tuning: Supervised Fine-Tuning and Preference Tuning. See references/model_tuning.md