portkey-python-sdk
Portkey Python SDK
The Portkey Python SDK provides a unified interface to 200+ LLMs through the Portkey AI Gateway. Built on top of the OpenAI SDK for seamless compatibility, it adds production-grade features: automatic fallbacks, retries, load balancing, semantic caching, guardrails, and comprehensive observability.
Additional References:
- API Reference - Response structures, error handling
- Advanced Features - Tool calling, embeddings, audio, images
- Framework Integrations - LangChain, LlamaIndex, Strands, Google ADK
- Provider Configuration - Azure, AWS Bedrock, Vertex AI setup
Installation

```bash
pip install portkey-ai
```

Or with poetry/uv:

```bash
poetry add portkey-ai
uv add portkey-ai
```

---
Quick Start
```python
import os

from portkey_ai import Portkey

client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    virtual_key="your-openai-virtual-key"
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)
```
Authentication
API Key + Virtual Key (Recommended)
Virtual keys securely store provider API keys in Portkey's vault:
```python
import os

from portkey_ai import Portkey

client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],  # From app.portkey.ai
    virtual_key="openai-virtual-key-xxx"    # From app.portkey.ai/virtual-keys
)
```
Using Config IDs
Pre-configure routing, fallbacks, and caching in the dashboard:
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    config="pc-config-xxx"  # Config ID from dashboard
)
```
Chat Completions
Basic Request
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing briefly."}
    ]
)

print(response.choices[0].message.content)
print(f"Tokens used: {response.usage.total_tokens}")
```
Streaming
```python
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```
Async Support
```python
import asyncio
import os

from portkey_ai import AsyncPortkey

async def main():
    client = AsyncPortkey(
        api_key=os.environ["PORTKEY_API_KEY"],
        virtual_key="openai-key"
    )
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

asyncio.run(main())
```
Async Streaming
```python
async def stream_response():
    client = AsyncPortkey(
        api_key=os.environ["PORTKEY_API_KEY"],
        virtual_key="openai-key"
    )
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Write a poem"}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
```
Gateway Features
Fallbacks
Automatic failover when a provider fails:
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    config={
        "strategy": {"mode": "fallback"},
        "targets": [
            {
                "virtual_key": "openai-key",
                "override_params": {"model": "gpt-4o"}
            },
            {
                "virtual_key": "anthropic-key",
                "override_params": {"model": "claude-3-5-sonnet-20241022"}
            }
        ]
    }
)

# If OpenAI fails, automatically tries Anthropic
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Hello!"}]
)
```
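The routing itself happens in the gateway, but the fallback semantics can be sketched in plain Python. The provider calls below are local stubs, not real SDK functions:

```python
# Conceptual sketch of fallback routing: try targets in order and
# return the first success; re-raise the last error if all fail.
def call_openai():
    raise TimeoutError("openai unavailable")  # stub: primary is down

def call_anthropic():
    return "response from anthropic"  # stub: backup succeeds

def with_fallback(targets):
    last_err = None
    for target in targets:
        try:
            return target()
        except Exception as err:
            last_err = err
    raise last_err

print(with_fallback([call_openai, call_anthropic]))  # response from anthropic
```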
Load Balancing
Distribute traffic across providers:
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    config={
        "strategy": {"mode": "loadbalance"},
        "targets": [
            {"virtual_key": "openai-key-1", "weight": 0.7},
            {"virtual_key": "openai-key-2", "weight": 0.3}
        ]
    }
)
```
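Weights are relative: each target receives a share of traffic proportional to its weight over the sum of weights. A rough client-side simulation of the 0.7/0.3 split above (the actual per-request selection happens in the gateway):

```python
import random
from collections import Counter

targets = ["openai-key-1", "openai-key-2"]
weights = [0.7, 0.3]

# Simulate 10,000 routing decisions with weighted random choice
random.seed(0)
picks = Counter(random.choices(targets, weights=weights, k=10_000))
share = picks["openai-key-1"] / 10_000
print(f"openai-key-1 share: {share:.2f}")  # close to 0.70
```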
Automatic Retries
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    config={
        "retry": {
            "attempts": 3,
            "on_status_codes": [429, 500, 502, 503, 504]
        },
        "virtual_key": "openai-key"
    }
)
```
Semantic Caching
Reduce costs and latency with intelligent caching:
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    config={
        "cache": {
            "mode": "semantic",  # or "simple" for exact match
            "max_age": 3600      # TTL in seconds
        },
        "virtual_key": "openai-key"
    }
)

# Similar queries return cached responses
response1 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What is the capital of France?"}]
)
response2 = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me France's capital"}]
)  # Returns cached response
```
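Semantic matching runs gateway-side (paraphrases are matched via embeddings), but the `"simple"` exact-match mode behaves roughly like a TTL-bound dictionary. A minimal sketch with a stubbed compute function:

```python
import time

cache = {}

def cached_call(prompt, compute, max_age=3600):
    """Return a cached result for prompt if fresh, else compute and store it."""
    now = time.time()
    hit = cache.get(prompt)
    if hit and now - hit[1] < max_age:
        return hit[0]  # cache hit: skip the expensive call
    result = compute(prompt)
    cache[prompt] = (result, now)
    return result

calls = []
def compute(p):
    calls.append(p)  # track how often the "provider" is actually called
    return f"answer:{p}"

cached_call("capital of France?", compute)
cached_call("capital of France?", compute)  # identical prompt: served from cache
print(len(calls))  # 1
```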
Request Timeout
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    virtual_key="openai-key",
    request_timeout=30  # 30 seconds
)
```
Observability
Trace IDs
Link related requests for debugging:
```python
import uuid

client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    virtual_key="openai-key",
    trace_id=str(uuid.uuid4())
)
```
Custom Metadata
Add searchable metadata to requests:
```python
client = Portkey(
    api_key=os.environ["PORTKEY_API_KEY"],
    virtual_key="openai-key",
    metadata={
        "user_id": "user-123",
        "session_id": "session-456",
        "environment": "production"
    }
)
```
Per-Request Options
```python
response = client.with_options(
    trace_id="unique-trace-id",
    metadata={"request_type": "summarization"}
).chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this..."}]
)
```
Common Patterns
Multi-turn Conversation
```python
messages = [
    {"role": "system", "content": "You are a helpful coding assistant."},
    {"role": "user", "content": "What is Python?"},
    {"role": "assistant", "content": "Python is a high-level programming language..."},
    {"role": "user", "content": "Show me a hello world example."}
]

response = client.chat.completions.create(model="gpt-4o", messages=messages)
```
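For longer conversations, each assistant reply must be appended to the history before the next user turn. A minimal history helper; the assistant reply is stubbed here, where in real use it would come from `response.choices[0].message.content`:

```python
messages = [{"role": "system", "content": "You are a helpful coding assistant."}]

def add_turn(history, user_text, assistant_text):
    """Append one user/assistant exchange to the running history."""
    history.append({"role": "user", "content": user_text})
    history.append({"role": "assistant", "content": assistant_text})

add_turn(messages, "What is Python?", "Python is a high-level language...")
add_turn(messages, "Show me hello world.", 'print("Hello, world!")')
print(len(messages))  # 5
```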
JSON Output
```python
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Extract as JSON with name and age fields."},
        {"role": "user", "content": "John is 30 years old."}
    ],
    response_format={"type": "json_object"}
)
# Returns: {"name": "John", "age": 30}
```
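The JSON arrives as a string in `message.content` and still needs parsing. A minimal sketch using the literal payload shown above in place of a live response:

```python
import json

# In real use: content = response.choices[0].message.content
content = '{"name": "John", "age": 30}'
data = json.loads(content)
print(data["name"], data["age"])  # John 30
```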
Production Setup with Fallbacks + Caching
```python
import os

from portkey_ai import Portkey

def create_production_client():
    return Portkey(
        api_key=os.environ["PORTKEY_API_KEY"],
        config={
            "strategy": {"mode": "fallback"},
            "targets": [
                {
                    "virtual_key": os.environ["OPENAI_VIRTUAL_KEY"],
                    "override_params": {"model": "gpt-4o"},
                    "retry": {"attempts": 2, "on_status_codes": [429, 500]}
                },
                {
                    "virtual_key": os.environ["ANTHROPIC_VIRTUAL_KEY"],
                    "override_params": {"model": "claude-3-5-sonnet-20241022"}
                }
            ],
            "cache": {"mode": "semantic", "max_age": 3600}
        },
        trace_id="production-session",
        metadata={"environment": "production"}
    )
```
Best Practices
- Use environment variables - Never hardcode API keys
- Implement fallbacks - Always have backup providers for production
- Use streaming - Better UX for long responses
- Add tracing - Enable observability with trace IDs and metadata
- Enable caching - Reduce costs with semantic caching
- Handle errors - Implement retry logic with exponential backoff
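For the last point, a minimal retry helper with exponential backoff and jitter; the flaky function below is a stub standing in for a client call:

```python
import random
import time

def with_backoff(fn, max_attempts=3, base_delay=0.1):
    """Call fn(); on failure, sleep base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay / 4))

# Demo with a stub that fails twice, then succeeds
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

print(with_backoff(flaky))  # ok
```

In real use, catch the SDK's specific exception types rather than bare `Exception`, and wrap the call as `with_backoff(lambda: client.chat.completions.create(...))`.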
Resources
相关资源
- Dashboard: app.portkey.ai
- Documentation: docs.portkey.ai
- GitHub: github.com/portkey-ai/portkey-python-sdk
- Discord: portkey.ai/discord