azure-ai-vision-imageanalysis-py
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseAzure AI Vision Image Analysis SDK for Python
Azure AI Vision 图像分析 Python SDK
Client library for Azure AI Vision 4.0 image analysis including captions, tags, objects, OCR, and more.
适用于Azure AI Vision 4.0图像分析的客户端库,支持生成图像描述、标签、对象检测、OCR等功能。
Installation
安装
bash
pip install azure-ai-vision-imageanalysisbash
pip install azure-ai-vision-imageanalysisEnvironment Variables
环境变量
bash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key> # If using API keybash
VISION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
VISION_KEY=<your-api-key> # If using API keyAuthentication
认证
API Key
API密钥
python
import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]
client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key)
)python
import os
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.core.credentials import AzureKeyCredential
endpoint = os.environ["VISION_ENDPOINT"]
key = os.environ["VISION_KEY"]
client = ImageAnalysisClient(
endpoint=endpoint,
credential=AzureKeyCredential(key)
)Entra ID (Recommended)
Entra ID(推荐)
python
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential
client = ImageAnalysisClient(
endpoint=os.environ["VISION_ENDPOINT"],
credential=DefaultAzureCredential()
)python
from azure.ai.vision.imageanalysis import ImageAnalysisClient
from azure.identity import DefaultAzureCredential
client = ImageAnalysisClient(
endpoint=os.environ["VISION_ENDPOINT"],
credential=DefaultAzureCredential()
)Analyze Image from URL
分析URL中的图像
python
from azure.ai.vision.imageanalysis.models import VisualFeatures
image_url = "https://example.com/image.jpg"
result = client.analyze_from_url(
image_url=image_url,
visual_features=[
VisualFeatures.CAPTION,
VisualFeatures.TAGS,
VisualFeatures.OBJECTS,
VisualFeatures.READ,
VisualFeatures.PEOPLE,
VisualFeatures.SMART_CROPS,
VisualFeatures.DENSE_CAPTIONS
],
gender_neutral_caption=True,
language="en"
)python
from azure.ai.vision.imageanalysis.models import VisualFeatures
image_url = "https://example.com/image.jpg"
result = client.analyze_from_url(
image_url=image_url,
visual_features=[
VisualFeatures.CAPTION,
VisualFeatures.TAGS,
VisualFeatures.OBJECTS,
VisualFeatures.READ,
VisualFeatures.PEOPLE,
VisualFeatures.SMART_CROPS,
VisualFeatures.DENSE_CAPTIONS
],
gender_neutral_caption=True,
language="en"
)Analyze Image from File
分析本地文件中的图像
python
with open("image.jpg", "rb") as f:
image_data = f.read()
result = client.analyze(
image_data=image_data,
visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)python
with open("image.jpg", "rb") as f:
image_data = f.read()
result = client.analyze(
image_data=image_data,
visual_features=[VisualFeatures.CAPTION, VisualFeatures.TAGS]
)Image Caption
生成图像描述
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION],
gender_neutral_caption=True
)
if result.caption:
print(f"Caption: {result.caption.text}")
print(f"Confidence: {result.caption.confidence:.2f}")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION],
gender_neutral_caption=True
)
if result.caption:
print(f"Caption: {result.caption.text}")
print(f"Confidence: {result.caption.confidence:.2f}")Dense Captions (Multiple Regions)
生成多区域图像描述
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.DENSE_CAPTIONS]
)
if result.dense_captions:
for caption in result.dense_captions.list:
print(f"Caption: {caption.text}")
print(f" Confidence: {caption.confidence:.2f}")
print(f" Bounding box: {caption.bounding_box}")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.DENSE_CAPTIONS]
)
if result.dense_captions:
for caption in result.dense_captions.list:
print(f"Caption: {caption.text}")
print(f" Confidence: {caption.confidence:.2f}")
print(f" Bounding box: {caption.bounding_box}")Tags
图像标签
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.TAGS]
)
if result.tags:
for tag in result.tags.list:
print(f"Tag: {tag.name} (confidence: {tag.confidence:.2f})")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.TAGS]
)
if result.tags:
for tag in result.tags.list:
print(f"Tag: {tag.name} (confidence: {tag.confidence:.2f})")Object Detection
对象检测
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.OBJECTS]
)
if result.objects:
for obj in result.objects.list:
print(f"Object: {obj.tags[0].name}")
print(f" Confidence: {obj.tags[0].confidence:.2f}")
box = obj.bounding_box
print(f" Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.OBJECTS]
)
if result.objects:
for obj in result.objects.list:
print(f"Object: {obj.tags[0].name}")
print(f" Confidence: {obj.tags[0].confidence:.2f}")
box = obj.bounding_box
print(f" Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")OCR (Text Extraction)
OCR(文本提取)
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.READ]
)
if result.read:
for block in result.read.blocks:
for line in block.lines:
print(f"Line: {line.text}")
print(f" Bounding polygon: {line.bounding_polygon}")
# Word-level details
for word in line.words:
print(f" Word: {word.text} (confidence: {word.confidence:.2f})")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.READ]
)
if result.read:
for block in result.read.blocks:
for line in block.lines:
print(f"Line: {line.text}")
print(f" Bounding polygon: {line.bounding_polygon}")
# 单词级细节
for word in line.words:
print(f" Word: {word.text} (confidence: {word.confidence:.2f})")People Detection
人物检测
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.PEOPLE]
)
if result.people:
for person in result.people.list:
print(f"Person detected:")
print(f" Confidence: {person.confidence:.2f}")
box = person.bounding_box
print(f" Bounding box: x={box.x}, y={box.y}, w={box.width}, h={box.height}")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.PEOPLE]
)
if result.people:
for person in result.people.list:
print(f"检测到人物:")
print(f" 置信度: {person.confidence:.2f}")
box = person.bounding_box
print(f" 边界框: x={box.x}, y={box.y}, w={box.width}, h={box.height}")Smart Cropping
智能裁剪
python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.SMART_CROPS],
smart_crops_aspect_ratios=[0.9, 1.33, 1.78] # Portrait, 4:3, 16:9
)
if result.smart_crops:
for crop in result.smart_crops.list:
print(f"Aspect ratio: {crop.aspect_ratio}")
box = crop.bounding_box
print(f" Crop region: x={box.x}, y={box.y}, w={box.width}, h={box.height}")python
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.SMART_CROPS],
smart_crops_aspect_ratios=[0.9, 1.33, 1.78] # 竖屏、4:3、16:9
)
if result.smart_crops:
for crop in result.smart_crops.list:
print(f"宽高比: {crop.aspect_ratio}")
box = crop.bounding_box
print(f" 裁剪区域: x={box.x}, y={box.y}, w={box.width}, h={box.height}")Async Client
异步客户端
python
from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.identity.aio import DefaultAzureCredential
async def analyze_image():
async with ImageAnalysisClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
) as client:
result = await client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
print(result.caption.text)python
from azure.ai.vision.imageanalysis.aio import ImageAnalysisClient
from azure.identity.aio import DefaultAzureCredential
async def analyze_image():
async with ImageAnalysisClient(
endpoint=endpoint,
credential=DefaultAzureCredential()
) as client:
result = await client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
print(result.caption.text)Visual Features
视觉功能
| Feature | Description |
|---|---|
| Single sentence describing the image |
| Captions for multiple regions |
| Content tags (objects, scenes, actions) |
| Object detection with bounding boxes |
| OCR text extraction |
| People detection with bounding boxes |
| Suggested crop regions for thumbnails |
| 功能 | 描述 |
|---|---|
| 生成描述图像的单句文本 |
| 为图像多个区域生成描述 |
| 生成内容标签(对象、场景、动作等) |
| 检测对象并返回边界框 |
| OCR文本提取 |
| 检测人物并返回边界框 |
| 为缩略图推荐裁剪区域 |
Error Handling
错误处理
python
from azure.core.exceptions import HttpResponseError
try:
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
except HttpResponseError as e:
print(f"Status code: {e.status_code}")
print(f"Reason: {e.reason}")
print(f"Message: {e.error.message}")python
from azure.core.exceptions import HttpResponseError
try:
result = client.analyze_from_url(
image_url=image_url,
visual_features=[VisualFeatures.CAPTION]
)
except HttpResponseError as e:
print(f"状态码: {e.status_code}")
print(f"原因: {e.reason}")
print(f"消息: {e.error.message}")Image Requirements
图像要求
- Formats: JPEG, PNG, GIF, BMP, WEBP, ICO, TIFF, MPO
- Max size: 20 MB
- Dimensions: 50x50 to 16000x16000 pixels
- 格式:JPEG、PNG、GIF、BMP、WEBP、ICO、TIFF、MPO
- 最大大小:20 MB
- 尺寸:50x50 至 16000x16000 像素
Best Practices
最佳实践
- Select only needed features to optimize latency and cost
- Use async client for high-throughput scenarios
- Handle HttpResponseError for invalid images or auth issues
- Enable gender_neutral_caption for inclusive descriptions
- Specify language for localized captions
- Use smart_crops_aspect_ratios matching your thumbnail requirements
- Cache results when analyzing the same image multiple times
- 仅选择所需功能,以优化延迟和成本
- 使用异步客户端处理高吞吐量场景
- 处理HttpResponseError,应对无效图像或认证问题
- 启用gender_neutral_caption,生成包容性描述
- 指定语言,生成本地化描述
- 设置smart_crops_aspect_ratios以匹配缩略图需求
- 缓存结果,避免重复分析同一图像