fal-vision


Analyze and understand images using fal.ai vision models — segmentation, detection, OCR, captioning, and visual QA.

Scripts

| Script | Purpose |
|--------|---------|
| `analyze.sh` | Analyze an image (segment, detect, OCR, describe, QA) |

Usage

Segment Objects

bash

./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation segment --query "the red car"

Detect Objects

bash

./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation detect

Extract Text (OCR)

bash

./scripts/analyze.sh --image-url "https://example.com/document.jpg" --operation ocr

Describe Image

bash

./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation describe

Visual QA

bash

./scripts/analyze.sh --image-url "https://example.com/photo.jpg" --operation qa --query "How many people are in this image?"
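
The operations that need no text prompt (detect, ocr, describe) can be run back to back on one image. The following is a minimal sketch, not part of fal-vision: `run_all_unprompted` is a hypothetical helper name, and the `ANALYZE_SH` variable is an assumption that simply points at the script path used in the examples above. The output format of `analyze.sh` is not documented here, so the sketch passes it through unchanged.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with fal-vision.
# ANALYZE_SH is an assumed override for the script path used in this README.
ANALYZE_SH="${ANALYZE_SH:-./scripts/analyze.sh}"

# Run every operation that does not need --query against a single image URL.
# $1: URL of the image to analyze
run_all_unprompted() {
  for operation in detect ocr describe; do
    echo "== $operation =="
    # Keep going if one operation fails, but report it on stderr.
    "$ANALYZE_SH" --image-url "$1" --operation "$operation" \
      || echo "operation '$operation' failed" >&2
  done
}
```

For example, `run_all_unprompted "https://example.com/photo.jpg"` prints one labelled section per operation.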

Arguments

| Argument | Description | Required |
|----------|-------------|----------|
| `--image-url` | URL of image to analyze | Yes |
| `--operation` | One of `segment`, `detect`, `ocr`, `describe`, `qa` | Yes |
| `--query` / `-q` | Text prompt for segment/qa operations | For segment/qa only |
| `--model` / `-m` | Override model endpoint | No |
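
The rules in the Arguments table can be checked before the script is called at all. The sketch below is one way to do that, assuming nothing about how `analyze.sh` validates its own input: `analyze_checked`, `ANALYZE_SH`, and `DRY_RUN` are hypothetical names introduced here for illustration, and only the documented flags are passed through.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of fal-vision. It enforces the table above:
# the operation must be one of the five listed, and segment/qa need a query.
# The body runs in a subshell ( ... ), so variables and `set --` do not leak.
analyze_checked() (
  image_url="${1:-}"
  operation="${2:-}"
  query="${3:-}"

  # Start with the two required flags.
  set -- --image-url "$image_url" --operation "$operation"

  case "$operation" in
    segment|qa)
      if [ -z "$query" ]; then
        echo "error: --query is required for '$operation'" >&2
        exit 2
      fi
      set -- "$@" --query "$query"
      ;;
    detect|ocr|describe)
      ;;
    *)
      echo "error: unknown operation '$operation'" >&2
      exit 2
      ;;
  esac

  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Show the command for inspection only; quoting is not preserved.
    echo "${ANALYZE_SH:-./scripts/analyze.sh} $*"
  else
    "${ANALYZE_SH:-./scripts/analyze.sh}" "$@"
  fi
)

# Dry run: prints the command line instead of calling the script.
DRY_RUN=1 analyze_checked "https://example.com/photo.jpg" qa "How many people are in this image?"
```

Calling `analyze_checked "https://example.com/photo.jpg" segment` without a third argument returns exit status 2 and never reaches `analyze.sh`.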

Finding Models

To discover the best and latest vision/analysis models, use the search API:

bash

# Search for segmentation models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "segmentation"

# Search for object detection models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "object detection"

# Search for OCR models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "ocr"

# Search for image captioning / visual QA models
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "caption"
bash /mnt/skills/user/fal-generate/scripts/search-models.sh --query "visual question"


Or use the `search_models` MCP tool with keywords like "segmentation", "detection", "ocr", "caption", "vision".
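
Once a search has turned up a model, its endpoint can be passed to `analyze.sh` through the documented `--model` flag. The sketch below assumes the endpoint ID is copied by hand from the search results, since the search output format is not described here. `ocr_with_model` and `ANALYZE_SH` are hypothetical names, and no real endpoint ID is implied.

```shell
#!/bin/sh
# Hypothetical helper, not part of fal-vision.
# ANALYZE_SH is an assumed override for the script path used in this README.
ANALYZE_SH="${ANALYZE_SH:-./scripts/analyze.sh}"

# Run OCR with an explicitly chosen model endpoint.
# $1: URL of the image to analyze
# $2: endpoint ID copied from the search results (placeholder, fill in by hand)
ocr_with_model() {
  if [ -z "${2:-}" ]; then
    echo "error: pass the endpoint ID chosen from the search results" >&2
    return 2
  fi
  "$ANALYZE_SH" --image-url "$1" --operation ocr --model "$2"
}
```

Usage would look like `ocr_with_model "https://example.com/document.jpg" "$MODEL_ENDPOINT"`, where `MODEL_ENDPOINT` holds whatever endpoint the search returned.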
