fal-ai-media

fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

User wants to generate images from text prompts
Creating videos from text or images
Generating speech, music, or sound effects
Any media generation task
User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

The fal.ai MCP server must be configured. Add the following entry to `~/.claude.json`:

json

"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}

Get an API key at fal.ai.
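
For readers who would rather script the setup than edit the file by hand, the sketch below merges the entry shown above into a settings dictionary. It assumes MCP servers are listed under a top-level `mcpServers` key in `~/.claude.json`; that key name and the helper names `add_fal_server` and `update_config_file` are assumptions for illustration, so check your own file before writing to it.

```python
import json
from pathlib import Path


def add_fal_server(config: dict, fal_key: str) -> dict:
    """Return a copy of `config` with the fal-ai MCP entry added.

    Assumption: MCP servers live under a top-level "mcpServers" key.
    """
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers["fal-ai"] = {
        "command": "npx",
        "args": ["-y", "fal-ai-mcp-server"],
        "env": {"FAL_KEY": fal_key},
    }
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path, fal_key: str) -> None:
    """Read `path` if it exists, add the entry, and write the file back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_fal_server(config, fal_key), indent=2))
```

A call such as `update_config_file(Path.home() / ".claude.json", "YOUR_FAL_KEY_HERE")` would apply it; back up the file first, since the helper rewrites it in full.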

MCP Tools

The fal.ai MCP provides these tools:

`search`: Find available models by keyword
`find`: Get model details and parameters
`generate`: Run a model with parameters
`result`: Check async generation status
`status`: Check job status
`cancel`: Cancel a running job
`estimate_cost`: Estimate generation cost
`models`: List popular models
`upload`: Upload files for use as inputs
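
The `status`, `result`, and `cancel` tools suggest a poll-then-fetch pattern for long-running jobs. The sketch below shows that loop in Python with the three tool calls injected as plain callables. The status strings (`"COMPLETED"`, `"FAILED"`), the request-ID argument, and the function name `wait_for_result` are all hypothetical; the tool list above does not specify what the tools accept or return.

```python
import time
from typing import Any, Callable


def wait_for_result(
    request_id: str,
    check_status: Callable[[str], str],
    fetch_result: Callable[[str], Any],
    cancel: Callable[[str], None],
    timeout_s: float = 300.0,
    poll_interval_s: float = 2.0,
) -> Any:
    """Poll a job until it finishes, fails, or the timeout expires.

    The "COMPLETED" and "FAILED" strings are assumed status values.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = check_status(request_id)
        if state == "COMPLETED":
            return fetch_result(request_id)
        if state == "FAILED":
            raise RuntimeError(f"job {request_id} failed")
        time.sleep(poll_interval_s)
    # Timed out: cancel the job so it does not keep running unattended.
    cancel(request_id)
    raise TimeoutError(f"job {request_id} did not finish within {timeout_s}s")
```

Injecting the callables keeps the loop independent of how the MCP tools are actually invoked, so it can be exercised with simple stand-ins.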

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.

generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.

generate(
  app_id: "fal-ai/nano-banana-pro",
  input_data: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)

Common Image Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| `prompt` | string | required | Describe what you want |
| `image_size` | string | `square`, `portrait_4_3`, `landscape_16_9`, `portrait_16_9`, `landscape_4_3` | Aspect ratio |
| `num_images` | number | 1-4 | How many to generate |
| `seed` | number | any integer | Reproducibility |
| `guidance_scale` | number | 1-20 | How closely to follow the prompt (higher = more literal) |
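
The ranges in this table can be checked before a request is sent. The following is a minimal sketch of such a check in Python; the function name `validate_image_input` is hypothetical, and the limits are taken from the table rather than from any model's schema, so individual models may accept more or fewer values.

```python
# Allowed values copied from the parameter table above.
IMAGE_SIZES = {
    "square",
    "portrait_4_3",
    "landscape_16_9",
    "portrait_16_9",
    "landscape_4_3",
}


def validate_image_input(input_data: dict) -> list[str]:
    """Return a list of problems found in `input_data` (empty if none)."""
    errors = []

    prompt = input_data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required and must be a non-empty string")

    size = input_data.get("image_size")
    if size is not None and size not in IMAGE_SIZES:
        errors.append(f"image_size must be one of {sorted(IMAGE_SIZES)}")

    num_images = input_data.get("num_images")
    if num_images is not None and not (
        isinstance(num_images, int) and 1 <= num_images <= 4
    ):
        errors.append("num_images must be an integer from 1 to 4")

    seed = input_data.get("seed")
    if seed is not None and not isinstance(seed, int):
        errors.append("seed must be an integer")

    guidance = input_data.get("guidance_scale")
    if guidance is not None and not (
        isinstance(guidance, (int, float)) and 1 <= guidance <= 20
    ):
        errors.append("guidance_scale must be a number from 1 to 20")

    return errors
```

Returning a list of messages rather than raising on the first problem lets a caller report every issue with a prompt's parameters at once.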

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:

First upload the source image:

upload(file_path: "/path/to/image.png")

Then generate with the image input:

generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)

---

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.

generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.

generate(
  app_id: "fal-ai/kling-video/v3/pro",
  input_data: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.

generate(
  app_id: "fal-ai/veo-3",
  input_data: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)

Image-to-Video

Start from an existing image:

generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)

Video Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| `prompt` | string | required | Describe the video |
| `duration` | string | `"5s"`, `"10s"` | Video length |
| `aspect_ratio` | string | `"16:9"`, `"9:16"`, `"1:1"` | Frame ratio |
| `seed` | number | any integer | Reproducibility |
| `image_url` | string | URL | Source image for image-to-video |
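
A small builder can assemble `input_data` for a video request from these parameters and reject values outside the table. This is a sketch under stated assumptions: `build_video_input` is a hypothetical helper, the defaults are arbitrary choices, and individual models may accept only some of the fields (the Veo 3 example above, for instance, passes no `duration`).

```python
from typing import Optional

# Allowed values copied from the parameter table above.
DURATIONS = {"5s", "10s"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_video_input(
    prompt: str,
    duration: str = "5s",
    aspect_ratio: str = "16:9",
    seed: Optional[int] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Build an input_data dict, raising ValueError on out-of-table values."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {sorted(DURATIONS)}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")

    input_data = {
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }
    # Optional fields are included only when the caller supplies them.
    if seed is not None:
        input_data["seed"] = seed
    if image_url is not None:
        input_data["image_url"] = image_url
    return input_data
```

Passing `image_url` turns the same payload into an image-to-video request, which mirrors the Seedance example in the Image-to-Video section.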

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.

generate(
  app_id: "fal-ai/csm-1b",
  input_data: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.

generate(
  app_id: "fal-ai/thinksound",
  input_data: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:

python

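import os
import requests

# <voice_id> is a placeholder: substitute the ID of the voice you want to use.
resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75},
    },
    timeout=60,
)
# Fail on an HTTP error instead of writing an error body into the MP3 file.
resp.raise_for_status()

# The response body is the audio itself.
with open("output.mp3", "wb") as f:
    f.write(resp.content)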

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:

python

# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

---

Cost Estimation

Before generating, check estimated cost:

estimate_cost(
  estimate_type: "unit_price",
  endpoints: {
    "fal-ai/nano-banana-pro": {
      "unit_quantity": 1
    }
  }
)
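
When a workflow touches several models, the arguments for `estimate_cost` can be assembled programmatically. The sketch below builds a payload in the same shape as the call above. The helper name `build_estimate_args` is hypothetical, and whether the tool accepts more than one endpoint per call is an assumption based only on `endpoints` being a map.

```python
def build_estimate_args(quantities: dict[str, int]) -> dict:
    """Map endpoint IDs to unit quantities in the shape used above."""
    for endpoint, quantity in quantities.items():
        if quantity < 1:
            raise ValueError(f"unit_quantity for {endpoint} must be at least 1")
    return {
        "estimate_type": "unit_price",
        "endpoints": {
            endpoint: {"unit_quantity": quantity}
            for endpoint, quantity in quantities.items()
        },
    }
```

For example, `build_estimate_args({"fal-ai/nano-banana-pro": 1})` reproduces the arguments of the call shown above.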

Model Discovery

Find models for specific tasks:

search(query: "text to video")
find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])
models()

Tips

Use `seed` for reproducible results when iterating on prompts
Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
For video, keep prompts descriptive but concise: focus on motion and scene
Image-to-video produces more controlled results than pure text-to-video
Check `estimate_cost` before running expensive video generations
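
The first two tips can be combined: fix a seed while iterating on the draft model so that changes between runs come from the prompt, then reuse the settled prompt on the Pro model. The sketch below prepares both sets of `generate` arguments in Python. The helper name `draft_and_final_calls` is hypothetical, and nothing above states that a seed carries over between different models, so treat the shared seed as a convenience rather than a guarantee of matching output.

```python
def draft_and_final_calls(
    prompt: str, seed: int, image_size: str = "landscape_16_9"
) -> tuple[dict, dict]:
    """Return generate arguments for a draft run and a final run."""
    shared = {
        "prompt": prompt,
        "image_size": image_size,
        "num_images": 1,
        "seed": seed,
    }
    # Each call gets its own copy so later edits to one do not affect the other.
    draft = {"app_id": "fal-ai/nano-banana-2", "input_data": dict(shared)}
    final = {"app_id": "fal-ai/nano-banana-pro", "input_data": dict(shared)}
    return draft, final
```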
