fal-ai-media

fal.ai Media Generation

Generate images, videos, and audio using fal.ai models via MCP.

When to Activate

  • User wants to generate images from text prompts
  • Creating videos from text or images
  • Generating speech, music, or sound effects
  • Any media generation task
  • User says "generate image", "create video", "text to speech", "make a thumbnail", or similar

MCP Requirement

The fal.ai MCP server must be configured. Add the following to ~/.claude.json:
json
"fal-ai": {
  "command": "npx",
  "args": ["-y", "fal-ai-mcp-server"],
  "env": { "FAL_KEY": "YOUR_FAL_KEY_HERE" }
}
Get an API key at fal.ai.
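
The entry above can also be merged into an existing config programmatically. This is a minimal sketch, not the official setup path: it assumes server entries sit under a top-level `mcpServers` key (the source shows only the `fal-ai` fragment, not its parent), and `add_fal_server` is a hypothetical helper name.

```python
import json


def add_fal_server(config: dict, fal_key: str) -> dict:
    """Return a copy of `config` with the fal-ai MCP entry added.

    Assumption: MCP servers are listed under a top-level "mcpServers" key.
    """
    updated = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    servers = updated.setdefault("mcpServers", {})
    servers["fal-ai"] = {
        "command": "npx",
        "args": ["-y", "fal-ai-mcp-server"],
        "env": {"FAL_KEY": fal_key},
    }
    return updated
```

Load the file with `json.load`, pass the resulting dict through the helper, and write it back with `json.dump`. Existing server entries are left untouched.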

MCP Tools

The fal.ai MCP provides these tools:
  • search: Find available models by keyword
  • find: Get model details and parameters
  • generate: Run a model with parameters
  • result: Check async generation status
  • status: Check job status
  • cancel: Cancel a running job
  • estimate_cost: Estimate generation cost
  • models: List popular models
  • upload: Upload files for use as inputs
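
For long-running jobs, the status check is typically called in a loop. The sketch below shows one way to structure that polling. `wait_for_job` is a hypothetical helper, `check_status` is a stand-in callable for the `status` tool, and the state names are assumptions, since the source does not show what the tool returns.

```python
import time
from typing import Callable


def wait_for_job(
    check_status: Callable[[str], str],
    job_id: str,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> str:
    """Poll `check_status` until the job leaves the in-progress states.

    `check_status` stands in for the MCP `status` tool. The state names
    below are assumed and should be replaced with the real ones.
    """
    in_progress = {"IN_QUEUE", "IN_PROGRESS"}  # assumed state names
    for _ in range(max_polls):
        state = check_status(job_id)
        if state not in in_progress:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Capping the number of polls keeps a stuck job from blocking forever; at that point `cancel` is the natural follow-up.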

Image Generation

Nano Banana 2 (Fast)

Best for: quick iterations, drafts, text-to-image, image editing.
generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "a futuristic cityscape at sunset, cyberpunk style",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "seed": 42
  }
)

Nano Banana Pro (High Fidelity)

Best for: production images, realism, typography, detailed prompts.
generate(
  app_id: "fal-ai/nano-banana-pro",
  input_data: {
    "prompt": "professional product photo of wireless headphones on marble surface, studio lighting",
    "image_size": "square",
    "num_images": 1,
    "guidance_scale": 7.5
  }
)

Common Image Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| prompt | string | required | Describe what you want |
| image_size | string | square, portrait_4_3, landscape_16_9, portrait_16_9, landscape_4_3 | Aspect ratio |
| num_images | number | 1-4 | How many to generate |
| seed | number | any integer | Reproducibility |
| guidance_scale | number | 1-20 | How closely to follow the prompt (higher = more literal) |
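
The constraints in the table can be checked locally before calling `generate`. This is a minimal sketch built only from the options listed above. `validate_image_input` is a hypothetical helper, and individual models may accept a different subset of parameters.

```python
ALLOWED_IMAGE_SIZES = {
    "square", "portrait_4_3", "landscape_16_9", "portrait_16_9", "landscape_4_3",
}


def validate_image_input(input_data: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks valid."""
    problems = []
    prompt = input_data.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required and must be a non-empty string")
    size = input_data.get("image_size")
    if size is not None and size not in ALLOWED_IMAGE_SIZES:
        problems.append(f"image_size {size!r} is not one of the listed options")
    num = input_data.get("num_images")
    if num is not None and not (isinstance(num, int) and 1 <= num <= 4):
        problems.append("num_images must be an integer from 1 to 4")
    scale = input_data.get("guidance_scale")
    if scale is not None and not (isinstance(scale, (int, float)) and 1 <= scale <= 20):
        problems.append("guidance_scale must be between 1 and 20")
    seed = input_data.get("seed")
    if seed is not None and not isinstance(seed, int):
        problems.append("seed must be an integer")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every issue in a single pass.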

Image Editing

Use Nano Banana 2 with an input image for inpainting, outpainting, or style transfer:
# First upload the source image
upload(file_path: "/path/to/image.png")

# Then generate with image input
generate(
  app_id: "fal-ai/nano-banana-2",
  input_data: {
    "prompt": "same scene but in watercolor style",
    "image_url": "<uploaded_url>",
    "image_size": "landscape_16_9"
  }
)
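
The upload-then-generate sequence can be wrapped in one function. In this sketch, `restyle_image` is a hypothetical name, and `upload` and `generate` are stand-in callables for the MCP tools. It assumes `upload` returns a URL string, which the source does not confirm.

```python
from typing import Callable


def restyle_image(
    upload: Callable[[str], str],
    generate: Callable[[str, dict], dict],
    file_path: str,
    prompt: str,
) -> dict:
    """Upload a local image, then run an edit using the returned URL.

    `upload` and `generate` are stand-ins for the MCP tools. It is an
    assumption that `upload` yields a URL string; adapt to the real shape.
    """
    image_url = upload(file_path)
    input_data = {
        "prompt": prompt,
        "image_url": image_url,
        "image_size": "landscape_16_9",
    }
    return generate("fal-ai/nano-banana-2", input_data)
```

Passing the tools in as arguments keeps the ordering logic testable without any network access.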

Video Generation

Seedance 1.0 Pro (ByteDance)

Best for: text-to-video, image-to-video with high motion quality.
generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "a drone flyover of a mountain lake at golden hour, cinematic",
    "duration": "5s",
    "aspect_ratio": "16:9",
    "seed": 42
  }
)

Kling Video v3 Pro

Best for: text/image-to-video with native audio generation.
generate(
  app_id: "fal-ai/kling-video/v3/pro",
  input_data: {
    "prompt": "ocean waves crashing on a rocky coast, dramatic clouds",
    "duration": "5s",
    "aspect_ratio": "16:9"
  }
)

Veo 3 (Google DeepMind)

Best for: video with generated sound, high visual quality.
generate(
  app_id: "fal-ai/veo-3",
  input_data: {
    "prompt": "a bustling Tokyo street market at night, neon signs, crowd noise",
    "aspect_ratio": "16:9"
  }
)

Image-to-Video

Start from an existing image:
generate(
  app_id: "fal-ai/seedance-1-0-pro",
  input_data: {
    "prompt": "camera slowly zooms out, gentle wind moves the trees",
    "image_url": "<uploaded_image_url>",
    "duration": "5s"
  }
)

Video Parameters

| Param | Type | Options | Notes |
| --- | --- | --- | --- |
| prompt | string | required | Describe the video |
| duration | string | "5s", "10s" | Video length |
| aspect_ratio | string | "16:9", "9:16", "1:1" | Frame ratio |
| seed | number | any integer | Reproducibility |
| image_url | string | URL | Source image for image-to-video |
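
A small builder can assemble `input_data` for video calls from the table above and reject values outside the listed options. This is a sketch: `build_video_input` is a hypothetical helper, the defaults are illustrative, and some models (Veo 3 in the example above) are shown without a `duration`.

```python
from typing import Optional

DURATIONS = {"5s", "10s"}
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


def build_video_input(
    prompt: str,
    duration: str = "5s",
    aspect_ratio: str = "16:9",
    seed: Optional[int] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Build an input_data dict, raising ValueError on unlisted options."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if duration not in DURATIONS:
        raise ValueError(f"duration must be one of {sorted(DURATIONS)}")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    input_data = {"prompt": prompt, "duration": duration, "aspect_ratio": aspect_ratio}
    # Optional fields are only included when set.
    if seed is not None:
        input_data["seed"] = seed
    if image_url is not None:
        input_data["image_url"] = image_url
    return input_data
```

Supplying `image_url` turns the same call into image-to-video; leaving it out gives plain text-to-video.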

Audio Generation

CSM-1B (Conversational Speech)

Text-to-speech with natural, conversational quality.
generate(
  app_id: "fal-ai/csm-1b",
  input_data: {
    "text": "Hello, welcome to the demo. Let me show you how this works.",
    "speaker_id": 0
  }
)

ThinkSound (Video-to-Audio)

Generate matching audio from video content.
generate(
  app_id: "fal-ai/thinksound",
  input_data: {
    "video_url": "<video_url>",
    "prompt": "ambient forest sounds with birds chirping"
  }
)

ElevenLabs (via API, no MCP)

For professional voice synthesis, use ElevenLabs directly:
python
import os
import requests

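# Replace <voice_id> with the ID of the voice to use.
resp = requests.post(
    "https://api.elevenlabs.io/v1/text-to-speech/<voice_id>",
    headers={
        "xi-api-key": os.environ["ELEVENLABS_API_KEY"],
        "Content-Type": "application/json"
    },
    json={
        "text": "Your text here",
        "model_id": "eleven_turbo_v2_5",
        "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}
    },
    timeout=60,
)
# Fail on HTTP errors instead of writing an error body to the file.
resp.raise_for_status()
# The response body is the audio data.
with open("output.mp3", "wb") as f:
    f.write(resp.content)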

VideoDB Generative Audio

If VideoDB is configured, use its generative audio:
python
# `coll` is assumed to be an existing VideoDB collection object.

# Voice generation
audio = coll.generate_voice(text="Your narration here", voice="alloy")

# Music generation
music = coll.generate_music(prompt="upbeat electronic background music", duration=30)

# Sound effects
sfx = coll.generate_sound_effect(prompt="thunder crack followed by rain")

Cost Estimation

Before generating, check estimated cost:
estimate_cost(
  estimate_type: "unit_price",
  endpoints: {
    "fal-ai/nano-banana-pro": {
      "unit_quantity": 1
    }
  }
)
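
Once a unit price is known, a budget check before generating is simple arithmetic. The sketch below assumes the unit price has already been read from the `estimate_cost` response, whose exact shape the source does not show. `estimate_total` and `within_budget` are hypothetical helpers, and the prices in the usage note are made up for illustration.

```python
def estimate_total(unit_price: float, unit_quantity: int) -> float:
    """Total cost for `unit_quantity` units at `unit_price` each."""
    if unit_price < 0 or unit_quantity < 0:
        raise ValueError("unit_price and unit_quantity must be non-negative")
    return unit_price * unit_quantity


def within_budget(unit_price: float, unit_quantity: int, budget: float) -> bool:
    """True when the estimated total does not exceed `budget`."""
    return estimate_total(unit_price, unit_quantity) <= budget
```

For example, with an illustrative unit price of 0.25 and four images, `within_budget(0.25, 4, budget=1.0)` holds, while three units at 0.5 would exceed the same budget.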

Model Discovery

Find models for specific tasks:
search(query: "text to video")
find(endpoint_ids: ["fal-ai/seedance-1-0-pro"])
models()

Tips

  • Use seed for reproducible results when iterating on prompts
  • Start with lower-cost models (Nano Banana 2) for prompt iteration, then switch to Pro for finals
  • For video, keep prompts descriptive but concise; focus on motion and scene
  • Image-to-video produces more controlled results than pure text-to-video
  • Check estimate_cost before running expensive video generations
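
The draft-then-final tip can be captured in a small planner that keeps the prompt and seed fixed while switching models. This is a sketch with a hypothetical helper name, `plan_generation`. A shared seed makes runs on the same model repeatable, but it should not be assumed to produce matching images across different models.

```python
DRAFT_MODEL = "fal-ai/nano-banana-2"
FINAL_MODEL = "fal-ai/nano-banana-pro"


def plan_generation(prompt: str, seed: int, final: bool = False) -> dict:
    """Describe a generate call: drafts use the fast model, finals the Pro model."""
    return {
        "app_id": FINAL_MODEL if final else DRAFT_MODEL,
        "input_data": {"prompt": prompt, "seed": seed, "num_images": 1},
    }
```

The returned dict mirrors the `app_id` and `input_data` arguments of the `generate` calls shown earlier.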

Related Skills

  • videodb: Video processing, editing, and streaming
  • video-editing: AI-powered video editing workflows
  • content-engine: Content creation for social platforms