video-agent

Video Agent - AI Content Generation Suite

A comprehensive AI content generation package providing a unified interface across 35+ models for image, video, and audio creation.

When to Use This Skill

Text-to-image generation
Image-to-image transformations
Text-to-video creation
Image-to-video animation
Professional text-to-speech
Multi-step content pipelines
Batch content generation

Supported Providers

FAL AI

FLUX models (text-to-image)
Image transformations
Fast inference

Google Vertex AI

Imagen 4 (text-to-image)
Veo (text-to-video)
High quality outputs

ElevenLabs

20+ voice options
Professional TTS
Multiple languages

OpenRouter

Access to various LLMs
Text generation
Content writing

Core Capabilities

Image Generation

Generate image:
Prompt: "A serene Japanese garden at sunset"
Model: flux-pro
Size: 1024x1024
Style: photorealistic

Available Models:

FLUX Pro/Dev (FAL)
Imagen 4 (Google)
Stable Diffusion variants

Video Creation

Generate video:
Prompt: "Ocean waves crashing on rocky shore"
Model: veo
Duration: 5 seconds
Resolution: 1080p

Available Models:

Google Veo
MiniMax Hailuo
Kling

Image-to-Video

Animate image:
Source: /path/to/image.png
Motion: "gentle zoom out with particle effects"
Duration: 4 seconds

Text-to-Speech

Generate audio:
Text: "Welcome to our product demo..."
Voice: professional-female-1
Speed: 1.0
Output: welcome.mp3

Voice Options:

Professional male/female
Casual conversational
Narrator styles
Multiple accents

Pipeline Orchestration

YAML Configuration

yaml

pipeline: product-demo
steps:
  - name: generate-logo
    type: image
    model: flux-pro
    prompt: "Modern tech logo for AI startup"

  - name: create-intro
    type: video
    model: veo
    prompt: "Logo animation reveal"

  - name: add-voiceover
    type: audio
    model: elevenlabs
    text: "Introducing the future of AI..."
    voice: professional-male

  - name: combine
    type: merge
    inputs: [create-intro, add-voiceover]
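
Because the `combine` step refers to other steps by name, it can help to check those references before running anything. The following is a minimal sketch, not part of the package: `validate_pipeline` is a hypothetical helper, the pipeline is mirrored as a plain dict so no YAML parser is required, and the rule that `inputs` may only name earlier steps is an assumption about how the pipeline is ordered.

```python
from typing import Any

# Mirror of the YAML pipeline above as a plain dict (no YAML parser needed).
PIPELINE: dict[str, Any] = {
    "pipeline": "product-demo",
    "steps": [
        {"name": "generate-logo", "type": "image", "model": "flux-pro",
         "prompt": "Modern tech logo for AI startup"},
        {"name": "create-intro", "type": "video", "model": "veo",
         "prompt": "Logo animation reveal"},
        {"name": "add-voiceover", "type": "audio", "model": "elevenlabs",
         "text": "Introducing the future of AI...", "voice": "professional-male"},
        {"name": "combine", "type": "merge",
         "inputs": ["create-intro", "add-voiceover"]},
    ],
}


def validate_pipeline(config: dict[str, Any]) -> list[str]:
    """Hypothetical check: return a list of problems, empty if none are found."""
    errors: list[str] = []
    seen: set[str] = set()
    for step in config.get("steps", []):
        name = step.get("name", "<unnamed>")
        if name in seen:
            errors.append(f"duplicate step name: {name}")
        # Assumption: every input must name a step defined earlier in the list.
        for ref in step.get("inputs", []):
            if ref not in seen:
                errors.append(f"step {name!r} references unknown or later step {ref!r}")
        seen.add(name)
    return errors
```

Calling `validate_pipeline(PIPELINE)` on the configuration above yields an empty list, since both inputs of `combine` are defined before it.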

JSON Configuration

json

{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {
      "type": "image",
      "variants": 4,
      "prompt": "Product hero shot"
    }
  ]
}
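
One way to read the `variants` field is as a request for several independent jobs built from the same step. The sketch below illustrates that reading using only the standard library; `expand_variants` and the `variant_index` field are hypothetical names introduced here, and the actual semantics of `variants` in the package may differ.

```python
import json

# The JSON configuration above, embedded as text so the example is self-contained.
CONFIG_TEXT = """
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {"type": "image", "variants": 4, "prompt": "Product hero shot"}
  ]
}
"""


def expand_variants(config: dict) -> list[dict]:
    """Expand each step into one job per variant (assumed meaning of `variants`)."""
    jobs = []
    for step in config["steps"]:
        count = step.get("variants", 1)  # steps without the field yield one job
        for index in range(count):
            job = {k: v for k, v in step.items() if k != "variants"}
            job["variant_index"] = index  # hypothetical field, for illustration only
            jobs.append(job)
    return jobs


config = json.loads(CONFIG_TEXT)
jobs = expand_variants(config)
```

Under this assumption the single image step becomes four jobs, which is the kind of workload the `parallel` flag would then apply to.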

Cost Management

Real-time Estimation

Estimate cost for:
- 10 images (1024x1024)
- 2 videos (5 seconds)
- 1 audio (60 seconds)

Estimated: $2.45
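
The arithmetic behind an estimate like this is a sum of quantity times unit price per media type. The sketch below shows that calculation with `decimal` to avoid floating-point rounding. The unit prices are invented placeholders, picked only so the sum reproduces the figure above; they are not real provider rates, and `estimate_cost` is a hypothetical function.

```python
from decimal import Decimal

# HYPOTHETICAL unit prices, not real provider rates.
PRICE_PER_IMAGE = Decimal("0.05")           # per 1024x1024 image
PRICE_PER_VIDEO_SECOND = Decimal("0.18")    # per second of generated video
PRICE_PER_AUDIO_SECOND = Decimal("0.0025")  # per second of generated audio


def estimate_cost(images: int, video_seconds: int, audio_seconds: int) -> Decimal:
    """Sum quantity * unit price for each media type, rounded to cents."""
    total = (
        PRICE_PER_IMAGE * images
        + PRICE_PER_VIDEO_SECOND * video_seconds
        + PRICE_PER_AUDIO_SECOND * audio_seconds
    )
    return total.quantize(Decimal("0.01"))


# 10 images, 2 videos of 5 seconds each (10 seconds total), 60 seconds of audio.
estimate = estimate_cost(images=10, video_seconds=2 * 5, audio_seconds=60)
```

With these placeholder rates the three components are 0.50, 1.80, and 0.15.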

Budget Limits

yaml

budget:
  max_per_job: $5.00
  max_daily: $50.00
  alert_threshold: 80%
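
A rough sketch of how limits like these could be enforced before a job starts is shown below. It is an illustration under assumptions, not the package's implementation: `check_budget` and its return values are hypothetical, and the alert threshold is assumed to be a fraction of the daily limit.

```python
from decimal import Decimal

# The budget values above, kept as strings exactly as written in the config.
BUDGET = {"max_per_job": "$5.00", "max_daily": "$50.00", "alert_threshold": "80%"}


def parse_money(value: str) -> Decimal:
    """Turn a string like '$5.00' into Decimal('5.00')."""
    return Decimal(value.lstrip("$"))


def parse_percent(value: str) -> Decimal:
    """Turn a string like '80%' into Decimal('0.8')."""
    return Decimal(value.rstrip("%")) / Decimal(100)


def check_budget(job_cost: Decimal, spent_today: Decimal, budget: dict = BUDGET) -> str:
    """Hypothetical policy: return 'reject', 'alert', or 'ok' for a proposed job."""
    max_job = parse_money(budget["max_per_job"])
    max_daily = parse_money(budget["max_daily"])
    threshold = parse_percent(budget["alert_threshold"])
    projected = spent_today + job_cost
    if job_cost > max_job or projected > max_daily:
        return "reject"
    # Assumption: the alert fires once projected spend reaches 80% of the daily cap.
    if projected >= max_daily * threshold:
        return "alert"
    return "ok"
```

For example, a $2.45 job on a day with $38.00 already spent would project to $40.45, which crosses the assumed $40.00 alert line without exceeding either hard limit.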

Performance Features

Parallel Execution

Generate 10 image variants in parallel
Threads: 4
Expected speedup: 2-3x
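
Generation calls spend most of their time waiting on a remote API, so a thread pool is one plausible way to run variants concurrently. The sketch below uses the standard library's `ThreadPoolExecutor` with four workers; `generate_variant` is a stand-in that only returns a file name, and nothing here reflects how the package schedules work internally.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_variant(index: int) -> str:
    """Stand-in for a network-bound generation call; returns a fake file name."""
    return f"variant-{index:02d}.png"


def generate_all(count: int = 10, threads: int = 4) -> list[str]:
    """Run `count` generation calls on a pool of `threads` worker threads."""
    # executor.map returns results in input order even though calls overlap.
    with ThreadPoolExecutor(max_workers=threads) as executor:
        return list(executor.map(generate_variant, range(count)))
```

Any real speedup depends on provider latency and rate limits, so the 2-3x figure should be treated as an expectation rather than a guarantee.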

Caching

Automatic prompt caching
Reuse similar generations
Reduce redundant API calls
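
The idea of prompt caching can be sketched as a lookup keyed on the model and a normalized prompt. The example below is a simplified illustration: `PromptCache` is a hypothetical class, and it only matches prompts that are identical after lowercasing and whitespace collapsing, which is a narrower notion than the "similar generations" mentioned above.

```python
import hashlib
from typing import Callable


class PromptCache:
    """Hypothetical in-memory cache keyed on (model, normalized prompt)."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(model: str, prompt: str) -> str:
        # Normalize case and whitespace so trivially different prompts collide.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(f"{model}\n{normalized}".encode("utf-8")).hexdigest()

    def get_or_generate(self, model: str, prompt: str,
                        generate: Callable[[str], str]) -> str:
        """Return the cached result, or call `generate` once and store its result."""
        cache_key = self.key(model, prompt)
        if cache_key in self._store:
            self.hits += 1
            return self._store[cache_key]
        self.misses += 1
        result = generate(prompt)
        self._store[cache_key] = result
        return result
```

With this design, requesting "Sunset over  mountains" and then "sunset over mountains" for the same model triggers only one call to the generator.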

CLI Commands

bash

# Image generation
video-agent image "prompt" --model flux-pro --size 1024

# Video generation
video-agent video "prompt" --model veo --duration 5

# Audio generation
video-agent audio "text" --voice professional-female

# Pipeline execution
video-agent pipeline config.yaml

# Cost check
video-agent cost --estimate

Python API

python

from video_agent import ImageGenerator, VideoGenerator

# Generate image
img = ImageGenerator(model="flux-pro")
result = img.generate("sunset over mountains")

# Generate video
vid = VideoGenerator(model="veo")
result = vid.generate("timelapse of clouds")

Setup

1. Install Package

bash

pip install video-agent-claude-skill

2. Configure API Keys

bash

export FAL_API_KEY="your-key"
export GOOGLE_VERTEX_KEY="your-key"
export ELEVENLABS_API_KEY="your-key"
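
Before running a job, it can be useful to confirm from Python that these variables are actually set. The following is a small sketch using only the standard library; `missing_keys` is a hypothetical helper, and it checks just the three variable names listed above.

```python
import os
from collections.abc import Mapping

# The environment variable names from the export commands above.
REQUIRED_KEYS = ("FAL_API_KEY", "GOOGLE_VERTEX_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in `env`."""
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]
```

Calling `missing_keys()` with no arguments inspects the current process environment; passing a dict makes the check easy to exercise in tests.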

3. Verify Setup

bash

video-agent status

Use Cases

Marketing: Product images, promo videos
Social Media: Content at scale
Education: Explainer videos, voiceovers
Prototyping: Visual concepts, mockups
Automation: Batch content pipelines

Credits

Created by donghaozhang. Licensed under MIT.
