video-agent
Video Agent - AI Content Generation Suite
A comprehensive AI content generation package providing a unified interface across 35+ models for image, video, and audio creation.
When to Use This Skill
- Text-to-image generation
- Image-to-image transformations
- Text-to-video creation
- Image-to-video animation
- Professional text-to-speech
- Multi-step content pipelines
- Batch content generation
Supported Providers
FAL AI
- FLUX models (text-to-image)
- Image transformations
- Fast inference
Google Vertex AI
- Imagen 4 (text-to-image)
- Veo (text-to-video)
- High quality outputs
ElevenLabs
- 20+ voice options
- Professional TTS
- Multiple languages
OpenRouter
- Access to various LLMs
- Text generation
- Content writing
Core Capabilities
Image Generation
    Generate image:
    Prompt: "A serene Japanese garden at sunset"
    Model: flux-pro
    Size: 1024x1024
    Style: photorealistic
Available Models:
- FLUX Pro/Dev (FAL)
- Imagen 4 (Google)
- Stable Diffusion variants
Video Creation
    Generate video:
    Prompt: "Ocean waves crashing on rocky shore"
    Model: veo
    Duration: 5 seconds
    Resolution: 1080p
Available Models:
- Google Veo
- MiniMax Hailuo
- Kling
Image-to-Video
    Animate image:
    Source: /path/to/image.png
    Motion: "gentle zoom out with particle effects"
    Duration: 4 seconds
Text-to-Speech
    Generate audio:
    Text: "Welcome to our product demo..."
    Voice: professional-female-1
    Speed: 1.0
    Output: welcome.mp3
Voice Options:
- Professional male/female
- Casual conversational
- Narrator styles
- Multiple accents
Pipeline Orchestration
YAML Configuration
yaml
    pipeline: product-demo
    steps:
      - name: generate-logo
        type: image
        model: flux-pro
        prompt: "Modern tech logo for AI startup"
      - name: create-intro
        type: video
        model: veo
        prompt: "Logo animation reveal"
      - name: add-voiceover
        type: audio
        model: elevenlabs
        text: "Introducing the future of AI..."
        voice: professional-male
      - name: combine
        type: merge
        inputs: [create-intro, add-voiceover]
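The `inputs` field on the `combine` step implies a dependency order: a step can only run once every step it lists has finished. The following is a minimal sketch of how such an order could be derived, assuming `inputs` always refers to other steps by `name`. The `execution_order` helper is hypothetical and not part of the package; the steps are written as Python dicts rather than loaded from YAML to keep the example dependency-free.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Steps mirror the YAML pipeline above; only fields relevant to ordering are kept.
steps = [
    {"name": "generate-logo", "type": "image"},
    {"name": "create-intro", "type": "video"},
    {"name": "add-voiceover", "type": "audio"},
    {"name": "combine", "type": "merge",
     "inputs": ["create-intro", "add-voiceover"]},
]


def execution_order(steps):
    """Return step names in an order that respects each step's `inputs`."""
    # Map every step to the set of steps it depends on.
    graph = {step["name"]: set(step.get("inputs", [])) for step in steps}

    # Reject references to steps that are not defined in the pipeline.
    referenced = {dep for deps in graph.values() for dep in deps}
    unknown = referenced - set(graph)
    if unknown:
        raise ValueError(f"unknown step(s) in inputs: {sorted(unknown)}")

    # static_order() raises graphlib.CycleError if the steps depend on each other.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(steps)
print(order)
```

Steps with no `inputs` have no ordering constraint between them, so a scheduler built on this idea would be free to run them concurrently.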
JSON Configuration
json
    {
      "pipeline": "social-content",
      "parallel": true,
      "steps": [
        {
          "type": "image",
          "variants": 4,
          "prompt": "Product hero shot"
        }
      ]
    }
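One plausible reading of `variants` is that a single step fans out into that many independent generation jobs. The sketch below illustrates that reading only; the `expand_jobs` helper and the added `variant` field are hypothetical, and the package may interpret the field differently.

```python
import json

# The JSON configuration shown above, embedded as a string for a self-contained example.
CONFIG_TEXT = """
{
  "pipeline": "social-content",
  "parallel": true,
  "steps": [
    {"type": "image", "variants": 4, "prompt": "Product hero shot"}
  ]
}
"""


def expand_jobs(config):
    """Turn each step into one job per requested variant (assumed semantics)."""
    jobs = []
    for step in config["steps"]:
        count = step.get("variants", 1)  # steps without `variants` yield one job
        for number in range(1, count + 1):
            job = {key: value for key, value in step.items() if key != "variants"}
            job["variant"] = number  # hypothetical field to tell the jobs apart
            jobs.append(job)
    return jobs


config = json.loads(CONFIG_TEXT)
jobs = expand_jobs(config)
print(len(jobs), "jobs")
```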
Cost Management
Real-time Estimation
    Estimate cost for:
    - 10 images (1024x1024)
    - 2 videos (5 seconds)
    - 1 audio (60 seconds)
    Estimated: $2.45
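The arithmetic behind an estimate like this is a sum of unit price times quantity. The sketch below shows the calculation under loud assumptions: the unit prices are made up purely so that the example reproduces the $2.45 figure, and they are not real provider prices. The `estimate_cost` helper is likewise hypothetical.

```python
from decimal import Decimal

# HYPOTHETICAL unit prices, chosen only to reproduce the $2.45 example above.
PRICES = {
    "image": Decimal("0.05"),    # per 1024x1024 image
    "video": Decimal("0.15"),    # per second of video
    "audio": Decimal("0.0075"),  # per second of audio
}


def estimate_cost(jobs):
    """Sum unit price * count * seconds over all jobs (seconds defaults to 1)."""
    total = Decimal("0")
    for job in jobs:
        seconds = job.get("seconds", 1)  # images are priced per item, not per second
        total += PRICES[job["type"]] * job["count"] * seconds
    return total.quantize(Decimal("0.01"))  # round to whole cents


jobs = [
    {"type": "image", "count": 10},
    {"type": "video", "count": 2, "seconds": 5},
    {"type": "audio", "count": 1, "seconds": 60},
]
estimate = estimate_cost(jobs)
print(f"Estimated: ${estimate}")  # prints: Estimated: $2.45
```

`Decimal` is used instead of `float` so that currency sums do not pick up binary rounding error.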
Budget Limits
yaml
    budget:
      max_per_job: $5.00
      max_daily: $50.00
      alert_threshold: 80%
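A rough sketch of how these three limits could be enforced before a job is submitted. It assumes `alert_threshold` is a percentage of `max_daily`, which the configuration does not state explicitly, and the `check_budget` helper with its return values is hypothetical.

```python
from decimal import Decimal

# Values copied from the budget block above; YAML would load them as strings.
BUDGET = {"max_per_job": "$5.00", "max_daily": "$50.00", "alert_threshold": "80%"}


def parse_money(value):
    """Convert a string such as '$5.00' to Decimal('5.00')."""
    return Decimal(value.lstrip("$"))


def parse_percent(value):
    """Convert a string such as '80%' to Decimal('0.8')."""
    return Decimal(value.rstrip("%")) / 100


def check_budget(job_cost, spent_today, budget=BUDGET):
    """Return 'reject_job', 'reject_daily', 'alert', or 'ok' for a proposed job."""
    max_per_job = parse_money(budget["max_per_job"])
    max_daily = parse_money(budget["max_daily"])
    threshold = parse_percent(budget["alert_threshold"])

    if job_cost > max_per_job:
        return "reject_job"
    projected = spent_today + job_cost
    if projected > max_daily:
        return "reject_daily"
    # Assumption: the alert fires once projected spend reaches 80% of the daily cap.
    if projected >= max_daily * threshold:
        return "alert"
    return "ok"


print(check_budget(Decimal("2.45"), Decimal("0")))
```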
Performance Features
Parallel Execution
    Generate 10 image variants in parallel
    Threads: 4
    Expected speedup: 2-3x
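Generation requests spend most of their time waiting on a remote API, so a small thread pool is a natural fit. The following sketch shows the general pattern with Python's standard `concurrent.futures`. The `generate_variant` function is a placeholder standing in for a real generation call, not the package's API.

```python
from concurrent.futures import ThreadPoolExecutor


def generate_variant(index):
    """Placeholder for a real (network-bound) generation call."""
    return f"variant-{index}.png"


def generate_variants(count, threads=4):
    """Run `count` generation calls on a pool of `threads` worker threads."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # pool.map returns results in input order, regardless of completion order.
        return list(pool.map(generate_variant, range(count)))


results = generate_variants(10, threads=4)
print(len(results), "variants generated")
```

With 4 threads and 10 jobs, the work proceeds in roughly three waves, which is consistent with a speedup below the thread count.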
Caching
- Automatic prompt caching
- Reuse similar generations
- Reduce redundant API calls
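As an illustration of prompt caching, the sketch below keys results on a hash of the model, the normalized prompt, and any extra parameters, so a repeated request skips the API call. It is an assumption about how such a cache could work, not the package's implementation: the `PromptCache` class is hypothetical, and it only matches prompts that are identical after case and whitespace normalization, not semantically similar ones.

```python
import hashlib
import json


class PromptCache:
    """In-memory cache keyed by model, normalized prompt, and parameters."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model, prompt, params):
        # Lowercase and collapse whitespace so trivially different prompts match.
        normalized = " ".join(prompt.lower().split())
        payload = json.dumps(
            {"model": model, "prompt": normalized, "params": params},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_generate(self, model, prompt, generate, **params):
        """Return a cached result, or call `generate(prompt)` and store it."""
        key = self._key(model, prompt, params)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = generate(prompt)
        self._store[key] = result
        return result


cache = PromptCache()
first = cache.get_or_generate("flux-pro", "Sunset over mountains", lambda p: f"image for {p}")
second = cache.get_or_generate("flux-pro", "sunset  over mountains", lambda p: f"image for {p}")
print(cache.hits, cache.misses)
```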
CLI Commands
bash
    # Image generation
    video-agent image "prompt" --model flux-pro --size 1024

    # Video generation
    video-agent video "prompt" --model veo --duration 5

    # Audio generation
    video-agent audio "text" --voice professional-female

    # Pipeline execution
    video-agent pipeline config.yaml

    # Cost check
    video-agent cost --estimate
Python API
python
    from video_agent import ImageGenerator, VideoGenerator

    # Generate image
    img = ImageGenerator(model="flux-pro")
    result = img.generate("sunset over mountains")

    # Generate video
    vid = VideoGenerator(model="veo")
    result = vid.generate("timelapse of clouds")
Setup
1. Install Package
bash
    pip install video-agent-claude-skill
2. Configure API Keys
bash
    export FAL_API_KEY="your-key"
    export GOOGLE_VERTEX_KEY="your-key"
    export ELEVENLABS_API_KEY="your-key"
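Before running a long pipeline, it can help to confirm that the keys are actually visible to the process. This is a small sketch using only the three variable names listed above; the `missing_keys` helper is hypothetical and independent of the `video-agent status` command.

```python
import os

# Environment variable names taken from the export commands above.
REQUIRED_KEYS = ("FAL_API_KEY", "GOOGLE_VERTEX_KEY", "ELEVENLABS_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing API keys:", ", ".join(missing))
else:
    print("All API keys are set.")
```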
3. Verify Setup
bash
    video-agent status
Use Cases
- Marketing: Product images, promo videos
- Social Media: Content at scale
- Education: Explainer videos, voiceovers
- Prototyping: Visual concepts, mockups
- Automation: Batch content pipelines
Credits
Created by donghaozhang. Licensed under MIT.