ai-video-gen
AI Video Generation Skill
Generate complete videos from text descriptions using AI.
Capabilities
- Image Generation - DALL-E 3, Stable Diffusion, Flux
- Video Generation - LumaAI, Runway, Replicate models
- Voice-over - OpenAI TTS, ElevenLabs
- Video Editing - FFmpeg assembly, transitions, overlays
Quick Start
Generate a complete video
python skills/ai-video-gen/generate_video.py --prompt "A sunset over mountains" --output sunset.mp4
Just images to video
python skills/ai-video-gen/images_to_video.py --images img1.png img2.png --output result.mp4
Add voiceover
python skills/ai-video-gen/add_voiceover.py --video input.mp4 --text "Your narration" --output final.mp4
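When the generator is driven from another Python program rather than a shell, the command line can be assembled as a list and passed to `subprocess`. The following is a minimal sketch, assuming the script path and the `--prompt`, `--output`, `--duration`, and `--voiceover` flags behave as shown in this README; the helper names `build_generate_command` and `generate` are hypothetical and not part of the skill.

```python
import subprocess
import sys

# Path as used in the README's commands, relative to the working directory.
SCRIPT = "skills/ai-video-gen/generate_video.py"


def build_generate_command(prompt, output, duration=None, voiceover=None):
    """Return the argv list for one generate_video.py run (hypothetical helper)."""
    cmd = [sys.executable, SCRIPT, "--prompt", prompt, "--output", output]
    if duration is not None:
        cmd += ["--duration", str(duration)]
    if voiceover:
        cmd += ["--voiceover", voiceover]
    return cmd


def generate(prompt, output, **options):
    """Run the script; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(build_generate_command(prompt, output, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.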
Setup
Required API Keys
Add to your environment or .env file:
Image Generation (pick one)
OPENAI_API_KEY=sk-... # DALL-E 3
REPLICATE_API_TOKEN=r8_... # Stable Diffusion, Flux
Video Generation (pick one)
LUMAAI_API_KEY=luma_... # LumaAI Dream Machine
RUNWAY_API_KEY=... # Runway ML
REPLICATE_API_TOKEN=r8_... # Multiple models
Voice (optional)
OPENAI_API_KEY=sk-... # OpenAI TTS
ELEVENLABS_API_KEY=... # ElevenLabs
Or use FREE local options (no API needed)
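Since only one key per stage is needed, a script has to decide which provider to use from whatever keys are present. The sketch below shows one possible way to do that, assuming the environment variable names listed above. The provider labels, the priority order, and the `"local"` fallback are illustrative assumptions; the skill's actual selection logic is not documented here.

```python
import os

# (environment variable, provider label) pairs per stage.
# The order is a hypothetical preference; the README only says "pick one".
IMAGE_PROVIDERS = [("OPENAI_API_KEY", "dall-e-3"), ("REPLICATE_API_TOKEN", "replicate")]
VIDEO_PROVIDERS = [
    ("LUMAAI_API_KEY", "lumaai"),
    ("RUNWAY_API_KEY", "runway"),
    ("REPLICATE_API_TOKEN", "replicate"),
]
VOICE_PROVIDERS = [("OPENAI_API_KEY", "openai-tts"), ("ELEVENLABS_API_KEY", "elevenlabs")]


def pick_provider(candidates, env, fallback="local"):
    """Return the first provider whose key is set, else the local fallback."""
    for key, name in candidates:
        if env.get(key):
            return name
    return fallback


def select_providers(env=None):
    """Map each pipeline stage to a provider based on the available keys."""
    env = os.environ if env is None else env
    return {
        "image": pick_provider(IMAGE_PROVIDERS, env),
        "video": pick_provider(VIDEO_PROVIDERS, env),
        "voice": pick_provider(VOICE_PROVIDERS, env),
    }
```

If the keys live in a .env file, calling `load_dotenv()` from python-dotenv before `select_providers()` loads them into `os.environ`.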
Install Dependencies
pip install openai requests pillow replicate python-dotenv
FFmpeg
Already installed via winget.
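On a machine where FFmpeg may not be installed yet, a quick check before running the pipeline avoids a failure late in the assembly step. This is a small sketch using only the standard library; `require_ffmpeg` and `ffmpeg_version` are hypothetical helper names, not functions shipped with the skill.

```python
import shutil
import subprocess


def require_ffmpeg(which=shutil.which):
    """Return the path to the ffmpeg executable or raise if it is not on PATH."""
    path = which("ffmpeg")
    if path is None:
        raise RuntimeError(
            "FFmpeg not found on PATH; install it (for example via winget on Windows)."
        )
    return path


def ffmpeg_version(path):
    """Return the first line printed by `ffmpeg -version`."""
    result = subprocess.run([path, "-version"], capture_output=True, text=True, check=True)
    return result.stdout.splitlines()[0]
```

The `which` parameter exists only so the lookup can be replaced in tests.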
Usage Examples
使用示例
1. Text to Video (Full Pipeline)
python skills/ai-video-gen/generate_video.py \
--prompt "A futuristic city at night with flying cars" \
--duration 5 \
--voiceover "Welcome to the future" \
--output future_city.mp4
2. Multiple Scenes
python skills/ai-video-gen/multi_scene.py \
--scenes "Morning sunrise" "Busy city street" "Peaceful night" \
--duration 3 \
--output day_in_life.mp4
3. Image Sequence to Video
python skills/ai-video-gen/images_to_video.py \
--images frame1.png frame2.png frame3.png \
--fps 24 \
--output animation.mp4
Workflow Options
Budget Mode (FREE)
- Image: Stable Diffusion (local or free API)
- Video: Open source models
- Voice: free TTS, or OpenAI TTS (low cost, but not free)
- Edit: FFmpeg
Quality Mode (Paid)
- Image: DALL-E 3 or Midjourney
- Video: Runway Gen-3 or LumaAI
- Voice: ElevenLabs
- Edit: FFmpeg + effects
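The two modes above can be captured as named presets so that a script switches between them with a single option. The sketch below is one way to represent them; the `WORKFLOWS` table and `get_workflow` helper are hypothetical, and the values simply restate the lists above.

```python
# Presets mirroring the Budget and Quality lists; names are illustrative only.
WORKFLOWS = {
    "budget": {
        "image": "Stable Diffusion",
        "video": "open source models",
        "voice": "OpenAI TTS or free TTS",
        "edit": "FFmpeg",
    },
    "quality": {
        "image": "DALL-E 3 or Midjourney",
        "video": "Runway Gen-3 or LumaAI",
        "voice": "ElevenLabs",
        "edit": "FFmpeg + effects",
    },
}


def get_workflow(mode):
    """Return the preset for a mode, rejecting unknown names early."""
    try:
        return WORKFLOWS[mode]
    except KeyError:
        raise ValueError(f"unknown mode {mode!r}; expected one of {sorted(WORKFLOWS)}") from None
```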
Scripts Reference
- generate_video.py - Main end-to-end generator
- images_to_video.py - Convert image sequence to video
- add_voiceover.py - Add narration to existing video
- multi_scene.py - Create multi-scene videos
- edit_video.py - Apply effects, transitions, overlays
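To illustrate what the image-sequence step involves, here is a rough sketch of how a list of images and an `--fps` value could be turned into an FFmpeg invocation using FFmpeg's concat demuxer. It is not the code of `images_to_video.py`, whose internals are not shown in this README; the function names are hypothetical. Paths containing single quotes would need escaping, which this sketch omits.

```python
def build_concat_list(images, fps):
    """Return the text of an FFmpeg concat-demuxer list showing each image for 1/fps seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    frame_duration = 1 / fps
    lines = []
    for path in images:
        lines.append(f"file '{path}'")
        lines.append(f"duration {frame_duration:.6f}")
    if images:
        # The concat demuxer needs the last file listed again, without a
        # duration, for the final image's duration to take effect.
        lines.append(f"file '{images[-1]}'")
    return "\n".join(lines) + "\n"


def build_ffmpeg_command(list_path, output, fps):
    """Return the argv list that encodes the concat list into an H.264 MP4."""
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0", "-i", list_path,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-r", str(fps),
        output,
    ]
```

Write the list text to a file, then pass that file's path as `list_path`. Relative image paths in the list are resolved against the list file's directory.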
API Cost Estimates
- DALL-E 3: ~$0.04-0.08 per image
- Replicate: ~$0.01-0.10 per generation
- LumaAI: $0-0.50 per 5 seconds (free tier available)
- Runway: ~$0.05 per second
- OpenAI TTS: ~$0.015 per 1K characters
- ElevenLabs: ~$0.30 per 1K characters (better quality)
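The per-unit figures above can be combined into a rough estimate for a whole video. The sketch below assumes one DALL-E 3 image per scene, Runway for video, and OpenAI TTS for narration. These choices and the `estimate_cost` helper are illustrative assumptions, and the prices are the approximate values listed above, not quoted rates.

```python
# Approximate (low, high) prices in USD, taken from the list above.
DALLE3_PER_IMAGE = (0.04, 0.08)
RUNWAY_PER_SECOND = 0.05
OPENAI_TTS_PER_1K_CHARS = 0.015


def estimate_cost(scenes, seconds_per_scene, narration_chars):
    """Return a (low, high) USD estimate for one multi-scene video."""
    video = scenes * seconds_per_scene * RUNWAY_PER_SECOND
    voice = narration_chars / 1000 * OPENAI_TTS_PER_1K_CHARS
    low = scenes * DALLE3_PER_IMAGE[0] + video + voice
    high = scenes * DALLE3_PER_IMAGE[1] + video + voice
    return low, high
```

For example, three scenes of 3 seconds each with 200 characters of narration come to roughly $0.57 to $0.69 under these assumptions. Video generation accounts for $0.45 of that total.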
Examples
See the examples/ folder for sample outputs and prompts.