> video-producer-agent
Use this skill to create complete videos with voiceover and music. Triggers: "create video", "product video", "explainer video", "promo video", "demo video", "training video", "ad video", "commercial", "marketing video", "video with voiceover", "video with music", "brand video", "testimonial video" Orchestrates: script, voiceover, background music, video clips/images, and final assembly.
curl "https://skillshub.wtf/michaelboeding/skills/video-producer-agent?format=md"Video Producer
Create complete videos with voiceover, music, and visuals.
This is an orchestrator skill that combines:
- Script/storyboard generation (Claude)
- Voiceover synthesis (Gemini TTS)
- Background music (Lyria)
- Video clip generation (Veo 3.1) or image animation
- Final assembly (FFmpeg via media-utils)
Workflow
Step 1: Gather Requirements (REQUIRED)
⚠️ DO NOT skip this step. DO NOT run init_project.py until you have ALL answers.
Use interactive questioning — ask ONE question at a time, wait for the response, then ask the next. This creates a collaborative spec-driven process.
Question Flow
⚠️ Use the AskUserQuestion tool for each question below. Do not just print questions in your response — use the tool to create interactive prompts with the options shown.
Q1: Subject
"I'll create that video! First — what's it about?
(e.g., product launch, brand story, tutorial, explainer — or describe your own)"
Wait for response.
Q2: Duration
"How long should the video be?
- 15 seconds (quick hook)
- 30 seconds (standard ad)
- 60 seconds (explainer)
- 2+ minutes (detailed)
- Or specify your own duration"
Wait for response.
Q3: Style
"What visual style?
- Premium/luxury
- Fun/playful
- Corporate/professional
- Dramatic/cinematic
- Minimal/clean
- Or describe your own style"
Wait for response.
Q4: Assets
"Do you have existing images or video clips to use?
- No, generate everything
- Yes, I have images (provide paths)
- Yes, I have video clips (provide paths)"
Wait for response.
Q5: Audio Strategy
"How should we handle audio?
- Custom — I generate voiceover + background music
- Veo native — Use Veo's built-in dialogue/SFX/ambient
- Silent — No audio, add later"
Wait for response.
Q6: Voice (if custom audio)
"What voice tone for the voiceover?
- Professional
- Friendly/warm
- Energetic
- Calm/soothing
- Dramatic
- Or describe your own tone"
Wait for response.
Q7: Music (if custom audio)
"What music vibe?
- Modern electronic
- Cinematic/epic
- Upbeat pop
- Ambient/chill
- Corporate
- Or describe your own style"
Wait for response.
Q8: Format
"What aspect ratio?
- 16:9 (YouTube, web)
- 9:16 (TikTok, Reels, Shorts)
- 1:1 (Instagram feed)"
Wait for response.
Q9: Resolution
"What resolution?
- 720p (faster generation)
- 1080p (standard HD)"
Wait for response.
Q10: Model
"Which Veo model?
veo-3.1— Latest, highest quality (default)veo-3.1-fast— Faster generation, slightly lower qualityveo-3— Previous generationveo-3-fast— Previous gen, faster"
Wait for response.
Quick Reference
| Question | Determines |
|---|---|
| Subject | Scene content and prompts |
| Duration | Scene count (Veo clips must be 4, 6, or 8 seconds) |
| Style | Visual prompts and music selection |
| Assets | Generate vs use existing |
| Audio | custom, veo_audio, or silent |
| Voice | TTS voice selection |
| Music | Lyria prompt |
| Format | Aspect ratio for Veo |
| Resolution | 720p or 1080p output quality |
| Model | veo-3.1, veo-3.1-fast, veo-3, veo-3-fast |
Step 2: Initialize Project
Once you have the user's answers, initialize the project with their preferences:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/init_project.py \
--name "Product Launch Video" \
--duration 30 \
--aspect-ratio 16:9 \
--audio-strategy custom \
--scenes 5
Step 3: Configure project.json
Edit project.json with scene prompts, voiceover text, and music style based on user's answers.
Step 4: Assemble the Video
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-producer-agent/scripts/assemble.py \
--project ~/my_video_project/
Project Structure
When you initialize a project, this folder structure is created:
my_project/
├── project.json # Configuration: scenes, voiceover, music, settings
├── storyboard.md # Planning document for the video
├── scenes/ # Generated video clips from Veo
│ ├── scene1_intro.mp4
│ ├── scene2_features.mp4
│ └── scene3_cta.mp4
├── audio/ # Audio assets
│ ├── voiceover.wav # Generated voiceover
│ ├── background_music.wav # Generated music
│ └── final_mix.mp3 # Mixed audio track
├── work/ # Intermediate files (auto-generated)
│ ├── silent_scene1.mp4
│ ├── video_concatenated.mp4
│ └── ...
└── output/ # Final deliverables
└── product_launch_video_final.mp4
Scripts
init_project.py
Initialize a new video project with folder structure and templates.
# Basic project
python3 init_project.py --name "My Video" --duration 30
# Create in specific directory
python3 init_project.py --name "Demo Video" --output ~/Videos/
# Vertical video for social
python3 init_project.py --name "Instagram Reel" --aspect-ratio 9:16 --duration 15
# Use Veo's native audio (no custom voiceover/music)
python3 init_project.py --name "Cinematic Scene" --audio-strategy veo_audio
# More scenes
python3 init_project.py --name "Long Video" --duration 60 --scenes 5
Options:
| Option | Default | Description |
|---|---|---|
--name | required | Project name |
--output | current dir | Parent directory |
--duration | 30 | Target duration in seconds |
--aspect-ratio | 16:9 | 16:9, 9:16, 1:1, 4:3 |
--audio-strategy | custom | custom, veo_audio, silent |
--scenes | 3 | Number of scene placeholders |
assemble.py
Orchestrate the full video assembly pipeline.
# Full pipeline (generate everything + assemble)
python3 assemble.py --project ~/my_project/
# Skip generation (use existing scene/audio files)
python3 assemble.py --project ~/my_project/ --skip-generation
# Dry run (show what would be done)
python3 assemble.py --project ~/my_project/ --dry-run
Pipeline steps:
- Generate video scenes (Veo 3.1)
- Strip audio from scenes (if custom audio)
- Generate voiceover (Gemini TTS)
- Generate background music (Lyria)
- Mix voiceover + music
- Concatenate video clips
- Merge audio with video
- Output final video
project.json Configuration
{
"name": "Product Launch Video",
"duration_target": 30,
"aspect_ratio": "16:9",
"resolution": "720p",
"audio_strategy": "custom",
"scenes": [
{
"id": 1,
"name": "scene1_hero",
"prompt": "Cinematic slow zoom on premium product, dramatic lighting, high-end commercial style",
"duration": 6,
"notes": "Music only, no voiceover"
},
{
"id": 2,
"name": "scene2_features",
"prompt": "Product features demonstration, sleek animations, modern tech aesthetic",
"duration": 8,
"notes": "Voiceover starts here"
},
{
"id": 3,
"name": "scene3_cta",
"prompt": "Product with logo on clean background, call to action moment",
"duration": 6,
"notes": "Music swells, voiceover ends"
}
],
"voiceover": {
"enabled": true,
"text": "Introducing the future of audio. Crystal clear sound. All-day comfort. Experience the difference.",
"voice": "Charon",
"style": "Professional, confident, premium brand voice"
},
"music": {
"enabled": true,
"prompt": "modern electronic, premium, sleek, product showcase, subtle bass",
"duration": 35,
"bpm": 100,
"brightness": 0.6
},
"assembly": {
"transition": "fade",
"transition_duration": 0.5,
"music_volume": 0.3,
"fade_in": 1.0,
"fade_out": 2.0
}
}
Audio Strategies
| Strategy | Description | Use When |
|---|---|---|
custom | Strip Veo audio, add custom voiceover + music | Most videos |
veo_audio | Keep Veo's generated audio (dialogue, SFX) | Cinematic scenes, dialogues |
silent | Strip audio, output silent video | Adding audio later |
Workflow: Creating a Video
Step 1: Initialize Project
python3 init_project.py --name "Wireless Earbuds Promo" --duration 30
Step 2: Plan the Storyboard
Edit storyboard.md to plan your video structure:
## Scene 1: Hero Reveal (0-5s)
- Visual: Earbuds emerging from shadow, premium lighting
- Audio: Music only (dramatic intro)
## Scene 2: Sound Quality (5-12s)
- Visual: Sound waves, person enjoying music
- Audio: Voiceover: "Crystal clear sound. Immersive bass."
## Scene 3: Comfort (12-20s)
- Visual: Close-up of fit, person running
- Audio: Voiceover: "All-day comfort. Secure fit."
## Scene 4: CTA (20-30s)
- Visual: Product + logo
- Audio: Voiceover: "Experience the difference." + music swell
Step 3: Configure project.json
Fill in the scene prompts, voiceover text, and music style based on your storyboard.
Step 4: Assemble
python3 assemble.py --project ~/wireless_earbuds_promo/
Step 5: Review and Iterate
Check the output in output/. If adjustments needed:
# Re-run with existing scenes (just re-mix audio)
python3 assemble.py --project ~/wireless_earbuds_promo/ --skip-generation
What You Can Create
| Type | Example |
|---|---|
| Product video | 30s hero video showcasing a product |
| Explainer video | How-to or feature explanation |
| Promo/ad video | Marketing advertisement |
| Demo video | Product demonstration |
| Training video | Internal training content |
| Testimonial | Customer quote style video |
| Brand video | Company/brand story |
Prerequisites
GOOGLE_API_KEY- For Veo (video), Gemini TTS (voice), Lyria (music)- FFmpeg installed:
brew install ffmpeg
Video Styles & Music Pairings
| Style | Music Prompt | Voice |
|---|---|---|
| Premium/Luxury | "elegant, minimal, ambient, sophisticated" | Charon (informative) |
| Tech/Modern | "electronic, futuristic, clean, innovative" | Kore (firm) |
| Fun/Playful | "upbeat, cheerful, acoustic, positive" | Puck (upbeat) |
| Corporate | "professional, inspiring, orchestral lite" | Orus (firm) |
| Lifestyle | "chill, aspirational, indie, warm" | Aoede (breezy) |
| Dramatic/Cinematic | "epic, orchestral, emotional, building" | Gacrux (mature) |
Common Video Structures
Product Video (30s)
0-5s: Hero shot (music only)
5-15s: Features (voiceover + music)
15-25s: Lifestyle/use case (voiceover + music)
25-30s: Logo + CTA (music fade)
Explainer Video (60s)
0-5s: Hook/problem statement
5-20s: Solution introduction
20-45s: How it works (3 steps)
45-55s: Benefits summary
55-60s: CTA
Testimonial Video (45s)
0-5s: Intro/name card
5-35s: Testimonial quote (multiple scenes)
35-45s: Product shot + logo
Manual Workflow (Without Scripts)
If you prefer to run each step manually:
Generate scenes:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/video-generation/scripts/veo.py \
--batch scenes.json
Generate voiceover:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/voice-generation/scripts/gemini_tts.py \
--text "Your voiceover text..." \
--voice Charon \
--style "Professional, warm"
Generate music:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/music-generation/scripts/lyria.py \
--prompt "modern electronic, premium" \
--duration 35
Strip audio from clips:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_strip_audio.py \
-i scene*.mp4
Concatenate videos:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_concat.py \
-i silent_*.mp4 --transition fade -o video.mp4
Mix audio:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/audio_mix.py \
--voice voiceover.wav --music music.wav -o audio.mp3
Merge audio + video:
python3 ${CLAUDE_PLUGIN_ROOT}/skills/media-utils/scripts/video_audio_merge.py \
--video video.mp4 --audio audio.mp3 -o final.mp4
Input Files You Can Provide
| File Type | How It's Used |
|---|---|
| Product images | Animate with Veo as first frame |
| Logo (PNG) | Overlay on final scene |
| Existing voiceover | Place in audio/voiceover.wav |
| Brand music | Place in audio/background_music.wav |
| Video clips | Place in scenes/ folder |
| Script/copy | Use for voiceover text |
Limitations
- Veo video duration: Max 8 seconds per clip (concatenate for longer)
- Veo 3.1 includes audio: All clips have generated audio (strip if using custom)
- Processing time: Video generation takes 1-3 minutes per clip
- Resolution: Currently 720p or 1080p (1080p for 8s only)
Error Handling
| Error | Solution |
|---|---|
| "GOOGLE_API_KEY not set" | Set up API key per README |
| "FFmpeg not found" | Install: brew install ffmpeg |
| "project.json not found" | Run init_project.py first |
| Video generation timeout | Retry, or use shorter duration |
| Audio/video sync issues | Adjust scene durations |
Example Prompts
Simple:
"Create a 30-second product video for my new coffee maker"
Detailed:
"Create a 45-second product video for our new wireless earbuds. Premium, luxury feel. I have product photos attached. Professional male voiceover. Modern electronic music. 16:9 for YouTube."
With project:
"Initialize a video project called 'App Demo' with 5 scenes, 60 seconds total, vertical format for TikTok"
> related_skills --same-repo
> xlsx
Comprehensive spreadsheet creation, editing, and analysis with support for formulas, formatting, data analysis, and visualization. When Claude needs to work with spreadsheets (.xlsx, .xlsm, .csv, .tsv, etc) for: (1) Creating new spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modify existing spreadsheets while preserving formulas, (4) Data analysis and visualization in spreadsheets, or (5) Recalculating formulas
> voice-generation
Use this skill for AI text-to-speech generation. Triggers include: "generate voice", "create audio", "text to speech", "TTS", "read this aloud", "generate narration", "create voiceover", "synthesize speech", "podcast audio", "dialogue audio", "multi-speaker", "audiobook" Supports Google Gemini TTS, ElevenLabs, and OpenAI TTS.
> video-generation
Use this skill for AI video generation. Triggers include: "generate video", "create video", "make video", "animate", "text to video", "video from image", "video of", "animate image", "bring to life", "make it move", "add motion", "video with audio", "video with dialogue" Supports text-to-video, image-to-video, video with dialogue/audio using Google Veo 3.1 (default) or OpenAI Sora.
> style-guide
Analyze a codebase to extract its conventions, patterns, and style. Spawns specialized analyzer agents that each focus on one aspect (structure, naming, patterns, testing, frontend). Generates a comprehensive style guide that other skills can reference. Use when starting work on an unfamiliar codebase, or to create explicit documentation of implicit conventions.