> storyboard-video-orchestrator
Plan and orchestrate a full AI‑generated video (e.g., a 10‑minute short film or promo) by breaking it into **5–10 second scenes**, attaching script lines and audio requirements to each scene, and coordinating downstream skills (video, music, SFX, voice) to produce all assets needed for final assembly.
curl "https://skillshub.wtf/tippyentertainment/skills/storyboard-video-orchestrator?format=md"
Provided by TippyEntertainment
https://github.com/tippyentertainment/skills.git
This skill is designed for use on the Tasking.tech agent platform (https://tasking.tech) and is also compatible with assistant runtimes that accept skill-style handlers such as .claude, .openai, and .mistral. Use this skill for both Claude Code and Tasking.tech agent source.
Summary
Plan and orchestrate a full AI‑generated video (e.g., a 10‑minute short film or promo) by breaking it into 5–10 second scenes, attaching script lines and audio requirements to each scene, and coordinating downstream skills (video, music, SFX, voice) to produce all assets needed for final assembly.
This skill focuses on storyboarding, structuring, and orchestration, not on low‑level rendering.
When to Use
Use this skill when the user wants to:
- Turn an idea or script into a scene‑by‑scene storyboard for an AI‑generated video.
- Build longer videos (e.g., ~10 minutes) from multiple short clips generated by ComfyUI or similar tools.
- Attach dialogue/voice‑over, music, and sound effects to each scene.
- Produce a production plan that other skills can execute to render video and audio assets.
Typical use cases:
- Anime‑style shorts or trailers.
- Product or platform promos (e.g., for tasking.tech).
- Narrative explainer videos or cinematic demos.
- “AI movies” constructed from many short clips.
Inputs to Collect
The assistant should ask for:
High‑Level Video Brief
- Goal / purpose
  - e.g. brand promo, narrative short, tutorial, trailer.
- Target total duration
  - e.g. 10 minutes (default), or a specific range.
- Tone & style
  - Anime, cinematic, documentary, playful, serious, etc.
- Visual style references
  - Keywords or reference works (without copying them), art styles, color palettes.
- Audience / rating
  - General, teen, mature.
Story / Content
- Source material
  - Existing script, outline, or just a high‑level premise.
- Characters & setting
  - Main characters, roles, important locations.
- Key beats
  - Moments that must appear (introductions, reveals, climax, call‑to‑action).
Audio Requirements
- Voice‑over / dialogue
  - Narration only, character dialogue, or both.
- Music
  - Style (genre, tempo, mood), whether continuous or scene‑based.
- Sound effects
  - Level of detail (just key effects vs rich sound design).
If any of these are missing, the skill should ask 2–4 clarifying questions before generating the storyboard.
Expected Behavior
1. Break the Story into Scenes
- Determine the number of scenes based on target duration and 5–10 second clips:
  - For a 10‑minute video (600 seconds), expect roughly 60–120 scenes (60 if every scene runs 10 seconds, 120 if every scene runs 5 seconds).
- Optionally group scenes into chapters/segments (e.g., intro, body, outro).
- For each scene:
  - Assign a scene number and approximate duration (5–10 seconds).
  - Write a short description of what happens visually.
  - Note camera/motion style (static, pan, zoom, orbit, etc.).
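The scene-count arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the function names `plan_scene_durations` and `build_timeline` and the 8-second preferred length are assumptions chosen for the example.

```python
def plan_scene_durations(total_seconds, min_len=5, max_len=10, preferred_len=8):
    """Split total_seconds into whole-second scenes, each within [min_len, max_len].

    preferred_len is an assumed default, not something the skill prescribes.
    """
    min_count = -(-total_seconds // max_len)  # ceiling division: fewest scenes allowed
    max_count = total_seconds // min_len      # most scenes allowed
    if min_count > max_count:
        raise ValueError("total duration cannot be split into scenes of this length")
    # Start from the preferred length, then clamp into the feasible range.
    count = min(max(round(total_seconds / preferred_len), min_count), max_count)
    base, extra = divmod(total_seconds, count)
    # The first `extra` scenes get one additional second so the sum is exact.
    return [base + 1] * extra + [base] * (count - extra)


def build_timeline(durations):
    """Attach scene numbers and cumulative start/end times to each duration."""
    scenes, start = [], 0
    for number, duration in enumerate(durations, start=1):
        scenes.append({
            "sceneNumber": number,
            "startTime": start,
            "endTime": start + duration,
            "durationSeconds": duration,
        })
        start += duration
    return scenes


timeline = build_timeline(plan_scene_durations(600))
print(len(timeline), timeline[-1]["endTime"])
```

An even split keeps the total exact; in practice the durations would then be adjusted by hand (shorter for action, longer for exposition) while keeping the sum near the target.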
2. Attach Script to Each Scene
- Take the user’s script or generate one consistent with the brief.
- Split the script across scenes:
  - For voice‑over, align lines to scenes based on pacing.
  - For dialogue, map lines to characters and scenes where they speak.
- For each scene, attach:
  - `voiceoverText` (if any).
  - `dialogueLines` (character → line).
  - Notes about timing (e.g., line starts mid‑scene, at second 3).
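One way to align voice-over lines to scenes by pacing is a greedy fill against a per-scene word budget. The sketch below is only an illustration: the speaking rate of 2.5 words per second and the function name `assign_voiceover` are assumptions, and a real pass would also handle dialogue and timing notes.

```python
WORDS_PER_SECOND = 2.5  # assumed narration pace (about 150 words per minute)


def assign_voiceover(sentences, scenes):
    """Greedily attach sentences, in order, to scenes based on a word budget.

    Each scene takes at least one sentence while any remain; the last scene
    absorbs whatever is left so no line is dropped.
    """
    result = [dict(scene, voiceoverText="") for scene in scenes]
    index = 0
    for position, scene in enumerate(result):
        budget = scene["durationSeconds"] * WORDS_PER_SECOND
        is_last = position == len(result) - 1
        used, chosen = 0, []
        while index < len(sentences):
            words = len(sentences[index].split())
            # Stop once the scene already has a line and the next one would not fit.
            if chosen and used + words > budget and not is_last:
                break
            chosen.append(sentences[index])
            used += words
            index += 1
        scene["voiceoverText"] = " ".join(chosen)
    return result


script = [
    "One two three four five six seven eight nine ten.",
    "Short line.",
    "Final words here.",
]
scenes = [{"sceneNumber": 1, "durationSeconds": 5}, {"sceneNumber": 2, "durationSeconds": 5}]
for scene in assign_voiceover(script, scenes):
    print(scene["sceneNumber"], scene["voiceoverText"])
```

A scene whose text exceeds its budget is a signal to lengthen the scene or trim the line during review.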
3. Attach Audio Requirements
For each scene, define:
- Music
  - Whether scene uses:
    - Global background track, or
    - A specific musical cue (e.g., “builds tension”, “drops out here”).
- Ambience
  - Environment sound: city, forest, office, spaceship, etc.
- Sound FX
  - List important effects: footsteps, doors, UI beeps, magic attacks, explosions, etc.
- Voice
  - Which voice(s) are needed:
    - Narrator.
    - Character voices (which characters, approximate lines).
This creates a per‑scene audio spec that downstream skills can implement.
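A per-scene audio spec could be represented as a plain dictionary. The shape below is a hypothetical sketch: `musicSpec`, `ambienceSpec`, and `sfxSpec` follow the field names used in this document, while `voiceSpec`, `useGlobalTrack`, `cue`, and the helper `make_audio_spec` are assumptions introduced for illustration.

```python
def make_audio_spec(scene_number, *, music_cue=None, ambience="", sfx=(),
                    narrator=False, character_voices=()):
    """Build one scene's audio spec as a JSON-serializable dict.

    When no specific cue is given, the scene falls back to the global
    background track.
    """
    return {
        "sceneNumber": scene_number,
        "musicSpec": {"useGlobalTrack": music_cue is None, "cue": music_cue},
        "ambienceSpec": ambience,
        "sfxSpec": list(sfx),
        # voiceSpec is a hypothetical field name, not defined by the skill.
        "voiceSpec": {"narrator": narrator, "characters": list(character_voices)},
    }


spec = make_audio_spec(
    3,
    music_cue="builds tension",
    ambience="forest",
    sfx=["footsteps", "door creak"],
    narrator=True,
)
print(spec)
```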
4. Produce a Structured Storyboard
The main output is a structured storyboard document (JSON‑like, or a table) with entries like:
- `sceneNumber`
- `startTime` / `endTime` (cumulative)
- `durationSeconds`
- `visualDescription`
- `cameraStyle`
- `videoPrompt` (for video generator)
- `voiceoverText`
- `dialogue` (array of `{ character, line }`)
- `musicSpec` (mood, intensity, references)
- `ambienceSpec`
- `sfxSpec` (list of effects)
- `notes` (continuity, transitions, overlays, titles)
The skill should also generate a human‑readable version (e.g., Markdown table) for review.
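As a rough illustration of the two views, the sketch below holds one storyboard entry as a dictionary and renders the condensed Markdown table from it. The helper names (`format_time`, `key_audio`, `to_markdown_table`) and the sample values are assumptions; only the field names come from the list above.

```python
def format_time(seconds):
    """Format a second count as mm:ss."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def key_audio(scene):
    """Summarize a scene's audio as short keywords for the condensed table."""
    parts = []
    if scene.get("voiceoverText") or scene.get("dialogue"):
        parts.append("voice")
    mood = scene.get("musicSpec", {}).get("mood")
    if mood:
        parts.append(f"music: {mood}")
    if scene.get("sfxSpec"):
        parts.append("SFX: " + ", ".join(scene["sfxSpec"]))
    return "; ".join(parts)


def to_markdown_table(scenes):
    """Render one table row per scene for human review."""
    lines = [
        "| Scene # | Time range | Duration | Visual description | Key audio |",
        "| --- | --- | --- | --- | --- |",
    ]
    for s in scenes:
        time_range = f"{format_time(s['startTime'])}-{format_time(s['endTime'])}"
        visual = s["visualDescription"].replace("|", "\\|")  # keep cells intact
        lines.append(
            f"| {s['sceneNumber']} | {time_range} | {s['durationSeconds']}s "
            f"| {visual} | {key_audio(s)} |"
        )
    return "\n".join(lines)


example_scene = {
    "sceneNumber": 1, "startTime": 0, "endTime": 8, "durationSeconds": 8,
    "visualDescription": "City skyline at dawn", "cameraStyle": "slow pan",
    "videoPrompt": "anime city skyline at dawn, slow pan",
    "voiceoverText": "Every story starts somewhere.", "dialogue": [],
    "musicSpec": {"mood": "calm", "intensity": "low"},
    "ambienceSpec": "city", "sfxSpec": ["distant traffic"], "notes": "",
}
print(to_markdown_table([example_scene]))
```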
5. Orchestrate Downstream Skills (Conceptually)
This skill does not run external tools itself, but it should explicitly prepare tasks for other skills:
- For video generation (e.g., `comfyui-video-generator`):
  - Provide, per scene:
    - `videoPrompt`, `durationSeconds`, `fps`, `resolution`, reference images.
- For voice generation (`comfyui-voice-generator`):
  - Provide, per scene or sequence:
    - `voiceoverText`, speaker style, language, pace, target duration.
- For music/ambience (`comfyui-audio-creator`):
  - Provide:
    - Segment durations and mood/genre per chapter or scene group.
- For sound effects (`comfyui-soundfx-creator`):
  - Provide:
    - SFX lists with timestamps within each scene.
The skill should output these as clearly labeled sections or structured objects so an orchestrator can call the other skills in the correct order.
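The task preparation described above might look like the following sketch. The payload shapes are hypothetical, since the input schemas of the downstream skills are not specified here; `build_tasks`, the `chapter` key, the `targetDuration` and `effects` keys, and the `fps`/`resolution` defaults are all assumptions for illustration.

```python
from itertools import groupby

# Order in which an orchestrator would call the downstream skills.
SKILL_ORDER = [
    "comfyui-video-generator",
    "comfyui-voice-generator",
    "comfyui-audio-creator",
    "comfyui-soundfx-creator",
]


def build_tasks(scenes, fps=24, resolution="1280x720"):
    """Turn storyboard scenes into labeled task objects, grouped by skill."""
    tasks = []
    for scene in scenes:
        number = scene["sceneNumber"]
        tasks.append({"skill": "comfyui-video-generator", "sceneNumber": number, "input": {
            "videoPrompt": scene["videoPrompt"],
            "durationSeconds": scene["durationSeconds"],
            "fps": fps,
            "resolution": resolution,
        }})
        if scene.get("voiceoverText"):
            tasks.append({"skill": "comfyui-voice-generator", "sceneNumber": number, "input": {
                "voiceoverText": scene["voiceoverText"],
                "targetDuration": scene["durationSeconds"],
            }})
        if scene.get("sfxSpec"):
            tasks.append({"skill": "comfyui-soundfx-creator", "sceneNumber": number, "input": {
                "effects": scene["sfxSpec"],  # e.g. [{"effect": "door", "atSecond": 2}]
            }})
    # One music/ambience task per run of consecutive scenes in the same chapter.
    for chapter, group in groupby(scenes, key=lambda s: s.get("chapter", "main")):
        members = list(group)
        tasks.append({"skill": "comfyui-audio-creator", "chapter": chapter, "input": {
            "durationSeconds": sum(s["durationSeconds"] for s in members),
            "mood": members[0].get("musicSpec", {}).get("mood", ""),
        }})
    # Stable sort: tasks keep scene order within each skill.
    tasks.sort(key=lambda task: SKILL_ORDER.index(task["skill"]))
    return tasks
```

Sorting by skill keeps all video tasks together, then voice, music, and SFX, which matches the order in which the skills are listed above.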
Output Format (to the Caller)
By default, respond with:
- Overview
  - 3–6 sentences summarizing the planned video (story, tone, length, structure).
- Global Plan
  - Bullet list:
    - Target duration.
    - Number of scenes.
    - Chapter/segment breakdown (if used).
- Storyboard Table (Condensed View)
  - A Markdown table with one row per scene, including:
    - Scene #
    - Time range
    - Duration
    - Visual description
    - Key audio (voice/music/SFX keywords)
  - Keep descriptions concise but clear.
- Detailed Scene Specs
  - For each scene, a short block including:
    - Visual description & camera style.
    - `videoPrompt`.
    - `voiceoverText` and/or dialogue lines.
    - `musicSpec`, `ambienceSpec`, `sfxSpec`.
  - This section is the “API contract” for video/audio skills.
- Orchestration Plan
  - Numbered steps like:
    1. Use `comfyui-video-generator` to render all scenes with given prompts/durations.
    2. Use `comfyui-voice-generator` to generate narration/dialogue per scene or per segment.
    3. Use `comfyui-audio-creator` for background music/ambience per segment.
    4. Use `comfyui-soundfx-creator` for per‑event SFX per scene.
    5. Assemble all assets in an editor (or export pipeline) into a single ~10‑minute video.
- Next Actions
  - Clear checklist for the user, e.g.:
    - Approve/adjust storyboard.
    - Select models/styles for video and audio.
    - Kick off asset generation for each scene.
Orchestration Notes
- This skill is upstream of:
  - `comfyui-video-generator`
  - `comfyui-voice-generator`
  - `comfyui-audio-creator`
  - `comfyui-soundfx-creator`
- It should:
  - Maintain continuity of characters, locations, and visual style across scenes.
  - Ensure cumulative duration matches the target (e.g., 10 minutes ± ~5%).
  - Aim for scenes in the 5–10 second range; use shorter shots for action, longer shots for exposition.
- For iterative workflows:
  - Accept an existing storyboard and:
    - Add or remove scenes.
    - Adjust durations.
    - Rewrite prompts/lines for clarity or style.
  - Keep scene IDs stable where possible so previously generated assets can be reused.
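The duration check and the stable-ID rule from the notes above can be sketched together. This is an illustration under assumptions: `sceneId` is a hypothetical field (the storyboard fields listed earlier only name `sceneNumber`), and the helper names are invented for the example.

```python
def retime(scenes):
    """Recompute cumulative startTime/endTime; sceneId values are left untouched."""
    retimed, start = [], 0
    for scene in scenes:
        end = start + scene["durationSeconds"]
        retimed.append({**scene, "startTime": start, "endTime": end})
        start = end
    return retimed


def remove_scene(scenes, scene_id):
    """Drop one scene and shift later scenes earlier without renaming them."""
    return retime([s for s in scenes if s["sceneId"] != scene_id])


def check_total_duration(scenes, target_seconds=600, tolerance=0.05):
    """Report whether the summed scene durations fall within the tolerance."""
    total = sum(s["durationSeconds"] for s in scenes)
    return {
        "totalSeconds": total,
        "withinTolerance": abs(total - target_seconds) <= target_seconds * tolerance,
    }


storyboard = retime([{"sceneId": f"s{i:03d}", "durationSeconds": 8} for i in range(1, 76)])
edited = remove_scene(storyboard, "s002")
print(check_total_duration(edited))
```

Because `s003` keeps its ID after `s002` is removed, a clip already rendered for `s003` can still be matched to its scene; only its position on the timeline changes.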
This skill does not perform final rendering or editing; it creates a production‑ready blueprint and detailed per‑scene specs that other skills and tools can execute to build the complete video.