AI Short Drama Storyboard Planning Guide
What you will learn
- Which shot sizes and camera motions fit which storytelling goals
- What T2I and I2V prompts each control during storyboarding
- How transitions and previous-frame continuity improve coherence
- How to shape pacing with duration, emotion, and the AI optimization panel
- How audio, versioning, and collaboration fit into the storyboard stage
1. What a storyboard is
A storyboard turns the story into a sequence of executable shots. Each shot must tell the AI model not only what to show, but also how the camera moves, how long the shot lasts, and what emotion it carries. A shot entry typically includes:
- Scene description: visual content and environment
- Shot type: close-up, medium shot, long shot, and so on
- Camera motion: push, pull, pan, tilt, static, tracking
- Duration: how long the shot lasts
- Characters and dialogue: who is present and who speaks
- Emotion and transition: the mood and how the next shot connects
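As a rough illustration, the fields listed above can be modeled as a small record. This is a minimal sketch: the class name `Shot`, its field names, and the helper `total_runtime` are hypothetical and do not reflect the product's actual data schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Shot:
    # Hypothetical field names; the real storyboard schema may differ.
    scene_description: str      # visual content and environment
    shot_type: str              # e.g. "close-up", "medium", "long"
    camera_motion: str          # e.g. "push", "pull", "pan", "tilt", "static", "tracking"
    duration_s: float           # how long the shot lasts, in seconds
    characters: List[str] = field(default_factory=list)  # who is present
    dialogue: str = ""          # spoken line, if any
    emotion: str = ""           # mood of the shot
    transition: str = "none"    # how the next shot connects


def total_runtime(shots: List[Shot]) -> float:
    """Sum the planned durations of all shots, in seconds."""
    return sum(s.duration_s for s in shots)


storyboard = [
    Shot("Rainy street at night, neon reflections", "long", "static", 3.0),
    Shot("Heroine looks up, eyes wet", "close-up", "push", 2.5,
         characters=["Heroine"], emotion="grief", transition="fade"),
]
print(total_runtime(storyboard))  # 5.5
```

Keeping duration, emotion, and transition on each shot makes it easy to check the whole sequence before spending credits on generation.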
2. Shot language basics
| Shot size | Frame range | Best for |
|---|---|---|
| Close-up | Face or detail | Emotion shifts and expression peaks |
| Medium shot | Half body or small action | Dialogue and everyday movement |
| Full shot | Full body | Action and character relationships |
| Long shot | Environment plus people | Establishing space and transitions |
| Extreme long shot | Environment-led frame | Openings, endings, and time shifts |
Camera motion types
| Motion | Effect | Best for |
|---|---|---|
| Push in | Moves closer to the subject | Focus and tension |
| Pull out | Moves away from the subject | Revealing the whole situation and releasing emotion |
| Pan | Horizontal sweep | Following movement and showing space |
| Tilt | Vertical angle shift | Status difference and visual power |
| Static | No movement | Stable dialogue and calm narration |
| Tracking | Moves with the subject | Action and character motion |
3. Storyboard image and video generation
T2I prompts
T2I controls composition, lighting, pose, and environmental detail for the still frame. It decides what the shot looks like.
⚠️ The system evaluates prompt quality automatically and shows yellow warnings when the prompt is weak.
I2V prompts
I2V controls movement direction, speed, and dynamic change. It decides how the shot moves once it becomes video.
Model selection
- Image models: Tongyi Wanxiang / Doubao / Kling
- Video models: Tongyi / Kling / Doubao / Vidu
Automatic character reference injection
- The system injects locked character references into the current generation chain
- Multi-character scenes pull references for each character involved
- Single-character scenes strengthen consistency constraints
Batch generation
- Generate all storyboard images in one run
- Generate all storyboard videos in one run
- Track progress from a unified status panel
Credit usage
| Type | Cost |
|---|---|
| Storyboard image (T2I) | 2 credits per run |
| Video 720p (I2V) | About 1.5 credits per second |
| Video 1080p (I2V) | About 2.5 credits per second |
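The rates in the table can be turned into a quick budget estimate before a batch run. The sketch below assumes one image run per shot and treats the approximate per-second video rates as exact; the function name `estimate_credits` and its parameters are illustrative only.

```python
# Rates copied from the credit table; the video rates are approximate.
IMAGE_CREDITS_PER_RUN = 2
VIDEO_CREDITS_PER_SECOND = {"720p": 1.5, "1080p": 2.5}


def estimate_credits(shot_durations_s, resolution="720p", image_runs_per_shot=1):
    """Estimate total credits for generating one image and one video per shot."""
    image_cost = len(shot_durations_s) * image_runs_per_shot * IMAGE_CREDITS_PER_RUN
    video_cost = sum(shot_durations_s) * VIDEO_CREDITS_PER_SECOND[resolution]
    return image_cost + video_cost


durations = [3, 4, 2, 3]  # four shots, 12 seconds in total
print(estimate_credits(durations, "720p"))   # 4*2 + 12*1.5 = 26.0
print(estimate_credits(durations, "1080p"))  # 4*2 + 12*2.5 = 38.0
```

Regenerating an image costs another run, so raise `image_runs_per_shot` if you expect several attempts per shot.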
4. Emotion and previous-frame continuity
Storyboard emotion tags
Previous-frame continuity
- Mode A: use the last frame of the previous video shot as the starting reference for the next shot
- Mode B: use the last frame of the previous shot as the continuity reference for the next image
✅ Enable previous-frame continuity for consecutive shots in the same scene. Turn it off when you intentionally switch to a new space.
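The rule of thumb above can be expressed as a simple check over scene labels. This is a sketch under the assumption that each shot carries a scene identifier; the function name `continuity_flags` and the scene labels are hypothetical.

```python
def continuity_flags(scene_ids):
    """Return one bool per shot: True if the shot should reference
    the last frame of the previous shot (same scene as its predecessor)."""
    flags = []
    for i, scene in enumerate(scene_ids):
        # The first shot has no predecessor, so continuity stays off.
        flags.append(i > 0 and scene == scene_ids[i - 1])
    return flags


print(continuity_flags(["cafe", "cafe", "street", "street", "cafe"]))
# [False, True, False, True, False]
```

A deliberate jump to a new space shows up as `False`, which matches the advice to turn continuity off at that point.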
5. Duration and pacing control
| Shot type | Recommended duration | Note |
|---|---|---|
| Dialogue shot | 3-5 seconds | Adjust based on line length |
| Emotional close-up | 2-4 seconds | Give viewers room to absorb emotion |
| Action shot | 2-3 seconds | Better for fast momentum |
| Environment setup | 3-5 seconds | Useful for openings and transitions |
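One way to apply the table is to flag shots whose planned duration falls outside the recommended range. The sketch below is illustrative: the category keys and the function `duration_warning` are assumptions, and the ranges are copied from the table.

```python
# Recommended ranges in seconds, taken from the duration table.
RECOMMENDED_S = {
    "dialogue": (3, 5),
    "emotional_close_up": (2, 4),
    "action": (2, 3),
    "environment": (3, 5),
}


def duration_warning(kind, duration_s):
    """Return a warning string if the duration is outside the range, else None."""
    low, high = RECOMMENDED_S[kind]
    if duration_s < low:
        return f"{kind}: {duration_s}s is shorter than the recommended {low}-{high}s"
    if duration_s > high:
        return f"{kind}: {duration_s}s is longer than the recommended {low}-{high}s"
    return None


print(duration_warning("action", 2.5))   # None
print(duration_warning("dialogue", 6))   # dialogue: 6s is longer than the recommended 3-5s
```

A warning is only a prompt to look again; a long line of dialogue can justify exceeding the range.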
30-second pacing template
| Time range | Content | Purpose |
|---|---|---|
| 0-3s | Conflict or suspense | Capture attention immediately |
| 3-10s | Context setup | Build the situation |
| 10-20s | Conflict escalation | Push the plot forward |
| 20-27s | Climactic release | Reach the emotional peak |
| 27-30s | Reversal or payoff | Leave a memorable final beat |
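To see how a planned shot list lines up with this template, each shot can be labeled by the beat in which it starts. This is a minimal sketch: treating each range as half-open (start included, end excluded) is an assumption, and the names `beat_at` and `beats_for_shots` are hypothetical.

```python
# (start_s, end_s, content) rows copied from the 30-second template.
PACING_TEMPLATE = [
    (0, 3, "Conflict or suspense"),
    (3, 10, "Context setup"),
    (10, 20, "Conflict escalation"),
    (20, 27, "Climactic release"),
    (27, 30, "Reversal or payoff"),
]


def beat_at(t):
    """Return the template beat covering time t, or None past the end."""
    for start, end, label in PACING_TEMPLATE:
        if start <= t < end:
            return label
    return None


def beats_for_shots(durations_s):
    """Label each shot by the beat active at its start time."""
    labels, t = [], 0
    for d in durations_s:
        labels.append(beat_at(t))
        t += d
    return labels


for label in beats_for_shots([3, 4, 3, 5, 5, 4, 3, 3]):
    print(label)
```

With these eight shots (30 seconds in total), one shot opens on the hook, two build context, two escalate, two carry the climax, and the last delivers the payoff.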
6. Transition effects
| Effect | Description | Best for |
|---|---|---|
| No transition | Hard cut | Fast-cut editing and action-heavy rhythm |
| Fade | Opacity blend | Lyricism, time passing, memory |
| Wipe right | Left-to-right push | Story movement and space transition |
| Wipe left | Right-to-left push | Flashback, reversal, directional contrast |
| Perlin | Organic texture dissolve | Dreamlike, fantasy, and time-slip scenes |
✅ Try to keep transition language consistent within one short drama instead of changing the style every shot.
7. Storyboard adjustment techniques
- Reorder shots through drag-and-drop or step movement
- Adjust shot duration based on dialogue and emotional need
- When adding or deleting shots, re-check continuity immediately
- Refine descriptions and prompts to improve visual precision
⚠️ Deleting a shot also removes its linked image, video, and audio resources. After reordering shots, re-check continuity as well.
8. AI optimization panel
Overview
Shows the overall quality score and optimization suggestions, with a one-click option to apply them.
Pacing analysis
Shows speed distribution, average duration, and where rhythm shifts happen.
Shot recommendation
Compares the current shot with AI-recommended alternatives and explains why.
Style analysis
Checks consistency, shot distribution, and the main style risks.
9. Shot-level audio functions
- Dialogue supports multi-character formatting and maps automatically to the cast
- You can generate voice from configured character voices or upload custom files
- BGM can be created from prompts or uploaded manually
- Sound effects can be generated as single entries or scheduled on a time axis
- During preview, starting one audio category automatically pauses the others, so sounds do not overlap
10. Audio mixing panel
- Independent volume control for dialogue, music, and effects
- Preset mix profiles can be applied in one click
✅ In most cases, dialogue should stay louder than music, and music louder than effects
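The ordering guideline above is easy to check against any set of volume levels. The sketch below assumes volumes on a 0 to 100 scale; the profile values and the function `follows_guideline` are hypothetical and are not the product's built-in presets.

```python
def follows_guideline(mix):
    """True if dialogue is louder than music and music is louder than effects."""
    return mix["dialogue"] > mix["music"] > mix["effects"]


# Hypothetical volume levels on a 0-100 scale, for illustration only.
dialogue_forward = {"dialogue": 100, "music": 40, "effects": 30}
music_heavy = {"dialogue": 60, "music": 80, "effects": 30}

print(follows_guideline(dialogue_forward))  # True
print(follows_guideline(music_heavy))       # False
```

A mix that fails the check is not necessarily wrong; a montage without dialogue, for example, may intentionally put music first.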
11. Draft version history
- The system keeps recent versions automatically so failed edits can be rolled back
- You can create manual snapshots before important changes
- Version lists show timestamps, shot counts, and key labels
- Any version can be restored to replace the current state
12. Team collaboration
- Invite collaborators by email
- Assign viewing, editing, and management permissions
- Edit locks reduce concurrent editing conflicts
- A single shot can jump directly into the AV editor for detailed refinement
FAQ
Q: How should shot count relate to video runtime?
A useful estimate is 5 to 10 seconds per shot, which puts a 30-second piece at around 3 to 6 shots. Faster pacing, such as the 2 to 5 second shots recommended in section 5, requires more cuts.
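The arithmetic behind this estimate can be sketched as follows, using 5 and 10 seconds as the bounds on shot length. The function name `shot_count_range` and its parameters are illustrative.

```python
import math


def shot_count_range(runtime_s, min_shot_s=5, max_shot_s=10):
    """Return (fewest, most) shots that fit the runtime at the given shot lengths."""
    fewest = math.ceil(runtime_s / max_shot_s)   # every shot at the longest length
    most = math.floor(runtime_s / min_shot_s)    # every shot at the shortest length
    return fewest, most


print(shot_count_range(30))        # (3, 6)
print(shot_count_range(30, 2, 5))  # (6, 15)
```

The second call shows how the count grows when shots are kept to 2 to 5 seconds.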
Q: Do storyboard images and storyboard videos use the same model?
No. Image generation and video generation use separate model selectors, so they should be chosen independently.
Q: How do I make consecutive shots feel more coherent?
Use previous-frame continuity whenever possible and avoid abrupt breaks in space, character state, or emotion between neighboring shots.
Q: Do I have to accept every suggestion from the AI optimization panel?
No. Treat the panel as advisory: review each suggestion and apply only the ones that serve your story.