Audio Mixing and Sync
Take your screencast audio from passable to professional. This lesson covers ducking, multi-track sync, and meeting YouTube's loudness standards.
A great screencast audio mix has three layers working together: voice is always intelligible, system audio provides context, and music fills the background without competing. Volume automation (ducking) is the key technique.
Audio Ducking (Auto-Volume)
Ducking lowers the background music whenever voice is present, either by hand with keyframes or automatically with a compressor:
Manual ducking (recommended for screencasts):
1. Listen to your edit and note the timecodes where voice starts and stops
2. In Audio Mixer → enable keyframes on A3 (music track)
3. Add volume keyframes:
- Where voice starts: lower music to -25 dB
- Where voice pauses/ends: raise back to -15 dB
- At natural breaks: bring the music up to full level for a moment (adds feel)
Shortcut approach:
Add an "Audio Compressor" or "SC4 Compressor" effect
→ Configure it to sidechain music to voice (advanced)
Typical ducking map for a 2-minute screencast:
0:00 Intro music at -12 dB (no voice)
0:05 Voice starts → music ducks to -22 dB
1:45 Voice ends → music rises to -12 dB
2:00 Fade to silence → music fades out
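As a rough illustration of what the keyframes do, the ducking map above can be treated as a list of (time, level) points with straight-line ramps between them. This Python sketch is not Kdenlive's implementation. The one-second ramp lengths and the name `music_level_db` are assumptions made for the example, and the final fade to silence at 2:00 is left out.

```python
from bisect import bisect_right

# (time in seconds, music level in dB), taken from the ducking map.
# The 1-second ramps (5 -> 6 s and 105 -> 106 s) are an assumption.
KEYFRAMES = [
    (0.0, -12.0),    # intro music, no voice
    (5.0, -12.0),    # voice about to start
    (6.0, -22.0),    # music ducked under voice
    (105.0, -22.0),  # voice ends at 1:45
    (106.0, -12.0),  # music restored
]


def music_level_db(t, keyframes=KEYFRAMES):
    """Linearly interpolate the music level in dB at time t (seconds)."""
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)  # index of the first keyframe after t
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
```

Calling `music_level_db(5.5)` lands halfway down the first ramp, which is a quick way to sanity-check a planned map before placing keyframes by hand.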
Loudness Standards for YouTube
YouTube targets -14 LUFS integrated loudness. If your audio is louder, YouTube turns it down. If it is quieter, YouTube does not boost it, so it stays quiet.
Target for YouTube uploads:
Integrated loudness: -14 LUFS
True peak: -1 dBTP
Loudness range: 7–8 LU
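The arithmetic behind these targets is simple enough to sketch. The Python below is a simplified model of the behavior described above (louder uploads turned down, quieter ones left alone), not a description of YouTube's actual processing. The function names are hypothetical.

```python
YOUTUBE_TARGET_LUFS = -14.0


def gain_to_reach_target_db(measured_lufs, target=YOUTUBE_TARGET_LUFS):
    """Gain in dB you would apply yourself to hit the target before upload."""
    return target - measured_lufs


def playback_adjustment_db(measured_lufs, target=YOUTUBE_TARGET_LUFS):
    """Simplified model: loud uploads are attenuated, quiet ones are not boosted."""
    return min(0.0, target - measured_lufs)
```

For a render measured at -18 LUFS, `gain_to_reach_target_db(-18)` gives +4 dB, while `playback_adjustment_db(-18)` gives 0 dB, which is why a quiet upload stays quiet unless you normalize it yourself.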
To normalize to -14 LUFS in Kdenlive:
1. Select voice clip (A2) → Effects → "Audio Loudness Normalization"
2. Set: Target LUFS = -14
3. Apply to all voice clips (or apply at track level)
Alternatively, use ffmpeg after render (the video stream is copied as-is and only the audio is re-encoded):
ffmpeg -i output.mp4 -c:v copy -af loudnorm=I=-14:TP=-1:LRA=7 -ar 48000 final.mp4
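The one-pass command above lets loudnorm adjust gain dynamically as it goes. The filter can also run in two passes: measure first, then apply the measured values. The Python sketch below only builds the argument lists and parses the measurement; it does not run ffmpeg. The helper names are hypothetical, and it assumes the measurement is printed as a flat JSON object at the end of ffmpeg's stderr.

```python
import json


def first_pass_args(src, i=-14.0, tp=-1.0, lra=7.0):
    """Arguments for a measurement-only run; stats are printed as JSON."""
    af = f"loudnorm=I={i}:TP={tp}:LRA={lra}:print_format=json"
    return ["ffmpeg", "-hide_banner", "-i", src, "-af", af, "-f", "null", "-"]


def parse_loudnorm_json(stderr_text):
    """Extract the last {...} object from ffmpeg's stderr output."""
    start = stderr_text.rindex("{")
    end = stderr_text.rindex("}") + 1
    return json.loads(stderr_text[start:end])


def second_pass_args(src, dst, stats, i=-14.0, tp=-1.0, lra=7.0):
    """Arguments that feed the first-pass measurements back into loudnorm."""
    af = (
        f"loudnorm=I={i}:TP={tp}:LRA={lra}"
        f":measured_I={stats['input_i']}"
        f":measured_TP={stats['input_tp']}"
        f":measured_LRA={stats['input_lra']}"
        f":measured_thresh={stats['input_thresh']}"
        f":offset={stats['target_offset']}:linear=true"
    )
    # Copy the video untouched and keep the audio at 48 kHz.
    return ["ffmpeg", "-i", src, "-c:v", "copy", "-af", af, "-ar", "48000", dst]
```

A typical use would be `subprocess.run(first_pass_args("output.mp4"), capture_output=True, text=True)`, passing the result's `stderr` to `parse_loudnorm_json`, then running `second_pass_args("output.mp4", "final.mp4", stats)`.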
Synchronizing Separate Recordings
When mic is recorded separately from screen capture:
Step 1: The clap method
Before recording, clap loudly where both recordings will pick it up: the separate mic, and the camera (if any) or the screen capture's own audio input
This creates a sharp audio spike in both tracks
Step 2: Import both tracks to Kdenlive
Screen capture → V1 / A1
Separate mic recording → A2
Step 3: Align by waveform
Open Audio Mixer → solo A1 → find the clap spike in waveform
Solo A2 → find the clap spike in waveform
Drag A2 clip until both spikes are aligned
Step 4: Group and lock
Select V1 clip + A2 clip → Ctrl+G (group)
This keeps them in sync during edits
Step 5: Mute A1 (system audio) if mic is better quality
Or keep both: mic for voice, system for typing/click sounds
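The waveform alignment in Step 3 amounts to finding the clap in each track and measuring the distance between the two spikes. Here is a minimal Python sketch of that idea, assuming both tracks are already decoded to lists of samples at the same sample rate and that the clap is the loudest event in each. The function names are hypothetical.

```python
def clap_index(samples):
    """Index of the loudest sample, taken to be the clap spike."""
    return max(range(len(samples)), key=lambda n: abs(samples[n]))


def sync_offset_seconds(reference, other, sample_rate=48000):
    """Seconds to shift `other` so its clap lines up with `reference`.

    Positive means move `other` later on the timeline, negative means earlier.
    """
    return (clap_index(reference) - clap_index(other)) / sample_rate
```

If anything later in the take is louder than the clap, this simple peak search will pick the wrong spike, so in practice you would restrict the search to the first few seconds of each track.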
Multi-Track Audio Layout
For complex screencast productions:
A1 [Screen recording system audio] -6 dB (reference)
A2 [Primary microphone voice] -3 dB (dominant)
A3 [Secondary mic or phone audio] -6 dB (alternate)
A4 [Background music] -15 dB (atmospheric)
A5 [SFX — click sounds, notifications] -12 dB (optional)
Use as few tracks as needed. Most screencasts work perfectly with just A1 (system) and A2 (mic). Add a music track only if you have music; in the simpler three-track setup used elsewhere in this lesson, music sits on A3.
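The fader values in the layout above are decibels, which are logarithmic, so it can help to see them as plain amplitude multipliers. The Python below is a small sketch using the standard amplitude conversion (20 × log10); the dictionary simply restates the layout, and the function names are hypothetical.

```python
import math

# Fader levels from the multi-track layout, in dB.
LAYOUT_DB = {"A1": -6, "A2": -3, "A3": -6, "A4": -15, "A5": -12}


def db_to_gain(db):
    """Convert a level in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def gain_to_db(gain):
    """Convert a linear amplitude multiplier back to dB."""
    return 20 * math.log10(gain)


def voice_over_music_db(layout=LAYOUT_DB):
    """How many dB the primary voice (A2) sits above the music (A4)."""
    return layout["A2"] - layout["A4"]
```

With these values the voice sits 12 dB above the music, which is roughly four times the amplitude (`db_to_gain(12)` is about 3.98).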
Monitoring Audio During Playback
Solo in Audio Mixer: Click S (Solo) on a track → hear only that track
Mute in Audio Mixer: Click M → silences that track in playback
VU meters: Watch for red peaks → reduce fader or normalize
Good level indicators:
- Green zone: Safe, good headroom
- Yellow zone: Nominal — acceptable but watch for peaks
- Red zone: Clipping — reduce level immediately
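To make the meter zones concrete, the Python sketch below measures the peak of a block of samples in dBFS and sorts it into a zone. The thresholds (-6 dB for yellow, -1 dB for red) are placeholder assumptions chosen for the example, not Kdenlive's actual meter boundaries, and samples are assumed to be floats where 1.0 is full scale.

```python
import math


def peak_dbfs(samples):
    """Peak level in dBFS for float samples where 1.0 is full scale."""
    peak = max(abs(s) for s in samples)
    return -math.inf if peak == 0 else 20 * math.log10(peak)


def meter_zone(level_dbfs, yellow_from=-6.0, red_from=-1.0):
    """Classify a level; both thresholds are illustrative assumptions."""
    if level_dbfs >= red_from:
        return "red"
    if level_dbfs >= yellow_from:
        return "yellow"
    return "green"
```

A block that reaches full scale reads 0 dBFS and lands in the red zone, which is the cue to pull the fader down.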
Fixing Common Audio Issues
| Issue | Fix |
|---|---|
| Voice sounds too quiet | Normalize A2 to -3 dB OR raise A2 fader |
| Music overwhelms voice | Lower A3 fader to -20 dB OR add ducking keyframes |
| Echo/room reverb | Hard to remove in post: "Noise Suppressor" helps only slightly, so re-record closer to the mic or with better room treatment |
| Hum or buzz noise | Apply "Notch Filter" at 50 Hz or 60 Hz (the mains frequency, which depends on country) |
| Clicks/pops in recording | Apply "Declick", or cut out the affected frame with the Razor tool (X) and delete it; Ctrl+Z undoes the cut if the result sounds worse |
| Audio and video out of sync | Re-align using waveform clap method, then group |
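To show why a notch removes mains hum without dulling the voice, here is a small Python sketch of a biquad notch using the widely published Audio EQ Cookbook coefficients. It illustrates the technique only and is not the code behind Kdenlive's "Notch Filter" effect. The Q value of 10 and the function names are assumptions.

```python
import math


def notch_coefficients(hum_hz, sample_rate=48000, q=10.0):
    """Biquad notch coefficients, normalized so that a[0] == 1."""
    w0 = 2 * math.pi * hum_hz / sample_rate
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha
    b = [1 / a0, -2 * math.cos(w0) / a0, 1 / a0]
    a = [1.0, -2 * math.cos(w0) / a0, (1 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Filter a list of samples with the biquad (direct form I)."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

Because the notch is only a few hertz wide, a 50 Hz tone is removed almost entirely once the filter settles, while content at 1 kHz passes essentially unchanged. Use 60 for `hum_hz` in 60 Hz countries.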
Checking Audio Before Export
Final pre-render audio checklist:
☐ Voice is clear, peaking around -3 dB and never reaching 0 dB
☐ Music is under voice at -15 dB or below
☐ No red peaks in any track
☐ Fade in at very start (prevent abrupt start)
☐ Fade out at very end (prevent abrupt cutoff)
☐ All muted/disabled tracks re-enabled if intended for output
☐ Normalize effect applied or loudness target verified
Audio Export Settings
Set correct audio in the Render dialog:
Ctrl+Enter → Render dialog
Audio settings:
Codec: AAC (MP4 export for YouTube)
Sample rate: 48000 Hz (standard for video)
Channels: Stereo (2 channels)
Bitrate: 320 kbps (high quality) or use VBR q=2
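For a sense of what the bitrate setting costs in file size, the Python sketch below estimates the size of a constant-bitrate audio stream. It ignores container overhead and does not apply to VBR, and the function name is hypothetical.

```python
def audio_stream_megabytes(duration_s, bitrate_kbps=320):
    """Approximate size of a constant-bitrate audio stream in MB (10^6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000
```

At 320 kbps, a 2-minute screencast carries about 4.8 MB of audio, which is small next to the video stream, so there is little reason to lower the audio bitrate.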
Keyboard Reference
| Action | Shortcut |
|---|---|
| Mute track | Click [M] in Audio Mixer |
| Solo track | Click [S] in Audio Mixer |
| Toggle mute (active track) | Ctrl+Shift+H |
| Play / Stop | Space |
| JKL transport | J / K / L |
| Open Render | Ctrl+Enter |
Hands-On Practice
1. Open a project with V1/A1 (screen recording), A2 (voice), and A3 (music)
2. Add volume keyframes to A3:
Audio Mixer → A3 keyframe mode on
At 0:05 → -15 dB (before voice)
At 0:06 → -25 dB (voice starts → duck)
At 1:45 → -25 dB (hold)
At 1:46 → -15 dB (voice ends → restore)
3. Play through and listen
Adjust keyframe values if music is still too loud during voice
4. Apply loudness normalization to A2 voice track:
Click ⚡ on A2 → add "Audio Loudness Normalization"
Set -14 LUFS
5. Solo each track to check for noise issues
Apply "Noise Suppressor" if hiss is noticeable
6. Final check: Play from start to end
Watch VU meters → no red peaks
7. Ctrl+S → save → ready to render