How to overlay/downmix two audio files using ffmpeg

This command mixes two audio files into a single MP3 using the amix filter:

ffmpeg -y -i ad_sound/whistle.mp3 -i ad_sound/4s.wav -filter_complex "[0:0][1:0] amix=inputs=2:duration=longest" -c:a libmp3lame ad_sound/outputnow.mp3

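Here duration=longest makes the output as long as the longer of the two inputs, and -y overwrites the output file without asking.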


stereo + stereo → stereo

Normal downmix


Use the amix filter:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex amix=inputs=2:duration=longest output.mp3

Or the amerge filter:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex amerge=inputs=2 -ac 2 output.mp3

Downmix each input into specific output channel


Use the amerge and pan filters:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex "amerge=inputs=2,pan=stereo|c0<c0+c1|c1<c2+c3" output.mp3

mono + mono → stereo


Use the join filter:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex join=inputs=2:channel_layout=stereo output.mp3

Or amerge:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex amerge=inputs=2 output.mp3

mono + mono → mono


Use the amix filter:

ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex amix=inputs=2:duration=longest output.mp3

More info and examples

See FFmpeg Wiki: Audio Channels
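The cases above can be folded into a small shell helper. This is only a sketch: mix_filter is a made-up name (not part of ffmpeg), it assumes both inputs have the same channel count, and it only knows the three combinations listed in this answer.

```shell
# Hypothetical helper, not an ffmpeg feature.
# mix_filter IN_CHANNELS OUT_CHANNELS
#   IN_CHANNELS  = channels in each of the two inputs (1 or 2)
#   OUT_CHANNELS = channels wanted in the output (1 or 2)
# Prints the filtergraph this answer uses for that combination.
mix_filter() {
    case "$1:$2" in
        2:2|1:1) echo "amix=inputs=2:duration=longest" ;;
        1:2)     echo "join=inputs=2:channel_layout=stereo" ;;
        *)       echo "unsupported combination: $1 -> $2" >&2; return 1 ;;
    esac
}

# Example use (needs ffmpeg and two mono inputs):
# ffmpeg -i input0.mp3 -i input1.mp3 -filter_complex "$(mix_filter 1 2)" output.mp3
```

Any other combination is rejected rather than guessed at, since the answer does not cover it.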


The amix filter mixes multiple audio inputs into a single output.

Consider the following command:

ffmpeg -i INPUT1 -i INPUT2 -i INPUT3 -filter_complex amix=inputs=3:duration=first:dropout_transition=3 OUTPUT

This command mixes three input audio streams into a single output that has the same duration as the first input, with a dropout transition time of 3 seconds. (The example I actually ran, further down, uses two mp3 files.)

The amix filter accepts the following parameters:

  • inputs: The number of inputs. If unspecified, it defaults to 2.

  • duration: How to determine the end-of-stream.

    • longest: The duration of the longest input. (default)

    • shortest: The duration of the shortest input.

    • first: The duration of the first input.

  • dropout_transition: The transition time, in seconds, for volume renormalization when an input stream ends. The default value is 2 seconds.

For example, I ran the following command on Ubuntu 16.04.1 with FFmpeg version 3.2.1-1:

ffmpeg -i background.mp3 -i bSound.mp3 -filter_complex amix=inputs=2:duration=first:dropout_transition=0 -codec:a libmp3lame -q:a 0 OUTPUT.mp3

-codec:a libmp3lame -q:a 0 selects the LAME MP3 encoder with a variable bit rate at its highest quality setting. Remember that ffmpeg needs the libmp3lame library for this, so install it if necessary. The command also works without the -codec:a libmp3lame -q:a 0 part.

Reference: https://ffmpeg.org/ffmpeg-filters.html#amix
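If you build these option strings in a script, a small wrapper can fill in the defaults listed above and catch a mistyped duration value. A minimal sketch, assuming a POSIX shell; amix_filter is a hypothetical name of mine, not something ffmpeg provides:

```shell
# Hypothetical helper, not an ffmpeg feature.
# amix_filter [INPUTS] [DURATION] [DROPOUT_TRANSITION]
# Prints an amix option string, using the defaults described above
# (inputs=2, duration=longest, dropout_transition=2) for missing arguments.
amix_filter() {
    amix_inputs=${1:-2}
    amix_duration=${2:-longest}
    amix_dropout=${3:-2}
    case "$amix_duration" in
        longest|shortest|first) ;;
        *) echo "duration must be longest, shortest or first" >&2; return 1 ;;
    esac
    echo "amix=inputs=${amix_inputs}:duration=${amix_duration}:dropout_transition=${amix_dropout}"
}

# Example use (needs ffmpeg):
# ffmpeg -i background.mp3 -i bSound.mp3 -filter_complex "$(amix_filter 2 first 0)" OUTPUT.mp3
```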


To merge two audio files with different volumes and different durations, the following command works:

ffmpeg -y -i audio1.mp3 -i audio2.mp3 -filter_complex "[0:0]volume=0.09[a];[1:0]volume=1.8[b];[a][b]amix=inputs=2:duration=longest" -c:a libmp3lame output.mp3

Here duration can be set to longest, shortest, or first, and you can change the volume levels to suit your needs.
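The volume values are plain multipliers, so it can help to see them in decibels when picking levels. A rough sketch using awk; gain_to_db is a hypothetical helper name, and the conversion is the usual 20*log10(factor) for amplitude:

```shell
# Hypothetical helper: convert a linear volume factor to decibels.
# gain_to_db FACTOR
gain_to_db() {
    LC_ALL=C awk -v g="$1" 'BEGIN { printf "%.1f\n", 20 * log(g) / log(10) }'
}

gain_to_db 0.09   # prints -20.9 (the first input is turned down by about 21 dB)
gain_to_db 1.8    # prints 5.1   (the second input is boosted by about 5 dB)
```

As far as I know the volume filter also accepts a dB suffix directly (for example volume=-20.9dB), which may be easier to reason about than raw factors.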

If you want to add background music under a voice track, use the following command. The music is turned down while the voice is present and comes back up automatically in the gaps:

ffmpeg -i bgmusic.mp3 -i audio.mp3 -filter_complex "[1:a]asplit=2[sc][mix];[0:a][sc]sidechaincompress=threshold=0.003:ratio=20[bg]; [bg][mix]amerge[final]" -map "[final]" final.mp3

Here threshold is the level the voice has to exceed before the music is turned down: the lower the threshold, the quieter the speech that still triggers the ducking. ratio sets how strongly the music is compressed once the voice is above the threshold: the higher the ratio, the stronger the compression.
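To experiment with different threshold and ratio values, the filtergraph can be generated by a small function. This is a sketch only: duck_filter is a name I made up, and the accepted ranges (threshold 0.00097563 to 1, ratio 1 to 20) are what I read in the sidechaincompress documentation, so check them against your ffmpeg version.

```shell
# Hypothetical helper, not an ffmpeg feature.
# duck_filter THRESHOLD RATIO
# Prints the ducking filtergraph from the command above with the given
# sidechaincompress settings. Input 0 is the music, input 1 is the voice.
duck_filter() {
    # Range check (assumed limits, taken from the filter documentation).
    if ! LC_ALL=C awk -v t="$1" -v r="$2" \
        'BEGIN { exit !(t >= 0.00097563 && t <= 1 && r >= 1 && r <= 20) }'; then
        echo "threshold must be in [0.00097563, 1] and ratio in [1, 20]" >&2
        return 1
    fi
    echo "[1:a]asplit=2[sc][mix];[0:a][sc]sidechaincompress=threshold=$1:ratio=$2[bg];[bg][mix]amerge[final]"
}

# Example use (needs ffmpeg):
# ffmpeg -i bgmusic.mp3 -i audio.mp3 -filter_complex "$(duck_filter 0.003 20)" -map "[final]" final.mp3
```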