Merging multiple videos in a template/layout with Python FFMPEG?

enter image description here

ffmpeg -i left.jpg -i video.mp4 -i right.png -i logo.png -filter_complex "[0]scale=(1920-1080*($width/$height))/2:1080:force_original_aspect_ratio=increase,crop=(1920-1080*($width/$height))/2:1080[left];[1]scale=-2:1080[main];[2]scale=(1920-1080*($width/$height))/2:1080:force_original_aspect_ratio=increase,crop=(1920-1080*($width/$height))/2:1080[right];[left][main][right]hstack=inputs=3[bg];[bg][3]overlay=5:main_h-overlay_h-5:format=auto,drawtext=textfile=text.txt:reload=1:x=w-tw-10:y=h-th-10,format=yuv420p" -movflags +faststart output.mp4

What this does is scale video.mp4 so it is 1080 pixels high and autoscale the width. left.jpg and right.png are scaled to take up the remainder so the result is 1920x1080.

This is a simple example and won't work for all inputs such as if the video.mp4 autoscaled width is bigger than 1920, but you can deal with that by using arithmetic expressions.

$width and $height refer to the size of video.mp4. See Getting video dimension / resolution / width x height from ffmpeg for a Python friendly method using JSON or XML.

See documentation for scale, crop, hstack, drawtext, overlay, and format filters.

A much simpler method is to just add colored bars instead of arbitrary images. See Resizing videos with ffmpeg to fit a specific size.

Change text on the fly by atomically updating text.txt. Or use the subtitles filter instead of drawtext if you want the text to change on certain timestamps (you did not specify what you needed).

Related answers:

  • How to position overlay/watermark/logo with ffmpeg?
  • How to position drawtext text?
  • Vertically or horizontally stack (mosaic) several videos using ffmpeg?
  • Getting video dimension / resolution / width x height from ffmpeg

I've done it. The code can be used as a command line program or as a module. To find out more about the command line usage, call it with the --help option. For module usage, import the make_video fucntion in your code (or copy-paste it), and pass the appropriate arguments to it. I have included a screenshot of what my script produced with some sample material, and of course, the code. Code:

#!/usr/bin/python3
#-*-coding: utf-8-*-

import sys, argparse, ffmpeg, os

def make_video(video, left_boarder, right_boarder, picture_file, picture_pos, text_file, text_pos, output):
    videoprobe = ffmpeg.probe(video)
    videowidth, videoheight = videoprobe["streams"][0]["width"], videoprobe["streams"][0]["height"] # get width of main video
    scale = (1080 / videoheight)
    videowidth *= scale
    videoheight *= scale
    videostr = ffmpeg.input(video) # open main video
    audiostr = videostr.audio
    videostr = ffmpeg.filter(videostr, "scale", "%dx%d" %(videowidth, videoheight))
    videostr = ffmpeg.filter(videostr, "pad", 1920, 1080, "(ow-iw)/2", "(oh-ih)/2")
    videostr = videostr.split()
    boarderwidth = (1920 - videowidth) / 2 # calculate width of boarders
    left_boarderstr = ffmpeg.input(left_boarder) # open left boarder and scale it
    left_boarderstr = ffmpeg.filter(left_boarderstr, "scale", "%dx%d" % (boarderwidth, 1080))
    right_boarderstr = ffmpeg.input(right_boarder) # open right boarder
    right_boarderstr = ffmpeg.filter(right_boarderstr, "scale", "%dx%d" % (boarderwidth, 1080))
    picturewidth = boarderwidth - 100 # calculate width of picture
    pictureheight = (1080 / 3) - 100 # calculate height of picture
    picturestr = ffmpeg.input(picture_file) # open picture and scale it (there will be a padding of 100 px around it)
    picturestr = ffmpeg.filter(picturestr, "scale", "%dx%d" % (picturewidth, pictureheight))
    videostr = ffmpeg.overlay(videostr[0], left_boarderstr, x=0, y=0) # add left boarder
    videostr = ffmpeg.overlay(videostr, right_boarderstr, x=boarderwidth + videowidth, y=0) #add right boarder
    picture_y = (((1080 / 3) * 2) + 50) # calculate picture y position for bottom alignment
    if picture_pos == "top":
        picture_y = 50
    elif picture_pos == "center":
        picture_y = (1080 / 3) + 50
    videostr = ffmpeg.overlay(videostr, picturestr, x=50, y=picture_y)
    text_x = (1920 - boarderwidth) + 50
    text_y = ((1080 / 3) * 2) + 50
    if text_pos == "center":
        text_y = (1080 / 3) + 50
    elif text_pos == "top":
        text_y = 50
    videostr = ffmpeg.drawtext(videostr, textfile=text_file, reload=1, x=text_x, y=text_y, fontsize=50)
    videostr = ffmpeg.output(videostr, audiostr, output)
    ffmpeg.run(videostr)

def main():
    #create ArgumentParser and add options to it
    argp = argparse.ArgumentParser(prog="ffmpeg-template")
    argp.add_argument("--videos", help="paths to main videos (default: video.mp4)", nargs="*", default="video.mp4")
    argp.add_argument("--left-boarders", help="paths to images for left boarders (default: left_boarder.png)", nargs="*", default="left_boarder.png")
    argp.add_argument("--right-boarders", help="paths to images for right boarders (default: right_boarder.png)", nargs="*", default="right_boarder.png")
    argp.add_argument("--picture-files", nargs="*", help="paths to pictures (default: picture.png)",  default="picture.png")
    argp.add_argument("--picture-pos", help="where to put the pictures (default: bottom)", choices=["top", "center", "bottom"], default="bottom")
    argp.add_argument("--text-files", nargs="*", help="paths to files with text (default: text.txt)", default="text.txt")
    argp.add_argument("--text-pos", help="where to put the texts (default: bottom)", choices=["top", "center", "bottom"], default="bottom")
    argp.add_argument("--outputs", nargs="*", help="paths to outputfiles (default: out.mp4)", default="out.mp4")
    args = argp.parse_args()
    # if only one file was provided, put it into a list (else, later, every letter of the filename will be treated as a filename)
    if type(args.videos) == str:
        args.videos = [args.videos]
    if type(args.left_boarders) == str:
        args.left_boarders = [args.left_boarders]
    if type(args.right_boarders) == str:
        args.right_boarders = [args.right_boarders]
    if type(args.picture_files) == str:
        args.picture_files = [args.picture_files]
    if type(args.text_files) == str:
        args.text_files = [args.text_files]
    if type(args.outputs) == str:
        args.outputs = [args.outputs]

    for i in (range(0, min(len(args.videos), len(args.left_boarders), len(args.right_boarders), len(args.picture_files), len(args.text_files), len(args.outputs))) or [0]):
        print("Info : merging video %s, boarders %s %s, picture %s and textfile %s into %s" % (args.videos[i], args.left_boarders[i], args.right_boarders[i], args.picture_files[i], args.text_files[i], args.outputs[i]))
        # check if all files provided with the options exist
        if not os.path.isfile(args.videos[i]):
            print("Error : video %s was not found" % args.videos[i])
            continue
        if not os.path.isfile(args.left_boarders[i]):
            print("Error : left boarder %s was not found" % args.left_boarders[i])
            continue
        if not os.path.isfile(args.right_boarders[i]):
            print("Error : rightt boarder %s was not found" % args.right_boarders[i])
            continue
        if not os.path.isfile(args.picture_files[i]):
            print("Error : picture %s was not found" % args.picture_files[i])
            continue
        if not os.path.isfile(args.text_files[i]):
            print("Error : textfile %s was not found" % args.text_files[i])
            continue
        try:
            make_video(args.videos[i], args.left_boarders[i], args.right_boarders[i], args.picture_files[i], args.picture_pos, args.text_files[i], args.text_pos, args.outputs[i])
        except Exception as e:
            print(e)

if __name__ == "__main__":
    main()

Example for direct usage as a script:

$ ./ffmpeg-template --videos input1.mp4 inout2.mp4 --left-boarders left_boarder1.png left_boarder2.png --right-boarders right_boarder1.png right_boarder2.png --picture-files picture1.png picture2.png --text-files text1.txt text2.png --outputs out1.mp4 out2.mp4 --picture-pos bottom --text-pos bottom

As values for the options i took the defaults. If you omit the options, these defaults will be used, and if one of the files is not found, an error message will be displayed.

Image: enter image description here