What is the difference between AV_SAMPLE_FMT_S16P and AV_SAMPLE_FMT_S16?

AV_SAMPLE_FMT_S16P is planar signed 16-bit audio, i.e. 2 bytes per sample, which is the same sample size as AV_SAMPLE_FMT_S16.

The only difference is the memory layout. In AV_SAMPLE_FMT_S16 the samples of the channels are interleaved, i.e. if you have two-channel audio then the sample buffer will look like

c1 c2 c1 c2 c1 c2 c1 c2...

where c1 is a sample for channel 1 and c2 is a sample for channel 2.

For one frame of planar audio, on the other hand, all samples of a channel are grouped together in that channel's own plane, so you will have something like

c1 c1 c1 c1 .... c2 c2 c2 c2 ..
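
To make the two layouts concrete, here is a minimal sketch in plain C that splits an interleaved S16 buffer into one plane per channel. It does not use FFmpeg at all; the function name deinterleave_s16 and its parameters are made up for illustration, and in real code you would more likely let libswresample do the conversion.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an FFmpeg function): copy an interleaved
 * S16 buffer (c1 c2 c1 c2 ...) into one plane per channel
 * (c1 c1 ... and c2 c2 ...).
 * Each planes[ch] must have room for nb_samples values. */
static void deinterleave_s16(const int16_t *interleaved, int16_t **planes,
                             size_t nb_samples, size_t nb_channels)
{
    for (size_t s = 0; s < nb_samples; s++) {
        for (size_t ch = 0; ch < nb_channels; ch++) {
            /* Sample s of channel ch sits at s * nb_channels + ch
             * in the interleaved buffer. */
            planes[ch][s] = interleaved[s * nb_channels + ch];
        }
    }
}
```

For a stereo input of { 1, 10, 2, 20, 3, 30 } with 3 samples per channel, plane 0 ends up as { 1, 2, 3 } and plane 1 as { 10, 20, 30 }.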

Now, how is it stored in AVFrame:

  • for planar audio

data[i] will contain the data of channel i (assuming channel 0 is the first channel).

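However, data[] only has room for 8 pointers (AV_NUM_DATA_POINTERS), so if you have more than 8 channels you must use the extended_data attribute of AVFrame to reach all of them. extended_data[i] points to the plane of channel i for every channel, not just the extra ones; when everything fits in data[], extended_data simply points to data.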

  • for non-planar audio

data[0] will contain the data for all channels in an interleaved manner.
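
The indexing rules above can be sketched as a small helper. This is only an illustration under stated assumptions: get_sample_s16 is a hypothetical function, and it takes plain arguments instead of an AVFrame so that it compiles without FFmpeg. In real code planes would be frame->extended_data, is_planar would come from av_sample_fmt_is_planar(), and the channel count would come from the frame.

```c
#include <stdint.h>

/* Hypothetical helper (not part of FFmpeg): return sample `index` of
 * channel `ch` from 16-bit signed audio.
 * `planes` stands in for AVFrame.extended_data:
 *   - planar (S16P): planes[ch] holds only the samples of channel ch
 *   - packed (S16):  planes[0] holds all channels interleaved */
static int16_t get_sample_s16(uint8_t *const *planes, int is_planar,
                              int nb_channels, int ch, int index)
{
    if (is_planar) {
        const int16_t *plane = (const int16_t *)planes[ch];
        return plane[index];
    }
    const int16_t *packed = (const int16_t *)planes[0];
    return packed[index * nb_channels + ch];
}
```

Going through extended_data rather than data keeps the planar branch valid for more than 8 channels as well.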