Detecting and printing timestamps of periods of silence using SoX

Unfortunately not Sox, but ffmpeg has a silencedetect filter that does exactly what you're looking for:

ffmpeg -i in.wav -af silencedetect=noise=-50dB:d=1 -f null -

(detecting threshold of -50db, for a minimum of 1 seconds, cribbed from the ffmpeg documentation)

...this would print a result like this:

Press [q] to stop, [?] for help
[silencedetect @ 0x7ff2ba5168a0] silence_start: 264.718
[silencedetect @ 0x7ff2ba5168a0] silence_end: 265.744 | silence_duration: 1.02612
size=N/A time=00:04:29.53 bitrate=N/A

There is (currently, at least) no way to make the silence effect output the position where it has detected silence, or to retain all of the silent audio.

If you are able to recompile SoX yourself, you could add an output statement yourself to find out about the cut positions, then use trim in a separate invocation to split the file. With the stock version, you are out of luck.


SoX can easily give you the timestamps of the actual silences in a text file. Not periods of silence though, but you can calculate those with a simple script

   .dat   Text  Data  files.   These  files  contain a textual representation of the sample data.  There is one line at the beginning that contains the sample
          rate, and one line that contains the number of channels.  Subsequent lines contain two or more numeric data intems: the time since the beginning  of
          the first sample and the sample value for each channel.

          Values are normalized so that the maximum and minimum are 1 and -1.  This file format can be used to create data files for external programs such as
          FFT analysers or graph routines.  SoX can also convert a file in this format back into one of the other file formats.

          Example containing only 2 stereo samples of silence:

              ; Sample Rate 8012
              ; Channels 2
                          0   0    0
              0.00012481278   0    0

So you can do sox in.wav out.dat, then parse the text file and consider a silence a sequence of rows with a value close to 0 (depending on your threshold)

Tags:

Audio

Sox