Google speech API throws Invalid audio channel count

Audio recorded on a Mac is most likely stereo, but the API currently seems to support only 1-channel (mono) audio. From the Audio Encoding section of the docs:

Audio encoding of the data sent in the audio message. All encodings support only 1 channel (mono) audio.

The simplest solution here might be to just convert your sample to mono using something like Audacity.
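If you would rather do the conversion in code than in an audio editor, a rough sketch in Node.js could look like the one below. It is only an illustration, not something from the Speech API docs: `stereoToMono` is a made-up helper name, and it assumes the input is raw, headerless, interleaved 16-bit little-endian PCM (what LINEAR16 expects). A WAV file would need its header stripped and rewritten separately.

```javascript
// Hypothetical helper: downmix interleaved 16-bit little-endian stereo PCM
// to mono by averaging the left and right sample of each frame.
// Assumes raw PCM with no container header.
function stereoToMono(stereo) {
  const BYTES_PER_FRAME = 4; // 2 channels x 2 bytes per sample
  const frameCount = Math.floor(stereo.length / BYTES_PER_FRAME);
  const mono = Buffer.alloc(frameCount * 2);
  for (let i = 0; i < frameCount; i++) {
    const left = stereo.readInt16LE(i * BYTES_PER_FRAME);
    const right = stereo.readInt16LE(i * BYTES_PER_FRAME + 2);
    // Arithmetic shift gives the floored integer average, which always
    // stays inside the signed 16-bit range.
    mono.writeInt16LE((left + right) >> 1, i * 2);
  }
  return mono;
}
```

The resulting buffer is half the size of the input and can be sent as ordinary single-channel audio.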


Multi-channel audio is now supported in Google Cloud. I still hit this error, however, because I was sending a stereo audio file and the sample code in the documentation does not set the channel count (audioChannelCount). Setting it explicitly in the request config fixes the error, as documented at https://cloud.google.com/speech-to-text/docs/multi-channel:

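const config = {
  encoding: 'LINEAR16',
  languageCode: 'en-US',
  // Must match the number of channels actually present in the audio file.
  audioChannelCount: 2,
  // Transcribe each channel separately instead of only the first one.
  enableSeparateRecognitionPerChannel: true,
};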
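To avoid hard-coding the channel count, one option is to read it from the file itself. The sketch below is an assumption-laden illustration rather than anything from the linked docs: `configFromWavHeader` is a hypothetical helper, and it only handles a canonical WAV file whose `fmt ` chunk starts at byte 12 (channel count at offset 22, sample rate at offset 24). Files with extra chunks before `fmt ` would need a real parser. It also fills in `sampleRateHertz`, which the config above leaves out.

```javascript
// Hypothetical helper: build a recognition config from a canonical
// RIFF/WAVE header. Assumes the "fmt " chunk immediately follows the
// RIFF header, which is common but not guaranteed.
function configFromWavHeader(wav, languageCode) {
  if (
    wav.length < 44 ||
    wav.toString('ascii', 0, 4) !== 'RIFF' ||
    wav.toString('ascii', 8, 12) !== 'WAVE' ||
    wav.toString('ascii', 12, 16) !== 'fmt '
  ) {
    throw new Error('Not a canonical WAV header');
  }
  const channels = wav.readUInt16LE(22); // number of channels
  const sampleRate = wav.readUInt32LE(24); // samples per second
  return {
    encoding: 'LINEAR16',
    sampleRateHertz: sampleRate,
    languageCode,
    audioChannelCount: channels,
    // Only request per-channel transcripts when there is more than one channel.
    enableSeparateRecognitionPerChannel: channels > 1,
  };
}
```

You would call it with the file contents, for example `configFromWavHeader(fs.readFileSync(path), 'en-US')`, and pass the result as the `config` part of the recognize request.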