How to play AudioStream response in AWS Polly using JavaScript SDK?

Elliott's Chatty Kathy code worked beautifully for me, but there are two separate issues with Safari and mobile.

Safari: When creating the blob, the content type MUST be specified:

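var blob = new Blob([arrayBuffer], {type: 'audio/mpeg'});
// Prefer the standard URL API; fall back to the prefixed webkitURL on old Safari.
var url = (window.URL || window.webkitURL).createObjectURL(blob);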
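
As a rough sketch of the Safari fix, the helper below wraps the audio bytes in a typed Blob. The name audioStreamToBlob is hypothetical. It assumes the response object carries the AudioStream and ContentType fields returned by synthesizeSpeech, that AudioStream is a Uint8Array, and it falls back to 'audio/mpeg' (that is, OutputFormat 'mp3') when ContentType is missing.

```javascript
// Hypothetical helper: wrap a Polly response's audio bytes in a typed Blob.
// Assumes result.AudioStream is a Uint8Array, as in the browser SDK.
function audioStreamToBlob(result) {
  // Use the content type reported by Polly; assume MP3 if it is absent.
  var type = result.ContentType || 'audio/mpeg';
  // Passing the typed array itself (not its .buffer) keeps only the bytes
  // the view covers, even if it is a window into a larger ArrayBuffer.
  return new Blob([result.AudioStream], { type: type });
}

// Example with stand-in bytes instead of a real Polly response.
var fakeResult = {
  AudioStream: new Uint8Array([0xff, 0xfb, 0x90, 0x00]),
  ContentType: 'audio/mpeg'
};
var blob = audioStreamToBlob(fakeResult);
var url = URL.createObjectURL(blob); // assign this to audioElement.src
```

Reading the type from the response rather than hard-coding it means the same helper should still work if the request is later switched to another output format.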

Mobile: The above applies, and playback must also be initiated by a user touch event. Note: Older iOS versions seem to require that audio.play() be called synchronously inside the touch event handler, so a touch event that starts a promise chain which eventually calls audio.play() will fail. Later iOS versions seem to be smarter about this.
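
One way to satisfy the touch requirement is to fetch and prepare the audio first, then call play() directly inside the tap handler. The sketch below assumes that approach. playOnTap is a hypothetical name, and button and audio stand for any event target and any object with a play() method (an HTMLAudioElement in the browser).

```javascript
// Hypothetical helper: start playback synchronously inside a user gesture.
// `button` is anything with addEventListener; `audio` should already have
// its src set (for example to a blob URL) before the user taps.
function playOnTap(button, audio, onError) {
  // A tap on mobile also fires 'click', so one listener covers both cases.
  button.addEventListener('click', function () {
    // Call play() directly in the handler, not after an await or a .then(),
    // so that older iOS versions still treat it as user-initiated.
    var pending = audio.play();
    // Modern browsers return a promise that rejects if playback is blocked.
    if (pending && typeof pending.catch === 'function') {
      pending.catch(onError || function () {});
    }
  }, { once: true });
}

// Browser usage (assumes a button and an <audio> element already exist):
//   audioElement.src = url;
//   playOnTap(document.getElementById('speak'), audioElement);
```

The point of the design is ordering: the network call to Polly happens before the tap, so nothing asynchronous sits between the gesture and play().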


Using the Web Audio API:

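// Inside an async function; params holds OutputFormat, Text and VoiceId.
const result = await polly.synthesizeSpeech(params).promise();

const aContext = new AudioContext();

const source = aContext.createBufferSource();
// AudioStream is typically a Uint8Array in the browser; decode its underlying ArrayBuffer.
source.buffer = await aContext.decodeAudioData(result.AudioStream.buffer);
source.connect(aContext.destination);
source.start();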
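
One caveat: result.AudioStream.buffer is the whole underlying ArrayBuffer, which can be larger than the audio if the Uint8Array is a view into a bigger buffer. A defensive sketch, assuming AudioStream is a Uint8Array, copies just the viewed bytes first. toArrayBuffer is a hypothetical helper name.

```javascript
// Hypothetical helper: copy exactly the bytes a typed-array view covers
// into a fresh ArrayBuffer that can be handed to decodeAudioData.
function toArrayBuffer(view) {
  return view.buffer.slice(view.byteOffset, view.byteOffset + view.byteLength);
}

// Stand-in data: a 3-byte view into a larger 8-byte buffer.
var backing = new Uint8Array([0, 1, 2, 3, 4, 5, 6, 7]);
var view = backing.subarray(2, 5);
var copy = toArrayBuffer(view);
// copy.byteLength is 3, while view.buffer.byteLength is 8.

// In the browser this would be used as:
//   source.buffer = await aContext.decodeAudioData(toArrayBuffer(result.AudioStream));
```

Because the copy is a separate buffer, the original AudioStream bytes remain available afterwards, for example to build a Blob from them as well.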

Docs:

  • AudioContext
  • Decode ArrayBuffer

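// audioStream is the AudioStream field of the synthesizeSpeech response.
var uInt8Array = new Uint8Array(audioStream);
var arrayBuffer = uInt8Array.buffer;
// Safari needs an explicit MIME type; 'audio/mpeg' assumes OutputFormat 'mp3'.
var blob = new Blob([arrayBuffer], {type: 'audio/mpeg'});
var url = URL.createObjectURL(blob);

audioElement.src = url;
audioElement.play();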
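
Each URL made with URL.createObjectURL keeps its blob alive until the URL is revoked, so an app that speaks many phrases may want to release it once playback ends. A minimal sketch, assuming an audio element that fires the standard 'ended' event. playAndRelease is a hypothetical name, and the urlApi parameter exists only so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: play a blob URL and revoke it when playback finishes.
// `urlApi` defaults to the global URL object; pass a stand-in when testing.
function playAndRelease(audioElement, url, urlApi) {
  urlApi = urlApi || URL;
  // Register the cleanup before starting playback so 'ended' is not missed.
  audioElement.addEventListener('ended', function () {
    urlApi.revokeObjectURL(url);
  }, { once: true });
  audioElement.src = url;
  return audioElement.play();
}

// Browser usage, continuing from the snippet above:
//   playAndRelease(audioElement, url);
```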

I created a JavaScript library called ChattyKathy that handles the entire process for you, if you want to take the easy way out.

Just pass it an AWS Credentials object and then tell her what to say. She'll call AWS, transform the response, and play the audio.

var settings = {
    awsCredentials: awsCredentials,
    awsRegion: "us-west-2",
    pollyVoiceId: "Justin",
    cacheSpeech: true
};

var kathy = ChattyKathy(settings);

kathy.Speak("Hello world, my name is Kathy!");
kathy.Speak("I can be used for an amazing user experience!");