Recording synthesized text-to-speech to a file in Python

You can call espeak with the -w argument using subprocess.

import subprocess

def textToWav(text, file_name):
    # -w makes espeak write a WAV file instead of playing the audio
    subprocess.call(["espeak", "-w", file_name + ".wav", text])

textToWav('hello world','hello')

This writes file_name.wav (hello.wav for the call above) without speaking the text aloud. If your text is in a file (e.g. text.txt), pass the file's name to espeak with the -f option ("-f"+file name) instead of passing the text itself. I'd recommend reading the espeak man page to see all the options available.
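For the -f case, something along these lines should work. It is only a sketch: the helper names build_espeak_command and text_file_to_wav are made up for illustration, and it assumes espeak is installed and on your PATH.

```python
import subprocess


def build_espeak_command(text_path, wav_path):
    # Hypothetical helper: builds the argument list for espeak.
    # -f reads the text to speak from a file, -w writes the audio to a WAV file.
    return ["espeak", "-f", text_path, "-w", wav_path]


def text_file_to_wav(text_path, wav_path):
    # Runs espeak and returns its exit code (0 means success).
    # Assumes the espeak executable can be found on PATH.
    return subprocess.call(build_espeak_command(text_path, wav_path))
```

You would then call text_file_to_wav("text.txt", "hello.wav"). Keeping the command in its own function makes it easy to print and check before running it.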

Hope this helps.


On Windows, you can use a more advanced SAPI wrapper to save the output to a WAV file. For example, you can try:

https://github.com/DeepHorizons/tts

The code should look like this:

import tts.sapi
voice = tts.sapi.Sapi()
voice.set_voice("Joey")
voice.create_recording('hello.wav', "Hello")

On macOS, here is an example that uses the NSSpeechSynthesizer API (through PyObjC) to write the speech to an AIFF file:

#!/usr/bin/env python

from AppKit import NSSpeechSynthesizer
import sys
import Foundation


# Take the text from the command line, or prompt for it (Python 3: input replaces raw_input)
if len(sys.argv) < 2:
    text = input('type text to speak> ')
else:
    text = sys.argv[1]

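# Create a synthesizer that uses the default system voice
ve = NSSpeechSynthesizer.alloc().init()
ve.setRate_(100)  # speaking rate in words per minute
# 'yourpath/test.aiff' is a placeholder; replace it with your own output path
url = Foundation.NSURL.fileURLWithPath_('yourpath/test.aiff')
# Synthesize the text into the AIFF file instead of playing it
ve.startSpeakingString_toURL_(text, url)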