Which language should I use to write speech recognition software?

My students are using Sphinx. It is written in Java (a rewrite of earlier engines written in C, I believe). It might not be suitable for what you want (I think you would need to create your own dictionary), but it is worth checking out.


I agree with Pax that this is potentially quite a big project, and that the most practical solution is probably just to license an existing engine.

If all you want to do is distinguish between a few utterances known in advance, it's a significantly smaller project, but still a considerable one.

But... if you decide you really really really do want to start developing your own, I can't see a reason not to use Java. The idea that "C is faster" is largely a myth (or based on out-of-date information).
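
To give a feel for the small-vocabulary case, here is a minimal sketch in Java of one classic approach: store one recorded template per known utterance and pick the template closest to the input under dynamic time warping (DTW). DTW is my choice for illustration, not something the other answers propose. The class name `DtwRecognizer`, its methods, and the toy numbers are all hypothetical, and I assume the audio has already been reduced to one feature value per frame (say, energy). A real system would use multi-dimensional features such as MFCCs and needs a front end to extract them, which is where much of the work lies.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: nearest-template utterance matching with dynamic time warping. */
final class DtwRecognizer {

    /** DTW distance between two sequences of per-frame feature values. */
    static double dtwDistance(double[] a, double[] b) {
        int n = a.length, m = b.length;
        // cost[i][j] = cheapest alignment of the first i frames of a with the first j frames of b
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                double frameDist = Math.abs(a[i - 1] - b[j - 1]);
                // Advance both sequences, or stretch one of them by repeating a frame.
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = frameDist + best;
            }
        }
        return cost[n][m];
    }

    /** Returns the label of the closest template, or null if no template is comparable. */
    static String classify(double[] input, Map<String, double[]> templates) {
        String bestLabel = null;
        double bestDist = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[]> entry : templates.entrySet()) {
            double dist = dtwDistance(input, entry.getValue());
            if (dist < bestDist) {
                bestDist = dist;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        // Toy templates: made-up per-frame values, not real speech data.
        Map<String, double[]> templates = new LinkedHashMap<>();
        templates.put("yes", new double[] {0.1, 0.9, 0.8, 0.2});
        templates.put("no", new double[] {0.7, 0.6, 0.1, 0.1});

        // The "yes" shape spoken more slowly: some frames are repeated.
        double[] spoken = {0.1, 0.1, 0.9, 0.9, 0.8, 0.2};
        System.out.println(classify(spoken, templates));
    }
}
```

The point of the warping step is that the same word spoken faster or slower still lines up with its template, which plain frame-by-frame comparison cannot handle. The table fill is a couple of nested loops over arrays of doubles, so this part at least does not obviously call for a different language.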


Java may be suited to building an interface to it, but speech recognition requires serious raw grunt. I'd choose a compiled, close-to-the-metal language like C for the actual recognition engine.

This is not something to be undertaken lightly, by the way. There's an awful lot of theory you'll need to learn even before you begin. Myself, I would license one of the existing engines if possible, and concentrate on building a decent product around it.

That's if your intent is to build a product. If you just want to experiment, by all means write your own. It'll be fun (up to a point :-).