Different between Google Speech API and Web Speech API

The Web Speech API is a W3C supported specification that allows browser vendors to supply a speech recognition engine of their choosing (be it local or cloud-based) that backs an API you can use directly from the browser without having to worry about API limits and the like. You could imagine that Apple might power this with Siri and Microsoft might power this with Cortana. Again, browser vendors could opt to use the built in dictation software in the operating system, but that doesn't seem to currently be the trend. If your trying to perform simple speech synthesis in a browser (e.g. voice commands), this is likely the best path to take, especially as adoption grows.

The Google Speech API is a cloud-based solution that allows you to use Google's speech software outside of a browser. It also provides broader language support and can transcribe longer audio files. If you have a 20min audio recording you want to transcribe, this would be the path to take. As of the time of this writing, Google charges $0.006 for every 15s recorded after the first hour for this service.


The Web API is REST based API with API key authentication, especially for web pages which needs a a simple feature set.

While Google Speech API basically is a gRPC API with various authentication method. There are lot feature is available when you use gRPC, like authentication, faster calling, and streaming!!!