Napalm Automation

All about API

speech recognition

Google has given third-party developers a gift: it has opened access to the programming interfaces of the Cloud Speech API. Access is free for now; pricing will be announced later.

Speech recognition works for 80 languages. The service can recognize live speech from a microphone or audio from files (apparently up to 2 minutes long). Multiple formats are supported, including FLAC, AMR and PCMU.
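As a rough illustration of the file-based case, the sketch below builds a JSON request body for a short audio clip. The announcement does not give any field names; the ones used here (`config`, `encoding`, `sampleRateHertz`, `languageCode`, `audio`, `content`) follow the REST interface Google later published for Cloud Speech v1, so treat them as assumptions. The helper `build_recognize_request` and the extension table are hypothetical, and no network call is made.

```python
import base64
import json

# Assumed mapping from file extension to the API's encoding name.
# PCMU (G.711 mu-law) appears as MULAW in the later v1 reference.
ENCODINGS = {".flac": "FLAC", ".amr": "AMR", ".pcmu": "MULAW"}


def build_recognize_request(audio_bytes, extension, sample_rate_hertz,
                            language_code="en-US"):
    """Return a JSON string describing one short recognition request."""
    ext = extension.lower()
    if ext not in ENCODINGS:
        raise ValueError(f"unsupported audio format: {extension}")
    body = {
        "config": {
            "encoding": ENCODINGS[ext],
            # The caller supplies the real sample rate of the recording.
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # Raw audio bytes are base64-encoded inside the JSON body.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }
    return json.dumps(body)
```

In a real program the resulting string would be sent to the recognition endpoint together with credentials; both are left out here because the article does not describe them.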

With the Cloud Speech API, developers can now build voice control, for example, into any program. The system returns recognized text in real time, while the audio is still being processed.
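To make the voice-control idea concrete, here is a minimal sketch of the application side only: once the service has returned a transcript, the program maps it to an action. The command phrases and the `dispatch` helper are hypothetical, and the transcript is passed in as a plain string rather than fetched from the API.

```python
def dispatch(transcript, commands):
    """Run the first command whose phrase occurs in the transcript."""
    # Normalize case and whitespace before matching.
    text = " ".join(transcript.lower().split())
    for phrase, action in commands.items():
        if phrase in text:
            return action()
    # No known phrase was recognized.
    return None


# Hypothetical commands for a media player.
COMMANDS = {
    "play": lambda: "playing",
    "pause": lambda: "paused",
    "next track": lambda: "skipped",
}
```

Plain substring matching is deliberately simplistic (the phrase "play" would also match "display"); a real application would probably match whole words or use a small grammar.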

Google says the Speech API is accurate enough to work even against noisy backgrounds, so recordings do not need to be cleaned up with filters beforehand or captured with expensive noise-canceling equipment and microphones.

Automatic filtering of unwanted content is supported for some languages.

Rumors that the interfaces would be opened have been circulating for the past few weeks. Experts suggested that Google was about to enter a market currently served by Nuance and a few other companies specializing in speech recognition. Those companies will now find it hard to compete with Google: its system uses the latest advances in self-learning neural networks, the same engine that powers Google’s voice search and voice typing in Google’s keyboard. With each passing month, the Cloud Speech API will recognize speech more and more accurately.

The company announced the news about the Cloud Speech API yesterday at the NEXT conference. In addition to speech recognition, developers now have access to the Cloud Machine Learning platform.

The opening of Google’s speech recognition API will hurt not only the specialized companies but also Apple, whose Siri voice assistant is significantly inferior to Google’s neural network in recognition accuracy and functionality.
