The play action seems at first sight an odd place to feature Speech Recognition.

The run_speech_menu action, being somewhat more restricted, is ideal for menu-driven applications. Here, a file or TTS prompt is played, and the user's response, which must be one of a set of specified words or short phrases, is passed to the selected next_page. For example, the prompt might be: "Would you like to speak to Sales, Marketing or Support?"
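The menu behaviour described for run_speech_menu can be pictured with a small sketch. This is not the Aculab Cloud API: the platform does the matching for you, and the `MENU` table, the page paths and the `select_next_page` helper are hypothetical names used only to illustrate the idea of mapping one of a fixed set of phrases to a next page.

```python
from typing import Dict, Optional

# Hypothetical menu: each allowed phrase maps to the page handled next.
MENU: Dict[str, str] = {
    "sales": "/sales",
    "marketing": "/marketing",
    "support": "/support",
}


def select_next_page(response: str, menu: Dict[str, str]) -> Optional[str]:
    """Return the page for a recognised phrase, or None if it is not on the menu."""
    # Normalise case and whitespace so "  Support " matches "support".
    phrase = " ".join(response.lower().split())
    return menu.get(phrase)


if __name__ == "__main__":
    print(select_next_page("Support", MENU))
    print(select_next_page("billing", MENU))
```

A response outside the menu yields `None` here; how a real application handles an unmatched response is left open.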
In most applications, the main action used to drive a conversation with the user is get_input. This allows you to play a file or TTS prompt, then receive a transcription of the user's response passed to your next_page.

Google Speech-to-Text defines a number of models that have been trained from millions of examples of audio from specific sources, for example phone calls or videos. Recognition accuracy can be improved by using the specialized model that relates to the kind of audio data being analysed. For example, the phone_call model used on audio data recorded from a phone call will produce more accurate transcription results than the default, command_and_search, or video models.

Google have made enhanced models available for some languages, for specific sources. These models have been optimized to more accurately recognise audio data from these specific sources. See Speech Recognition Languages to see if the enhanced models are available for your language, and for the up-to-date list of supported languages.
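As a rough illustration of model selection, the sketch below builds a configuration dictionary using the field names of Google's Speech-to-Text v1 REST `RecognitionConfig` (`languageCode`, `model`, `useEnhanced`). Aculab Cloud sets up the recogniser on your behalf, and its own action parameters are not shown in this text and may be named differently. The audio-source labels and the `recognition_config` helper are assumptions made for the example.

```python
from typing import Any, Dict

# Model names mentioned in the text, keyed by a hypothetical audio-source label.
MODEL_BY_SOURCE: Dict[str, str] = {
    "phone": "phone_call",
    "video": "video",
    "voice_command": "command_and_search",
}


def recognition_config(language_code: str, source: str, enhanced: bool = False) -> Dict[str, Any]:
    """Build a Google Speech-to-Text v1 style config dict for the given audio source."""
    config: Dict[str, Any] = {
        "languageCode": language_code,
        # Fall back to the general-purpose model for sources we do not know.
        "model": MODEL_BY_SOURCE.get(source, "default"),
    }
    if enhanced:
        # Only meaningful where an enhanced model exists for the language.
        config["useEnhanced"] = True
    return config


if __name__ == "__main__":
    print(recognition_config("en-GB", "phone", enhanced=True))
```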
Languages: currently, our Speech Recognition supports 120 languages and language variants.

Our Speech Recognition is available for REST applications only, and requires REST API v2. It may be accessed using the get_input, play, run_speech_menu, start_transcription and stop_transcription actions.
Aculab Cloud uses Google Speech-to-Text, a multilingual natural language speech recogniser powered by machine learning. In practice, this means you tell it what language it will be hearing, then it will do its best to transcribe whatever you say to it. You can optionally provide hint information to adapt the recogniser to words or phrases which are more likely to be said.

In combination with Text To Speech (TTS), our Speech Recognition allows your application to present a natural, conversational interface to the user. This gives you much flexibility in how to drive the conversation, including the use of AI-driven chatbots.
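To make the language and hint settings concrete, here is a minimal sketch in the shape of Google's Speech-to-Text v1 REST `RecognitionConfig`, where hints are carried as `speechContexts` phrases. How Aculab Cloud exposes these settings in its actions is not described in this text, so treat the `config_with_hints` helper and its parameters as illustrative assumptions only.

```python
from typing import Any, Dict, List, Optional


def config_with_hints(language_code: str, hints: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build a Google Speech-to-Text v1 style config with optional phrase hints."""
    # The language the recogniser should expect to hear.
    config: Dict[str, Any] = {"languageCode": language_code}
    if hints:
        # Words or phrases that are more likely to be said.
        config["speechContexts"] = [{"phrases": list(hints)}]
    return config


if __name__ == "__main__":
    print(config_with_hints("en-GB", ["Sales", "Marketing", "Support"]))
```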