Speech to Text Conversion Using JavaScript

Speech to Text Conversion

Speech recognition lets us perform tasks using spoken words as input. It is gradually becoming part of our lives through voice assistants such as Alexa, Google Assistant, and Siri. Whether it is dictating words to a device to compose a document, running a web search by voice, or controlling a computer with speech, speech to text conversion is making our lives faster and more comfortable. It has the potential to replace traditional human-to-machine input devices such as keyboards. A future in which humans interact with machines using only speech and body movements is not far off.

The Web Speech API

The Web Speech API can perform two types of functions:

• Speech recognition (speech to text): this feature listens for words and phrases in the speech input and provides the identified words as output text.

• Speech synthesis (text to speech): this feature takes text and converts it into spoken audio.
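
As a rough sketch, a page could check which of these two features the current browser exposes before trying to use them. The helper name detectSpeechFeatures is made up for this example; the property names it looks up (SpeechRecognition, webkitSpeechRecognition, speechSynthesis) are the ones browsers use.

```javascript
// Hypothetical helper: reports which parts of the Web Speech API are present
// on a given global object (pass `window` when running in a browser).
function detectSpeechFeatures(scope) {
  return {
    // Speech to text: unprefixed name in the specification, prefixed in Chrome.
    recognition:
      "SpeechRecognition" in scope || "webkitSpeechRecognition" in scope,
    // Text to speech: exposed as a ready-made object on the global scope.
    synthesis: "speechSynthesis" in scope,
  };
}

// In a browser: console.log(detectSpeechFeatures(window));
```

Passing the global object in as a parameter, rather than reading window directly, keeps the helper usable outside a browser.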

A basic web application for speech to text conversion using JavaScript:

Like any other web app, ours needs the following files in its directory:

• The index.html file, which contains the HTML code for the web app
• The style.css file, which contains the CSS styles used in the web app
• The script.js file, which contains the JavaScript code of the web app

We also need a web server for running the web app.

Web Server for Chrome

Speech recognition can be implemented in the browser using the JavaScript Web Speech API. The Web Speech API enables the web app to accept speech as input through the device’s microphone and convert the speech into text by matching the words in the speech against the words in its vocabulary.

Along with the SpeechRecognition interface, the API defines a number of closely related interfaces for representing results, grammars, and so on. The recognized text can then be used as input by other APIs for performing tasks.

A speech to text demo app is used as the example here. The user just has to tap the start button on the screen and speak, and the webpage will display the spoken words as text.

Demo Web App

JS speech to text
index.html:

style.css:

script.js:

Line by Line Explanation of the JavaScript Code

At the time of writing, the Web Speech API is fully supported only by Chrome for desktop and Chrome for Android. In Chrome, the speech recognition interface is exposed on the window object as webkitSpeechRecognition.

Speech recognition

Here we create an instance of the speech recognition interface.
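
The original listing for this step is not reproduced here, so the following is only a sketch of what creating the instance might look like. The helper name createRecognition is an assumption; it falls back from the unprefixed constructor to Chrome's webkitSpeechRecognition and returns null when neither exists.

```javascript
// Hypothetical helper: returns a new recognition object, or null when the
// given global object exposes neither the standard nor the prefixed constructor.
function createRecognition(scope) {
  const Recognition = scope.SpeechRecognition || scope.webkitSpeechRecognition;
  return Recognition ? new Recognition() : null;
}

// In a browser: const recognition = createRecognition(window);
```

Returning null instead of throwing lets the page show a "not supported" message in browsers without the interface.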

This variable holds the text to display once the speech has been converted.

Setting the continuous property to true tells the interface to treat the speech as continuous: recognition keeps running and returns results as phrases are recognized, rather than stopping at the first pause.
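
A minimal sketch of this setting, assuming a recognition object has already been created. The helper name configureRecognition is invented for the example; continuous is a real property of the recognition interface. A separate property, interimResults, controls whether not-yet-final guesses are delivered and is left at its default here.

```javascript
// Hypothetical helper: applies the continuous setting to a recognition object.
function configureRecognition(recognition) {
  // Keep delivering results until recognition is stopped, instead of
  // ending after the first recognized phrase.
  recognition.continuous = true;
  return recognition;
}
```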

The onresult event carries all of the speech converted to text so far, but we only want to display the newest part. So the current result is extracted into the variable transcript and appended to the content to be displayed.
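
The extraction step can be sketched as follows. The function and variable names other than transcript are assumptions, and the event is treated as a plain object with the two fields the result event provides: results (the list of results so far) and resultIndex (the position of the result that changed).

```javascript
// Hypothetical helper: pulls the current transcript out of a result event.
function latestTranscript(event) {
  const current = event.resultIndex; // lowest index of the results that changed
  const transcript = event.results[current][0].transcript; // first alternative
  return transcript;
}

// Text accumulated so far; the page would show this in its text area.
let content = "";

// Appends the current transcript to the accumulated text and returns it.
function handleResult(event) {
  content += latestTranscript(event);
  return content;
}

// In a browser: recognition.onresult = handleResult;
```

Each result holds a list of alternatives; index 0 is used here, as it is the one a simple page would normally display.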

This starts listening for speech when the button is clicked.
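
One way the click wiring might look, as a sketch only. The helper name startOnClick and the element id in the usage comment are assumptions; start() is the method the recognition interface provides to begin listening.

```javascript
// Hypothetical helper: starts listening whenever the given button is clicked.
function startOnClick(button, recognition) {
  button.addEventListener("click", () => {
    recognition.start(); // begins capturing audio from the microphone
  });
}

// In a browser (the element id is an assumption):
// startOnClick(document.getElementById("start-btn"), recognition);
```

Calling start() on an object that is already listening raises an error, so a real page may want to disable the button while recognition is running.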

Conclusion

The speech recognition feature in its current form is free to use, well developed, and gives reasonably accurate results. For wider acceptance, it needs better adoption and support from more devices and browsers. A lot of open source development is happening in this field, with newer use cases being envisioned. The lack of standardization across speech recognition libraries, and the need for browsers to ask the user’s permission before listening to the microphone for privacy reasons, are also holding it back.

Speech recognition is developing rapidly. Voice assistants backed by machine learning are even being used to mimic human speech. There are also projects under way to create universal translators that take speech in any human language as input and translate it into another language of the user’s choice.


1 COMMENT

  1. Sir, after applying this code, if I say "full stop", shouldn't the output be "."? Why does it give the text "fullstop" instead?
