MIT: Automatic Speech Recognition

Researchers at MIT’s Microsystems Technology Laboratories have built a low-power chip specialized for automatic speech recognition. With power savings of 90 to 99 percent, it could make voice control practical for relatively simple electronic devices. Whereas a cellphone running speech-recognition software might require about 1 watt of power, the new chip requires between 0.2 and 10 milliwatts, depending on the number of words it has to recognize.

“Speech input will become a natural interface for many wearable applications and intelligent devices,” says Anantha Chandrakasan, the Vannevar Bush Professor of Electrical Engineering and Computer Science at MIT, whose group developed the new chip. “The miniaturization of these devices will require a different interface than touch or keyboard. It will be critical to embed the speech functionality locally to save system energy consumption compared to performing this operation in the cloud.”

Today, the best-performing speech recognizers are, like many other state-of-the-art artificial-intelligence systems, based on neural networks: virtual networks of simple information processors roughly modeled on the human brain. Much of the new chip’s circuitry is concerned with implementing speech-recognition networks as efficiently as possible.

But even the most power-efficient speech-recognition system would quickly drain a device’s battery if it ran without interruption. So the chip also includes a simpler “voice activity detection” circuit that monitors ambient noise to determine whether it might be speech. If the answer is yes, the chip fires up the larger, more complex speech-recognition circuit.
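
The gating idea can be illustrated in software. The sketch below is not the chip’s design: the article does not say how its voice activity detector decides, so a plain energy threshold stands in for it here. The names `frame_energy`, `gated_recognition`, and `energy_threshold`, and the default threshold value, are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one audio frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return sum(sample * sample for sample in frame) / len(frame)


def gated_recognition(
    frames: Iterable[Sequence[float]],
    recognize: Callable[[Sequence[float]], str],
    energy_threshold: float = 0.01,  # assumed value, not taken from the article
) -> List[str]:
    """Run the expensive recognizer only on frames the cheap detector flags.

    The energy check plays the role of the small always-on detector;
    `recognize` plays the role of the larger circuit that is woken on demand.
    """
    results: List[str] = []
    for frame in frames:
        # Cheap check first: quiet frames never reach the recognizer.
        if frame_energy(frame) >= energy_threshold:
            results.append(recognize(frame))
    return results
```

Given a stream of mostly silent frames, `recognize` is called only for the few frames whose energy crosses the threshold, which is the software analogue of keeping the large circuit powered down most of the time.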

“For the next generation of mobile and wearable devices, it is crucial to enable speech recognition at ultralow power consumption,” says Marian Verhelst, a professor of microelectronics at the Catholic University of Leuven in Belgium. “This is because there is a clear trend toward smaller-form-factor devices, such as watches, earbuds, or glasses, requiring a user interface which can no longer rely on touch screen. Speech offers a very natural way to interface with such devices.”

Author: Tim Cole
Image Credit: Pixabay
