Advances in brain-computer interfaces allow paralyzed patients to speak almost in real time.


In an exciting advance in the field of brain-computer interfaces (BCIs), scientists from UC Berkeley and UC San Francisco have made significant progress toward enabling people with severe paralysis to communicate through a system that converts brain signals into speech almost in real time. This development addresses one of the biggest challenges in speech neuroprosthetics: the delay between thought and verbal expression.

The technology, described in Nature Neuroscience, uses artificial intelligence to decode brain signals and produce words almost instantaneously, giving users the fluency to communicate continuously, without significant pauses. The study was funded by the National Institute on Deafness and Other Communication Disorders (NIDCD) of the National Institutes of Health (NIH).

A system that transforms thoughts into words

“Our streaming system uses algorithms similar to those employed by devices like Alexa or Siri to decode brain signals and produce speech almost as quickly as it is thought,” commented Gopala Anumanchipalli, co-principal investigator and assistant professor at UC Berkeley. This is the first system to synthesize fluent, continuous speech directly from neural data.

The system is also versatile: it works with different recording devices, from non-invasive sensors on the skin that measure facial muscle activity to electrodes placed directly in the brain. Kaylo Littlejohn, a doctoral student and co-author of the study, noted that the algorithm can adapt to various brain-monitoring configurations as long as it has access to reliable signals.
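As a purely illustrative sketch of that idea, the Python snippet below defines a common interface that different recording devices could implement, so the same decoding loop runs no matter where the feature frames come from. The class names, channel counts, and random placeholder data are assumptions made for the example, not details from the study.

```python
from abc import ABC, abstractmethod
from typing import Iterator

import numpy as np


class SignalSource(ABC):
    """Any recording device that yields fixed-size feature frames can feed the decoder."""

    n_channels: int

    @abstractmethod
    def frames(self) -> Iterator[np.ndarray]:
        """Yield feature vectors of shape (n_channels,) at a fixed frame rate."""


class SurfaceSensorSource(SignalSource):
    """Non-invasive sensors on the skin measuring facial muscle activity (assumed setup)."""

    n_channels = 32

    def frames(self) -> Iterator[np.ndarray]:
        while True:  # placeholder: random values instead of real sensor readings
            yield np.random.randn(self.n_channels)


class BrainElectrodeSource(SignalSource):
    """Electrodes recording directly from the brain (assumed setup)."""

    n_channels = 253

    def frames(self) -> Iterator[np.ndarray]:
        while True:
            yield np.random.randn(self.n_channels)


def decode(source: SignalSource, n_frames: int = 5) -> None:
    """The same decoding loop runs regardless of which device produced the frames."""
    for i, frame in enumerate(source.frames()):
        if i >= n_frames:
            break
        # A real system would pass `frame` into a trained neural decoder here.
        print(f"frame {i}: {frame.shape[0]} channels")


decode(SurfaceSensorSource())
decode(BrainElectrodeSource())
```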

The neuroprosthesis converts neural activity from the motor cortex, the brain region that controls speech, into words once the person has formed the thought and is ready to move the vocal muscles. To train the system, a participant attempted to speak silently while researchers recorded their brain activity. AI models then filled in the missing details, such as the sound patterns, to generate the speech.
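Conceptually, the decoder learns a mapping from sequences of neural feature frames to the speech sounds the participant was attempting to produce. The minimal sketch below illustrates that kind of mapping with a small recurrent network and random data; the layer sizes, feature dimensions, and discrete speech-unit vocabulary are assumptions for illustration, not the architecture reported in the paper.

```python
import torch
import torch.nn as nn

# Assumed dimensions, for illustration only.
N_CHANNELS = 253        # feature channels recorded per frame
HIDDEN = 256            # recurrent hidden size
N_SPEECH_UNITS = 100    # discrete speech-sound units a vocoder could turn into audio


class NeuralToSpeechDecoder(nn.Module):
    """Maps a sequence of neural feature frames to a sequence of speech-unit logits."""

    def __init__(self) -> None:
        super().__init__()
        self.rnn = nn.GRU(N_CHANNELS, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, N_SPEECH_UNITS)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels) -> logits: (batch, time, speech units)
        hidden_states, _ = self.rnn(frames)
        return self.head(hidden_states)


decoder = NeuralToSpeechDecoder()

# Simulated recording of one silently attempted sentence: 200 frames of neural features.
attempted_sentence = torch.randn(1, 200, N_CHANNELS)
unit_logits = decoder(attempted_sentence)
predicted_units = unit_logits.argmax(dim=-1)  # the "missing sound details" the model fills in
print(predicted_units.shape)  # torch.Size([1, 200]); a vocoder would render these as audio
```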

A notable aspect of this advance is that the team used recordings of the participant’s voice from before the injury as a reference, so the output sounds familiar and personal. Earlier studies reported a delay of about eight seconds to decode a complete phrase; the new method produces audible speech in less than a second. This faster response is accompanied by high accuracy, demonstrating that real-time streaming is possible without sacrificing quality.
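That latency gain comes from decoding the signal in small chunks as it arrives rather than waiting for the whole phrase, much like the streaming recognition used by voice assistants. The sketch below shows such a loop, carrying recurrent state across chunks; the frame rate, chunk size, and placeholder data are assumptions for illustration, not the study's actual model or timings.

```python
import torch
import torch.nn as nn

# Assumed numbers, for illustration only.
N_CHANNELS, HIDDEN, N_SPEECH_UNITS = 253, 256, 100
FRAME_MS = 10       # one neural feature frame every 10 ms
CHUNK_FRAMES = 8    # decode every 80 ms of signal instead of waiting for a full phrase

rnn = nn.GRU(N_CHANNELS, HIDDEN, batch_first=True)
head = nn.Linear(HIDDEN, N_SPEECH_UNITS)

hidden_state = None
for chunk_idx in range(10):  # simulate ~0.8 s of incoming neural data
    chunk = torch.randn(1, CHUNK_FRAMES, N_CHANNELS)   # placeholder for live features
    output, hidden_state = rnn(chunk, hidden_state)    # carry state across chunks
    speech_units = head(output).argmax(dim=-1)
    # Each chunk's audio can be rendered as soon as the chunk is decoded,
    # so the output trails the incoming signal by roughly one chunk (~80 ms here)
    # instead of by the length of the whole phrase.
    print(f"chunk {chunk_idx}: emitted {speech_units.shape[1]} speech units "
          f"(~{CHUNK_FRAMES * FRAME_MS} ms of signal)")
```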

To assess flexibility, the researchers had the system synthesize rare words that were not part of its training set, such as those from the NATO phonetic alphabet (“Alpha,” “Bravo,” and so on). The technology handled them effectively, indicating its potential for broader vocabularies.

Edward Chang, senior researcher and neurosurgeon at UCSF, emphasized the real-world applications. “This innovation brings us closer to practical brain-computer interfaces that can significantly improve communication for those with severe speech disabilities,” he stated.

Future research will focus on making the synthesized speech more expressive, reflecting changes in tone, volume, and emotion so that the output sounds more natural. With further refinements, this technology could transform communication options for people who cannot speak.
