DolphinGemma, an innovative language model developed by Google, is revolutionizing the way scientists approach the study of dolphin communication. For decades, understanding the clicks and whistles of these intelligent mammals has been a long-standing scientific challenge. Can you imagine being able not only to hear dolphins but also to understand their conversations?
On National Dolphin Day, Google, in collaboration with researchers from Georgia Tech and the Wild Dolphin Project (WDP), has taken a significant step in the quest for interspecies communication. DolphinGemma is an AI model trained to learn the structure of dolphin vocalizations, generating sound sequences that mimic their communication styles.
Investigating dolphin society for decades
To understand any species, a deep context is needed, and the WDP has provided that since 1985. This underwater research project has been studying Atlantic spotted dolphins in the Bahamas, creating an extensive dataset that includes videos and audio of their behavior.
The WDP’s approach is to observe and analyze the natural communication of dolphins. By being in their environment, researchers can link specific sounds with observed behaviors, looking for patterns that might indicate a language. This long-term analysis is the foundation of their research and provides the necessary context for any analysis performed by the AI.
Introducing DolphinGemma
Analyzing the complex communication of dolphins is a monumental challenge, and the vast labeled dataset from the WDP offers a unique opportunity to utilize cutting-edge AI. DolphinGemma uses Google audio technologies to efficiently represent dolphin sounds, processing this data through a model designed for complex sequences. With approximately 400 million parameters, this model is optimized to run on the Pixel phones used in the field.
This model is based on the insights of Gemma, a collection of state-of-the-art open models from Google. Extensively trained with the WDP’s acoustic database, DolphinGemma functions as an audio input-output model, processing sequences of natural dolphin sounds to identify patterns and predict the sounds that are likely to follow in a sequence.
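The description above casts DolphinGemma as an autoregressive sequence model: vocalizations are represented as sequences of discrete audio tokens, and the model learns to predict which token is likely to come next. The toy sketch below illustrates that prediction principle with a simple bigram counter; the integer "audio tokens," the training sequences, and the model itself are illustrative assumptions, not DolphinGemma's actual tokenizer or architecture.

```python
# Toy sketch of next-sound prediction over discrete audio tokens.
# Each integer stands for a hypothetical audio token; real systems use a
# learned tokenizer and a far larger neural sequence model.
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count token-to-token transitions across training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    """Return the most frequently observed successor of `token`, or None."""
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

# Made-up "vocalization" sequences for illustration only.
training = [[1, 2, 3, 2, 3], [1, 2, 3], [4, 2, 3]]
model = train_bigram(training)
print(predict_next(model, 2))  # prints 3: the most common token after 2
```

A model like DolphinGemma replaces these raw counts with learned parameters, but the input-output contract is the same: a sequence of sounds in, a prediction of the likely next sound out.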
Using Pixel phones to analyze dolphin sounds
In addition to analyzing natural communication, the WDP is exploring a bidirectional interaction using technology in the ocean. This has led to the development of the CHAT system (Cetacean Hearing Augmentation Telemetry), which seeks to establish a simpler shared vocabulary between researchers and dolphins.
The CHAT system associates synthetic whistles with objects that dolphins enjoy. Through this technique, researchers hope that dolphins will learn to imitate these whistles to request objects. As more of the dolphins' natural sounds are understood, these too will be integrated into the system.
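At its core, the shared vocabulary described above is a mapping from recognizable whistles to objects: when a detected whistle matches a stored synthetic whistle closely enough, the system knows which object is being requested. The sketch below illustrates that matching step with nearest-template lookup; the feature vectors, object names, and distance rule are made-up assumptions, not CHAT's real signal processing.

```python
import math

# Illustrative shared vocabulary: each synthetic whistle is stored as a
# feature template tied to an object. The templates below are invented
# for illustration only.
TEMPLATES = {
    "scarf":     [0.9, 0.1, 0.3],
    "sargassum": [0.2, 0.8, 0.5],
    "rope":      [0.4, 0.4, 0.9],
}

def match_object(features):
    """Return the object whose whistle template is nearest (Euclidean)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(TEMPLATES, key=lambda obj: dist(TEMPLATES[obj], features))

print(match_object([0.25, 0.75, 0.5]))  # closest to the "sargassum" template
```

A production system would of course work from real acoustic features and a trained classifier, but the lookup from matched whistle to requested object is the essence of the shared-vocabulary idea.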
Sharing DolphinGemma with the research community
Recognizing the value of collaboration in scientific discovery, there are plans to share DolphinGemma as an open model this summer. Although it has been trained on the sounds of Atlantic spotted dolphins, researchers studying other cetacean species are expected to find it useful as well. This openness will allow the model to be adapted and improved for different vocalizations.
The combination of field research from the WDP, engineering expertise from Georgia Tech, and the power of Google technology is opening new and exciting possibilities. We are on the cusp of not only hearing dolphins but beginning to understand the patterns within their sounds, bringing us a little closer to communication between humans and dolphins.