Even behaviorists who are particularly good at interpreting dog vocalizations aren’t able to understand all the nuances — but that could change thanks to machine learning. A team of researchers is currently developing artificial intelligence-based tools that can help us better understand what man’s best friend is trying to tell us.
This is not the first time that researchers have explored this concept, but so far they have all run into the same problem: a lack of data. Language processing models must be trained on real-world examples; no algorithm can determine the meaning of a sequence of sounds from scratch, so it must be given references.
However, while this type of resource is very easy to obtain for humans, it is much more complicated for animals. “Logistically, animal vocalizations are much more difficult to solicit and record,” explains Artem Abzaliev, lead author of the study.
A model designed for people
To overcome this obstacle, a team from the University of Michigan took a rather original approach: reusing a model originally developed for human speech. “By using speech processing models initially trained on human speech, we opened a window into the nuances of dog barks,” explains co-author Rada Mihalcea.
Thanks to this approach, the team was able to build its project on an already solid foundation, as these systems have become remarkably sophisticated in recent years. There are already plenty of models capable of distinguishing nuances of timbre, intonation, or accent. Some can even recognize the emotions (frustration, gratitude, disgust, and so on) that come through in an audio recording. “These models are able to learn and encode the incredibly complex patterns of human language, and we wanted to see if we could leverage these abilities to interpret dog barks,” explains Abzaliev.
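To give a concrete idea of what such off-the-shelf systems look like, here is a minimal, illustrative sketch using the Hugging Face Transformers pipeline with a publicly available speech-emotion checkpoint; the checkpoint and the audio file are examples chosen for illustration, not resources used in the study:

```python
# Illustrative sketch: trying an off-the-shelf speech-emotion model of the
# kind the article alludes to. "superb/wav2vec2-base-superb-er" is a public
# emotion-recognition checkpoint, not one used by the Michigan team.
from transformers import pipeline

classifier = pipeline(
    "audio-classification",
    model="superb/wav2vec2-base-superb-er",
)

# "recording.wav" is a hypothetical audio file on disk.
predictions = classifier("recording.wav")
print(predictions)  # e.g. [{"label": "hap", "score": 0.74}, ...]
```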
His team therefore started with Wav2Vec2, a model designed for human speech, and presented it with a dataset of recordings from 74 dogs. The recordings came from animals of different breeds, ages, and sexes, and were collected in many different contexts (play, reaction to a disturbing element such as a small animal, defensive behavior, social interactions, and so on). Using this data, Abzaliev was able to adjust the strength of the connections between the network’s artificial neurons (the weights) as well as the biases that modulate them.
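The paper does not include code, but a fine-tuning step of this kind can be sketched with the Hugging Face Transformers library. The label set, checkpoint, audio, and hyperparameters below are hypothetical stand-ins, assuming a standard sequence-classification head on top of Wav2Vec2:

```python
# A minimal sketch, not the authors' actual training code. Labels, data,
# and hyperparameters are hypothetical stand-ins.
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2ForSequenceClassification

labels = ["play", "alert", "defense", "social"]  # hypothetical bark contexts
model_name = "facebook/wav2vec2-base"

feature_extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2ForSequenceClassification.from_pretrained(
    model_name, num_labels=len(labels)
)

# Wav2Vec2 expects raw mono audio sampled at 16 kHz; this random tensor
# stands in for a two-second bark recording.
waveform = torch.randn(16000 * 2)
inputs = feature_extractor(waveform.numpy(), sampling_rate=16000, return_tensors="pt")

# One gradient step: fine-tuning nudges the weights and biases the article
# mentions toward the labeled context of each recording.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
label = torch.tensor([labels.index("play")])  # this clip is labeled "play"
loss = model(**inputs, labels=label).loss
loss.backward()
optimizer.step()
```

In practice such training runs over many labeled clips for several epochs; the single step above only shows the mechanics of updating the pretrained network against a bark label.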
Encouraging results
At the end of the process, the team was able to generate representations of the acoustic data collected from the dogs and interpret them. Analyzing the results, they found that the model had classified the recordings into the correct category (play, anxiety, attention seeking, pain, frustration, etc.) in 70% of cases.
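As a rough illustration of what “generating representations” means in practice, the base Wav2Vec2 encoder can turn a recording into a fixed-size vector that is then classified or compared across contexts; this continues the hypothetical setup of the earlier sketches, not the study’s actual pipeline:

```python
# Hedged sketch of the representation step: extracting Wav2Vec2 hidden
# states as an embedding for one recording. Illustrative only.
import torch
from transformers import Wav2Vec2Model

encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base")
encoder.eval()

waveform = torch.randn(1, 16000)  # stand-in for one second of bark audio
with torch.no_grad():
    hidden = encoder(waveform).last_hidden_state  # (1, frames, 768)

# Mean-pooling the frames gives one fixed-size vector per recording.
embedding = hidden.mean(dim=1)
print(embedding.shape)  # torch.Size([1, 768])
```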
The result is still fairly approximate, but much better than what models trained solely on animal recordings can achieve. “It is the first time that techniques optimized for human speech have helped decode animal communication,” rejoices Mihalcea.
A resource for researchers?
Beyond the raw numbers, these results have a very interesting implication: they show that the sounds and patterns inherent in human language can serve as a basis for analyzing the vocalizations of dogs, and perhaps even of other species. Once this system matures, it could therefore become a very useful tool, especially for ethologists.
These researchers, who specialize in the study of animal behavior, often rely on vocalizations to study interactions within groups, behavioral traits, and even the cognitive abilities of their species of interest. A tool like this could therefore help them identify nuances that might otherwise go unnoticed, or simply save them time. For example, imagine a team of specialists studying primates in a difficult environment, such as a jungle. Instead of spending hours sifting through audio recordings to classify vocalizations and match them to specific behaviors, they could hand this task over to an AI model and identify interesting relationships and trends much faster.
The researchers do not address this theme at all in their paper, but by extrapolation we can also imagine that a generative AI system could one day synthesize sounds specifically calibrated to convey a precise message to an animal. For now, this is still pure science fiction. But perhaps one day artificial intelligence will finally allow us to “chat” with our faithful companions, to understand whale songs, or to learn more about why killer whales attack boats, for example.
The text of the study is available here.