The code breakers: harnessing the power of AI to understand what animals say
An international group of experts argue that tackling the long-standing challenge of decoding the communication systems of whales, crows, bats, and other animals is coming within reach, following breath-taking advances in artificial intelligence (AI) research.
In an article published in Science today (Friday 14 July), led by Professor Christian Rutz from the School of Biology at the University of St Andrews, the authors explain how cutting-edge machine-learning tools could provide transformative insights into the hidden lives of animals, with important implications for their conservation.
The prospect of understanding what animals say to each other, or of even initiating a conversation with another species, has fired humans’ imagination for millennia. But since there is no Rosetta Stone for translating animals’ communication signals, their meaning must be deciphered through careful observation and experimentation. Despite good research progress over the past few decades, collecting and analysing data is a challenging task. For example, annotating recordings of bird calls, whale songs or primate gestures is time-consuming, and even experienced biologists often struggle to differentiate seemingly similar signal types.
Rutz, an expert on animal behaviour and the use of miniature wildlife tracking devices, said: “The advent of machine learning has created exciting opportunities to make progress with the grand research challenge of understanding other animals. But there are significant risks that must be tackled head-on.”
Machine-learning algorithms effectively function as powerful pattern detectors and content generators. As such, they have revolutionised applications relying on the processing of both written and spoken human language, as illustrated by interactive chatbots. It is these tools researchers are now leveraging to identify and classify animals’ signals from audio and video recordings, and to conduct experiments that illuminate signal function (e.g., by playing back specific calls and observing an animal’s response).
The catch is that machine-learning methods require vast amounts of data. For example, the popular GPT-3 language model was trained using hundreds of billions of ‘tokens’, which roughly correspond to words. “That’s the equivalent of over two million books the length of Charles Darwin’s On the Origin of Species,” explains co-author Dr Damián Blasi, who is a language scientist at Harvard University. “We need creative solutions for collecting data for wild animals.”
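As a rough, back-of-the-envelope illustration of the book comparison quoted above, the short Python sketch below converts a token count into book equivalents. All three inputs are assumptions made for illustration rather than figures from the article: roughly 300 billion training tokens (one reading of "hundreds of billions"), about 150,000 words for On the Origin of Species, and one token treated as one word.

```python
# Back-of-the-envelope check; every constant below is an assumption
# chosen for illustration, not a figure taken from the article.
TRAINING_TOKENS = 300e9    # assumed: about 300 billion training tokens
WORDS_PER_BOOK = 150_000   # assumed: approximate length of On the Origin of Species
WORDS_PER_TOKEN = 1.0      # assumed: one token counted as one word (a simplification)


def book_equivalents(tokens, words_per_book, words_per_token=1.0):
    """Return how many books of the given length a token count corresponds to."""
    return tokens * words_per_token / words_per_book


books = book_equivalents(TRAINING_TOKENS, WORDS_PER_BOOK, WORDS_PER_TOKEN)
print(f"{books:,.0f} books")  # about two million under these assumptions
```

Under these assumptions the estimate comes out at about two million books, the same order of magnitude as the quoted figure. Changing any of the three inputs shifts the result proportionally.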
Major efforts are currently underway to gather suitable datasets for at least some species. Project CETI (Cetacean Translation Initiative), for example, studies the communicative behaviour of sperm whales. The project’s AI Lead, co-author Professor Michael Bronstein, who is the DeepMind Professor of AI at the University of Oxford, explains: “We use gentle, bioinspired whale-mounted tags, underwater robots, and a wide range of other methods to map the full richness of these animals’ communicative behaviour.”
As the authors argue in their article, understanding the communication context is key for making progress. “If we want to decode animal conversations, we need to know who talks to whom, and under what environmental and social conditions,” says co-author Professor Sonja Vernes, an expert on the vocal communication of bats, who holds joint affiliations at the University of St Andrews and the Max Planck Institute for Psycholinguistics. “Machine learning can help us to discover which signals animals are using and perhaps even what the signals mean, if we combine these approaches with well-designed experiments.”
Co-authors Aza Raskin and Katherine Zacarian, who are co-founders of the Earth Species Project (ESP), which studies the communication systems of a wide range of animal species, are particularly excited about the longer-term benefits of this research. “As we expand our understanding of other species’ communicative behaviour, we can use this knowledge to improve animal welfare in captive settings and to design more effective conservation strategies,” notes Zacarian. “Ultimately, we hope to initiate a cultural shift driving greater respect for the many species with which we share planet Earth.”
ESP is collaborating with Rutz and his colleagues on a study investigating the vocal repertoire of the critically endangered Hawaiian crow. Machine learning enables detailed comparisons between the vocalisations of the last surviving individuals, which are all held in conservation breeding centres run by San Diego Zoo Wildlife Alliance, and historical baseline recordings. “Lost calls could potentially be reintroduced,” according to Raskin. “Cultural restoration is a profoundly beautiful example of the benefits of this research.”
In the future, it may even be possible to ‘listen in’ on the well-being of entire animal communities. “If we can identify communication signals that are associated with distress or avoidance, passive acoustic monitoring systems could be used to eavesdrop on how ‘happy’ or ‘unhappy’ animals are at the landscape level,” says Rutz. This would provide a powerful rapid assessment tool for ongoing biodiversity surveys and conservation work.
But the authors agree that major challenges lie ahead, including serious ethical questions – such as under what circumstances initiating conversations with wild animals may be acceptable. “This research promises far-reaching conservation and welfare benefits, but we must urgently come together to discuss its potential risks,” Rutz cautions.
Photos of animals whose communication systems the authors are investigating, or that could be studied productively in the future (including crows, bats, whales, primates, and other species), can be found online.
The Hawaiian crow is one of only two crow species known to use tools for extractive foraging. Machine-learning-assisted analyses of its vocalisations will provide a springboard for field research on the famous New Caledonian crow (see photos), to investigate whether this species’ unusually complex tool behaviour is associated with particularly sophisticated forms of communication. Bats (see photos) provide a valuable system for conducting playback experiments under controlled conditions in the laboratory, to test hypotheses about signal function.
The article was co-authored by representatives of two major collaborative initiatives:
- The Earth Species Project (ESP) (Aza Raskin, Katherine Zacarian) is a non-profit research laboratory and impact organisation using advanced machine learning to decode the communication systems of a wide range of animal species. The initiative works with a large number of partners, and has recently shared a detailed technical roadmap in a blog post, outlining its ongoing and proposed scientific work: https://www.earthspecies.org/blog/esp-technical-roadmap
- Project CETI (Cetacean Translation Initiative) (Professor Michael Bronstein) is a non-profit organisation applying advanced machine learning and state-of-the-art robotics to listen to and translate the communication of sperm whales (see photos). The initiative published a detailed scientific roadmap last year, spanning the fields of machine learning and natural language processing, marine biology, cryptography, linguistics, and robotics: https://doi.org/10.1016/j.isci.2022.104393