Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics

Read original: arXiv:2404.16436 - Published 5/8/2024 by Ben Williams, Bart van Merrienboer, Vincent Dumoulin, Jenny Hamer, Eleni Triantafillou, Abram B. Fleishman, Matthew McKown, Jill E. Munger, Aaron N. Rice, Ashlee Lillis and 5 others

Overview

  • Machine learning can revolutionize passive acoustic monitoring (PAM) for ecological assessments
  • High annotation and compute costs limit the field's effectiveness
  • Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, which currently limits it mainly to bird taxa
  • This research explores an optimum pretraining strategy for a data-deficient domain: coral reef bioacoustics

Plain English Explanation

Passive acoustic monitoring (PAM) involves using microphones (or hydrophones underwater) to record and analyze the sounds in an environment, like a forest or ocean. This can provide valuable insights into the local animal species and their behaviors. Machine learning models can automate the analysis of these audio recordings, but building the required annotated data sets is costly and time-consuming.

To address this, researchers have been developing generalizable pretrained models that can be quickly adapted to new audio domains. However, these models typically rely on large annotated data sets, which are mostly available for bird sounds.

This study explores an alternative approach for a data-deficient domain: coral reef bioacoustics. The researchers assembled a modest annotated library of reef sounds, called ReefSet, and tested different pretraining strategies to see which one works best for analyzing these reef recordings.

Their key finding is that mixing audio from birds, reefs, and unrelated sources during pretraining maximizes the model's ability to generalize to the reef domain. This "cross-domain mixing" approach provides a strong foundation for automating the analysis of marine PAM data with minimal annotation and compute requirements.

Technical Explanation

The researchers first assembled ReefSet, an annotated library of coral reef sounds. It is large for the reef domain but modest next to bird audio libraries, at about 2% of their sample count.

They then tested few-shot transfer learning performance under different pretraining strategies:

  1. Pretraining on bird audio alone
  2. Pretraining on ReefSet alone
  3. Pretraining on unrelated audio alone
  4. Pretraining on a mix of bird, reef, and unrelated audio

Pretraining on bird audio generalized to the reef domain notably better than pretraining on ReefSet or on unrelated audio alone. The key finding, however, was that the cross-domain mixing strategy, which draws on all three audio sources, maximized reef generalizability.
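To make the idea of cross-domain mixing concrete, here is a minimal sketch of a batch sampler that draws training examples from bird, reef, and unrelated pools. It is an illustration only, not the researchers' pipeline: the function name `mixed_batches`, the toy clip IDs and labels, and the equal sampling weights are all assumptions, since the summary does not state a mixing ratio.

```python
import random

def mixed_batches(sources, weights, batch_size, num_batches, seed=0):
    """Yield batches whose examples are drawn from several labelled pools.

    sources: dict mapping a domain name to a list of (clip_id, label) pairs.
    weights: dict mapping the same names to relative sampling weights
             (hypothetical values; the summary gives no mixing ratio).
    """
    rng = random.Random(seed)
    names = list(sources)
    probs = [weights[name] for name in names]
    for _ in range(num_batches):
        batch = []
        # Pick a domain for each slot, then a random example from that domain.
        for domain in rng.choices(names, weights=probs, k=batch_size):
            clip_id, label = rng.choice(sources[domain])
            batch.append((domain, clip_id, label))
        yield batch

# Toy stand-ins for the three pools named in the text (all names invented).
sources = {
    "bird": [(f"bird_{i}", "bird_call") for i in range(50)],
    "reef": [(f"reef_{i}", "fish_grunt") for i in range(5)],
    "unrelated": [(f"misc_{i}", "other") for i in range(20)],
}
weights = {"bird": 1.0, "reef": 1.0, "unrelated": 1.0}

batches = list(mixed_batches(sources, weights, batch_size=32, num_batches=10))
```

Sampling the domain first and the example second means a small pool such as the reef set is not swamped by the much larger bird pool. Whether the original work balances its sources this way is not stated in the summary.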

The researchers call their pretrained model "SurfPerch" and propose it as a strong foundation for automated analysis of marine PAM data, requiring minimal annotation and compute costs compared to previous approaches.
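The few-shot transfer setting described above can be sketched as follows: a frozen pretrained network turns each clip into an embedding, and a very small classifier is fitted on a handful of labelled examples per class. The sketch below uses a nearest-centroid classifier as a simple stand-in; the summary does not say which classifier the researchers used. The `embed` function is a hypothetical placeholder for a pretrained model, and the class names and synthetic data are invented for illustration.

```python
import math
import random

def embed(clip):
    """Hypothetical stand-in for a frozen pretrained network.

    Here a 'clip' is already a list of floats, so this only L2-normalises it.
    A real embedding model would map raw audio to a fixed-length vector.
    """
    norm = math.sqrt(sum(x * x for x in clip)) or 1.0
    return [x / norm for x in clip]

def fit_centroids(support):
    """Average the embeddings of the few labelled examples in each class."""
    sums, counts = {}, {}
    for clip, label in support:
        vec = embed(clip)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}

def predict(centroids, clip):
    """Return the class whose centroid is closest in Euclidean distance."""
    vec = embed(clip)
    return min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))

# Synthetic two-class data: four labelled 'shots' per class, twenty queries each.
rng = random.Random(0)

def make_clip(center):
    return [c + rng.gauss(0.0, 0.1) for c in center]

classes = {"fish_grunt": [1.0, 0.0, 0.0], "boat_noise": [0.0, 1.0, 0.0]}
support = [(make_clip(c), lab) for lab, c in classes.items() for _ in range(4)]
query = [(make_clip(c), lab) for lab, c in classes.items() for _ in range(20)]

centroids = fit_centroids(support)
accuracy = sum(predict(centroids, clip) == lab for clip, lab in query) / len(query)
```

Because the embedding network stays frozen, only the per-class centroids are computed from the new labels, which is why this kind of transfer needs little annotation and compute.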

Critical Analysis

The researchers acknowledge that the ReefSet library is modest in size compared to the large annotated bird audio data sets that have driven much of the progress in this field. This limited data availability is a common challenge in many real-world domains, and the researchers' work represents an important step in overcoming it.

While the cross-domain mixing approach demonstrated promising results, it is possible that even more sophisticated techniques, such as continual learning or multi-task learning, could further improve the model's ability to generalize. Additionally, the researchers did not explore the model's performance on non-reef marine environments, which may require additional adaptation.

Overall, this research provides a valuable contribution to the field of passive acoustic monitoring by demonstrating an effective approach for building generalizable models in data-scarce domains. The SurfPerch model and the insights from this study have the potential to significantly reduce the barriers to deploying automated acoustic analysis in a wide range of ecological applications.

Conclusion

This research presents a novel approach to overcoming the high annotation and compute costs that have historically limited the effectiveness of passive acoustic monitoring for ecological assessments. By leveraging cross-domain mixing during pretraining, the researchers developed a generalizable model called SurfPerch that can be applied to data-deficient domains, such as coral reef bioacoustics, with minimal additional effort.

The findings of this study demonstrate the power of machine learning to revolutionize PAM and unlock new possibilities for large-scale, automated monitoring of ecosystems. As the researchers continue to refine and expand their techniques, their work has the potential to have a significant impact on our understanding and conservation of diverse environments around the world.



Related Papers

Leveraging tropical reef, bird and unrelated sounds for superior transfer learning in marine bioacoustics

Ben Williams, Bart van Merrienboer, Vincent Dumoulin, Jenny Hamer, Eleni Triantafillou, Abram B. Fleishman, Matthew McKown, Jill E. Munger, Aaron N. Rice, Ashlee Lillis, Clemency E. White, Catherine A. D. Hobbs, Tries B. Razak, Kate E. Jones, Tom Denton

Machine learning has the potential to revolutionize passive acoustic monitoring (PAM) for ecological assessments. However, high annotation and compute costs limit the field's efficacy. Generalizable pretrained networks can overcome these costs, but high-quality pretraining requires vast annotated libraries, limiting its current applicability primarily to bird taxa. Here, we identify the optimum pretraining strategy for a data-deficient domain using coral reef bioacoustics. We assemble ReefSet, a large annotated library of reef sounds, though modest compared to bird libraries at 2% of the sample count. Through testing few-shot transfer learning performance, we observe that pretraining on bird audio provides notably superior generalizability compared to pretraining on ReefSet or unrelated audio alone. However, our key findings show that cross-domain mixing which leverages bird, reef and unrelated audio during pretraining maximizes reef generalizability. SurfPerch, our pretrained network, provides a strong foundation for automated analysis of marine PAM data with minimal annotation and compute costs.

5/8/2024

Domain-Invariant Representation Learning of Bird Sounds

Ilyass Moummad, Romain Serizel, Emmanouil Benetos, Nicolas Farrugia

Passive acoustic monitoring (PAM) is crucial for bioacoustic research, enabling non-invasive species tracking and biodiversity monitoring. Citizen science platforms like Xeno-Canto provide large annotated datasets from focal recordings, where the target species is intentionally recorded. However, PAM requires monitoring in passive soundscapes, creating a domain shift between focal and passive recordings, which challenges deep learning models trained on focal recordings. To address this, we leverage supervised contrastive learning to improve domain generalization in bird sound classification, enforcing domain invariance across same-class examples from different domains. We also propose ProtoCLR (Prototypical Contrastive Learning of Representations), which reduces the computational complexity of the SupCon loss by comparing examples to class prototypes instead of pairwise comparisons. Additionally, we present a new few-shot classification benchmark based on BirdSet, a large-scale bird sound dataset, and demonstrate the effectiveness of our approach in achieving strong transfer performance.

9/17/2024

Towards Deep Active Learning in Avian Bioacoustics

Lukas Rauch, Denis Huseljic, Moritz Wirth, Jens Decke, Bernhard Sick, Christoph Scholz

Passive acoustic monitoring (PAM) in avian bioacoustics enables cost-effective and extensive data collection with minimal disruption to natural habitats. Despite advancements in computational avian bioacoustics, deep learning models continue to encounter challenges in adapting to diverse environments in practical PAM scenarios. This is primarily due to the scarcity of annotations, which requires labor-intensive efforts from human experts. Active learning (AL) reduces annotation cost and speeds up adaptation to diverse scenarios by querying the most informative instances for labeling. This paper outlines a deep AL approach, introduces key challenges, and conducts a small-scale pilot study.

6/28/2024

Automated Bioacoustic Monitoring for South African Bird Species on Unlabeled Data

Michael Doell, Dominik Kuehn, Vanessa Suessle, Matthew J. Burnett, Colleen T. Downs, Andreas Weinmann, Elke Hergenroether

Analysis for biodiversity monitoring based on passive acoustic monitoring (PAM) recordings is time-consuming and challenged by the presence of background noise in recordings. Existing models for sound event detection (SED) worked only on certain avian species and the development of further models required labeled data. The developed framework automatically extracted labeled data from available platforms for selected avian species. The labeled data were embedded into recordings, including environmental sounds and noise, and were used to train convolutional recurrent neural network (CRNN) models. The models were evaluated on unprocessed real-world data recorded in urban KwaZulu-Natal habitats. The Adapted SED-CRNN model reached an F1 score of 0.73, demonstrating its efficiency under noisy, real-world conditions. The proposed approach to automatically extract labeled data for chosen avian species enables an easy adaptation of PAM to other species and habitats for future conservation projects.

6/21/2024