From Forest to Zoo: Great Ape Behavior Recognition with ChimpBehave

Read original: arXiv:2405.20025 - Published 5/31/2024 by Michael Fuchs, Emilie Genty, Adrian Bangerter, Klaus Zuberbuhler, Paul Cotofrei

Overview

  • This paper presents ChimpBehave, a new video dataset for recognizing the behavior of zoo-housed chimpanzees.
  • The dataset contains over 2 hours of video (approximately 193,000 frames), annotated with bounding boxes and behavior labels for action recognition.
  • Its behavior classes are aligned with those of existing datasets, so researchers can study domain adaptation and cross-dataset generalization between different visual settings.
  • The authors benchmark the dataset with a state-of-the-art CNN-based action recognition model and report the first baseline results in both within-dataset and cross-dataset settings.

Plain English Explanation

The researchers have built a new resource for studying chimpanzee behavior. They recorded zoo-housed chimpanzees, then marked where each animal appears in the video and what it is doing. Using this data, they trained and tested a deep learning model that recognizes the chimpanzees' behaviors automatically.

This matters because annotating video by hand is slow and labor-intensive. A harder problem is that a model trained on footage from one setting, such as a zoo, can struggle on footage from another, such as a forest. ChimpBehave uses behavior categories that match those of existing datasets, so researchers can measure how well models transfer between settings.

By applying this technology, scientists can gain new insights into the behaviors and social interactions of great apes, which could inform conservation efforts and our understanding of primate evolution. It's like having a team of expert observers that can analyze footage 24/7, spotting patterns and trends that might be missed by the human eye.

Technical Explanation

The authors introduce ChimpBehave, a dataset of over 2 hours of video (approximately 193,000 frames) of zoo-housed chimpanzees, annotated with bounding boxes and behavior labels for action recognition. Its behavior classes are aligned with those of existing datasets, which makes it possible to study domain adaptation and cross-dataset generalization between different visual settings.

To benchmark the dataset, the researchers use a state-of-the-art CNN-based action recognition model that classifies the chimpanzees' behaviors from video.

The model is evaluated in two settings: within a single dataset, and across datasets, where training and test data come from different sources. These are the first baseline results reported for ChimpBehave.

A key contribution of this work is the alignment of behavior classes with existing datasets. Applying machine learning models across varied environments is difficult because data collection contexts differ and annotations are specific to each dataset; shared classes let researchers measure that gap directly. The dataset, models, and code are available at https://github.com/MitchFuchs/ChimpBehave
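A reported accuracy depends on how it is averaged over behavior classes. The sketch below, in Python, contrasts micro top-1 accuracy (every clip counts equally) with macro accuracy (every class counts equally). The function names and labels are hypothetical and the numbers are made up for illustration; they are not results from the paper.

```python
from collections import defaultdict


def micro_accuracy(y_true, y_pred):
    """Fraction of all clips whose predicted label matches the annotation."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Mean of per-class accuracies, so rare behaviors weigh as much as common ones."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Hypothetical annotations: 8 "sitting" clips and 2 "climbing" clips.
y_true = ["sitting"] * 8 + ["climbing"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["sitting"] * 10

print(micro_accuracy(y_true, y_pred))  # 0.8
print(macro_accuracy(y_true, y_pred))  # 0.5
```

The gap between the two numbers shows why a single accuracy figure can hide poor performance on rare behaviors when classes are imbalanced.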
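How well a behavior classifier carries over between visual settings can be checked by scoring the same model on data from its training setting and on data from another setting, after mapping both label sets to shared classes. The following is a minimal sketch in Python under stated assumptions: the label names, the `SHARED_LABELS` mapping, the clip features, and the rule-based `zoo_model` are all hypothetical stand-ins, not the paper's classes, data, or model.

```python
# Hypothetical mapping from each dataset's own label names to a shared label set.
SHARED_LABELS = {
    "zoo": {"sit": "sitting", "walk": "walking", "climb": "climbing"},
    "wild": {"sitting": "sitting", "walking": "walking", "climbing_up": "climbing"},
}


def align(dataset, labels):
    """Translate dataset-specific labels into the shared label set."""
    return [SHARED_LABELS[dataset][label] for label in labels]


def top1_accuracy(y_true, y_pred):
    """Fraction of clips whose predicted label matches the annotation."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def evaluate(predict, test_sets):
    """Score one classifier on every test set, using aligned labels."""
    results = {}
    for name, (clips, labels) in test_sets.items():
        preds = [predict(clip) for clip in clips]
        results[name] = top1_accuracy(align(name, labels), preds)
    return results


def zoo_model(clip):
    """Stand-in for a classifier tuned on zoo footage (a hand-written rule, not a real model)."""
    if clip["height"] > 1.0:
        return "climbing"
    return "walking" if clip["speed"] > 0.5 else "sitting"


# Made-up clips, each paired with a label in its dataset's own vocabulary.
test_sets = {
    "zoo": (
        [{"speed": 0.0, "height": 0.0}, {"speed": 1.0, "height": 0.0},
         {"speed": 0.2, "height": 2.0}],
        ["sit", "walk", "climb"],
    ),
    "wild": (
        [{"speed": 0.1, "height": 0.0}, {"speed": 0.4, "height": 0.0},
         {"speed": 0.3, "height": 0.5}, {"speed": 0.9, "height": 0.0}],
        ["sitting", "walking", "climbing_up", "walking"],
    ),
}

results = evaluate(zoo_model, test_sets)
print(results)  # {'zoo': 1.0, 'wild': 0.5}
```

In this toy setup the within-setting score is higher than the cross-setting score, which is the kind of gap that aligned behavior classes make measurable.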

Critical Analysis

The work has limitations to keep in mind. The dataset may be biased (for example, toward certain behaviors or individuals), and it covers only zoo-housed chimpanzees, so results need further validation in diverse field settings. Model performance may also suffer from occlusions, background clutter, and other factors that the benchmark does not fully address.

While the baseline results are a useful starting point, it is important to consider the ethical implications of using such technology for animal behavior research and conservation. There are valid concerns about the potential for misuse, such as invasion of privacy or disturbing the animals' natural behaviors. Careful consideration of these issues, as well as ongoing collaboration with ethicists and animal welfare experts, will be crucial as this technology is further developed and deployed.

Conclusion

The ChimpBehave dataset and its baseline results are a useful step forward for automated animal behavior recognition. By providing annotated zoo footage whose behavior classes match existing datasets, this work can help researchers build models that generalize across settings, reduce the burden of manual video annotation, and support both behavioral research and conservation. As the technology continues to evolve, it will be important to address the ethical considerations and ensure that it is used responsibly and in the best interest of the animals and the scientific community.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

From Forest to Zoo: Great Ape Behavior Recognition with ChimpBehave

Michael Fuchs, Emilie Genty, Adrian Bangerter, Klaus Zuberbuhler, Paul Cotofrei

This paper addresses the significant challenge of recognizing behaviors in non-human primates, specifically focusing on chimpanzees. Automated behavior recognition is crucial for both conservation efforts and the advancement of behavioral research. However, it is significantly hindered by the labor-intensive process of manual video annotation. Despite the availability of large-scale animal behavior datasets, the effective application of machine learning models across varied environmental settings poses a critical challenge, primarily due to the variability in data collection contexts and the specificity of annotations. In this paper, we introduce ChimpBehave, a novel dataset featuring over 2 hours of video (approximately 193,000 video frames) of zoo-housed chimpanzees, meticulously annotated with bounding boxes and behavior labels for action recognition. ChimpBehave uniquely aligns its behavior classes with existing datasets, allowing for the study of domain adaptation and cross-dataset generalization methods between different visual settings. Furthermore, we benchmark our dataset using a state-of-the-art CNN-based action recognition model, providing the first baseline results for both within and cross-dataset settings. The dataset, models, and code can be accessed at: https://github.com/MitchFuchs/ChimpBehave

5/31/2024

BaboonLand Dataset: Tracking Primates in the Wild and Automating Behaviour Recognition from Drone Videos

Isla Duporge, Maksim Kholiavchenko, Roi Harel, Scott Wolf, Dan Rubenstein, Meg Crofoot, Tanya Berger-Wolf, Stephen Lee, Julie Barreau, Jenna Kline, Michelle Ramirez, Charles Stewart

Using drones to track multiple individuals simultaneously in their natural environment is a powerful approach for better understanding group primate behavior. Previous studies have demonstrated that it is possible to automate the classification of primate behavior from video data, but these studies have been carried out in captivity or from ground-based cameras. To understand group behavior and the self-organization of a collective, the whole troop needs to be seen at a scale where behavior can be seen in relation to the natural environment in which ecological decisions are made. This study presents a novel dataset from drone videos for baboon detection, tracking, and behavior recognition. The baboon detection dataset was created by manually annotating all baboons in drone videos with bounding boxes. A tiling method was subsequently applied to create a pyramid of images at various scales from the original 5.3K resolution images, resulting in approximately 30K images used for baboon detection. The tracking dataset is derived from the detection dataset, where all bounding boxes are assigned the same ID throughout the video. This process resulted in half an hour of very dense tracking data. The behavior recognition dataset was generated by converting tracks into mini-scenes, a video subregion centered on each animal; each mini-scene was manually annotated with 12 distinct behavior types, resulting in over 20 hours of data. Benchmark results show mean average precision (mAP) of 92.62% for the YOLOv8-X detection model, multiple object tracking accuracy (MOTA) of 63.81% for the BotSort tracking algorithm, and micro top-1 accuracy of 63.97% for the X3D behavior recognition model. Using deep learning to classify wildlife behavior from drone footage facilitates non-invasive insight into the collective behavior of an entire group.

6/5/2024

ChimpVLM: Ethogram-Enhanced Chimpanzee Behaviour Recognition

Otto Brookes, Majid Mirmehdi, Hjalmar Kuhl, Tilo Burghardt

We show that chimpanzee behaviour understanding from camera traps can be enhanced by providing visual architectures with access to an embedding of text descriptions that detail species behaviours. In particular, we present a vision-language model which employs multi-modal decoding of visual features extracted directly from camera trap videos to process query tokens representing behaviours and output class predictions. Query tokens are initialised using a standardised ethogram of chimpanzee behaviour, rather than using random or name-based initialisations. In addition, the effect of initialising query tokens using a masked language model fine-tuned on a text corpus of known behavioural patterns is explored. We evaluate our system on the PanAf500 and PanAf20K datasets and demonstrate the performance benefits of our multi-modal decoding approach and query initialisation strategy on multi-class and multi-label recognition tasks, respectively. Results and ablations corroborate performance improvements. We achieve state-of-the-art performance over vision and vision-language models in top-1 accuracy (+6.34%) on PanAf500 and overall (+1.1%) and tail-class (+2.26%) mean average precision on PanAf20K. We share complete source code and network weights for full reproducibility of results and easy utilisation.

4/16/2024

Computer Vision for Primate Behavior Analysis in the Wild

Richard Vogg, Timo Luddecke, Jonathan Henrich, Sharmita Dey, Matthias Nuske, Valentin Hassler, Derek Murphy, Julia Fischer, Julia Ostner, Oliver Schulke, Peter M. Kappeler, Claudia Fichtel, Alexander Gail, Stefan Treue, Hansjorg Scherberger, Florentin Worgotter, Alexander S. Ecker

Advances in computer vision as well as increasingly widespread video-based behavioral monitoring have great potential for transforming how we study animal cognition and behavior. However, there is still a fairly large gap between the exciting prospects and what can actually be achieved in practice today, especially in videos from the wild. With this perspective paper, we want to contribute towards closing this gap, by guiding behavioral scientists in what can be expected from current methods and steering computer vision researchers towards problems that are relevant to advance research in animal behavior. We start with a survey of the state-of-the-art methods for computer vision problems that are directly relevant to the video-based study of animal behavior, including object detection, multi-individual tracking, individual identification, and (inter)action recognition. We then review methods for effort-efficient learning, which is one of the biggest challenges from a practical perspective. Finally, we close with an outlook into the future of the emerging field of computer vision for animal behavior, where we argue that the field should develop approaches to unify detection, tracking, identification and (inter)action recognition in a single, video-based framework.

8/13/2024