Strawberry detection and counting based on YOLOv7 pruning and information based tracking algorithm

Read original: arXiv:2407.12614 - Published 7/18/2024 by Shiyu Liu, Congliang Zhou, Won Suk Lee


Overview

Florida's strawberry industry is economically significant, but monitoring strawberry growth and yield is labor-intensive and costly.
Previous studies have applied deep learning for flower and fruit detection, but did not consider the unique characteristics of image datasets from machine vision systems.
This study proposes an optimal pruning of detection heads in the YOLOv7 deep learning model to achieve fast and precise detection of strawberry flowers, immature fruit, and mature fruit.
It also introduces an enhanced object tracking algorithm called the Information Based Tracking Algorithm (IBTA) that improves precision in tracking strawberry flowers and fruits.

Plain English Explanation

Growing and harvesting strawberries is an important business in Florida, but the process of monitoring the plants and predicting how much fruit they will produce requires a lot of manual labor, which is expensive. Researchers have tried using machine learning, a type of artificial intelligence, to automatically detect and track strawberry flowers and fruits. However, the previous approaches had some limitations because they didn't fully account for the unique characteristics of the images captured by the camera systems used to monitor the strawberry fields.

This new study tackled these challenges in two ways. First, the researchers optimized a popular deep learning model called YOLOv7, which is used for object detection, by "pruning" (removing) some of the model's detection "heads," the output branches that make the final predictions. Depending on which heads were kept, the pruned model either ran fastest or was most accurate at detecting strawberry flowers, immature fruits, and mature fruits.

Second, the researchers developed an enhanced object tracking algorithm called IBTA, which uses information about the direction, speed, and location of the detected flowers and fruits to better follow them over time. IBTA was shown to track strawberry flowers and fruits more effectively than a simpler centroid-based tracking method.

By improving both the object detection and tracking capabilities, this research has the potential to make it easier and more affordable for strawberry growers to monitor their crops and predict yields.

Technical Explanation

The researchers in this study used the YOLOv7 deep learning model as the foundation for their strawberry flower and fruit detection system. YOLOv7 is a state-of-the-art object detection model that can quickly and accurately identify different objects in images. However, the researchers found that the standard YOLOv7 model did not perform optimally on the unique image datasets collected by the machine vision system used to monitor the strawberry fields.

To address this, the researchers experimented with "pruning" the detection heads of YOLOv7 and its variants. This means they removed one or more of the model's detection heads and kept the combination that worked best for strawberry flowers, immature fruits, and mature fruits. The best speed and the best accuracy came from two different pruned models: Pruning-YOLOv7-tiny with detection head 3 achieved the fastest inference (163.9 frames per second), while Pruning-YOLOv7-tiny with heads 2 and 3 achieved the highest detection accuracy (89.1%).
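To give a rough sense of why removing detection heads can speed up inference, the sketch below counts the raw box predictions a YOLO-style detector emits for a given set of retained heads. It is an illustration under stated assumptions, not the authors' implementation: it assumes three heads at strides 8, 16, and 32 with 3 anchor boxes per grid cell and a 640×640 input, and the mapping from head numbers to strides (`HEAD_STRIDES`) is hypothetical, since the summary does not say which scale each numbered head corresponds to.

```python
# Assumption: three detection heads at strides 8, 16, 32 and 3 anchors per cell.
# The head numbering below is hypothetical and chosen only for illustration.
HEAD_STRIDES = {1: 8, 2: 16, 3: 32}
ANCHORS_PER_CELL = 3


def candidate_boxes(image_size, kept_heads):
    """Count the raw box predictions produced by the retained heads."""
    total = 0
    for head in kept_heads:
        cells_per_side = image_size // HEAD_STRIDES[head]
        total += ANCHORS_PER_CELL * cells_per_side ** 2
    return total


full_model = candidate_boxes(640, [1, 2, 3])
heads_2_and_3 = candidate_boxes(640, [2, 3])
head_3_only = candidate_boxes(640, [3])

print(full_model, heads_2_and_3, head_3_only)  # 25200 6000 1200
```

Fewer retained heads means fewer candidate boxes to compute and to filter in post-processing, which is one plausible reason a single-head model runs fastest. The count says nothing about accuracy, which depends on how well the retained scales match the size of the objects in the images.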

Additionally, the researchers developed a new object tracking algorithm called the Information Based Tracking Algorithm (IBTA). IBTA takes the best detection result as input, removes the Kalman filter, and integrates the moving direction, velocity, and spatial location of the detected strawberry flowers and fruits to track them more precisely over time. Compared with a simpler centroid tracking algorithm (CTA), IBTA achieved a 12.3% higher Multiple Object Tracking Accuracy (MOTA) and a 6.0% higher Multiple Object Tracking Precision (MOTP), and it also performed better on other tracking metrics, including IDF1, IDR, IDP, MT, and IDs.
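The difference between centroid matching and motion-aware matching can be illustrated with a small sketch. This is not the authors' IBTA, whose details are not given here. It is a minimal example that assumes each track stores a last position and a per-frame velocity, and that detections are matched greedily to the nearest expected position within a distance gate. The names `Track`, `associate`, and `max_distance` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Track:
    track_id: int
    position: tuple          # last known (x, y) centroid in pixels
    velocity: tuple = (0.0, 0.0)  # estimated motion in pixels per frame


def expected_position(track, use_motion):
    """Last position (centroid matching) or position shifted by velocity."""
    if not use_motion:
        return track.position
    return (track.position[0] + track.velocity[0],
            track.position[1] + track.velocity[1])


def associate(tracks, detections, use_motion, max_distance=60.0):
    """Greedy nearest-neighbour matching; returns {track_id: detection_index}."""
    pairs = []
    for track in tracks:
        ex, ey = expected_position(track, use_motion)
        for index, (dx, dy) in enumerate(detections):
            distance = math.hypot(dx - ex, dy - ey)
            if distance <= max_distance:  # spatial gate
                pairs.append((distance, track.track_id, index))
    pairs.sort()  # closest candidate pairs are assigned first
    assigned, used = {}, set()
    for _, track_id, index in pairs:
        if track_id not in assigned and index not in used:
            assigned[track_id] = index
            used.add(index)
    return assigned


# Two fruits 40 px apart, while the camera motion shifts the scene 40 px per frame.
tracks = [Track(1, (100.0, 50.0), (40.0, 0.0)),
          Track(2, (140.0, 50.0), (40.0, 0.0))]
detections = [(140.0, 50.0), (180.0, 50.0)]  # fruit 1, then fruit 2

print(associate(tracks, detections, use_motion=False))  # {2: 0}
print(associate(tracks, detections, use_motion=True))   # {1: 0, 2: 1}
```

In this constructed case, centroid matching hands fruit 1's detection to track 2 and loses track 1, because the displacement between frames equals the spacing between fruits. Using direction and velocity to predict where each object should appear resolves the ambiguity.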

This combined approach of optimized detection and enhanced tracking has the potential to significantly improve automated monitoring and yield prediction for strawberry growers, reducing the need for labor-intensive manual processes. The findings from this research may also be applicable to detecting and tracking other types of fruit crops.

Critical Analysis

While the researchers' approach of pruning the YOLOv7 detection heads and developing the IBTA tracking algorithm showed promising results, there are a few potential limitations and areas for further exploration:

Dataset Characteristics: The researchers noted that the unique characteristics of the machine vision image datasets used in this study presented challenges for the standard deep learning models. It would be valuable to further investigate the specific factors, such as lighting conditions, camera angles, or image resolution, that contribute to these dataset characteristics and how they can be addressed.
Generalizability: The researchers tested their methods on strawberry crops, but it's unclear how well the pruned YOLOv7 model and the IBTA tracker would perform on other fruit crops or in different agricultural settings. Further evaluation on a wider range of datasets would help determine the broader applicability of the techniques.
Real-world Deployment: While the reported inference speeds and detection accuracies are promising, it's important to consider the practical aspects of deploying such a system in an actual strawberry farm environment. Factors like power consumption, hardware requirements, and integration with existing farm management systems should be explored.
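As a small worked example of the deployment point, the reported frame rate can be converted into a per-frame time and compared with a camera's frame interval. The 163.9 frames per second figure comes from the paper's abstract and was measured on the authors' hardware, which is not described here. The 30 frames per second camera rate is a hypothetical value chosen only for illustration.

```python
def frame_time_ms(fps):
    """Convert a frame rate into milliseconds per frame."""
    return 1000.0 / fps


reported_fps = 163.9  # best inference speed reported in the abstract
camera_fps = 30.0     # hypothetical camera frame rate (assumption)

detector_ms = frame_time_ms(reported_fps)
budget_ms = frame_time_ms(camera_fps)

print(f"detector: {detector_ms:.2f} ms per frame")
print(f"camera interval: {budget_ms:.2f} ms per frame")
print(f"time left for tracking and I/O: {budget_ms - detector_ms:.2f} ms")
```

On slower embedded hardware the detector's per-frame time would be larger, so the remaining margin would need to be measured on the target device rather than inferred from the reported figure.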

Overall, this study presents a valuable contribution to the field of automated crop monitoring and yield prediction, with the potential to significantly benefit the strawberry industry. The researchers' innovative approaches to detection and tracking warrant further investigation and refinement to maximize the practical impact of this work.

Conclusion

This study tackled the challenge of automating the monitoring and yield prediction process for Florida's strawberry industry, which is currently a labor-intensive and costly endeavor. By optimizing the YOLOv7 deep learning model through pruning of detection heads and developing an enhanced object tracking algorithm (IBTA), the researchers were able to achieve fast and precise detection of strawberry flowers, immature fruits, and mature fruits, as well as improved tracking accuracy.

These advancements have the potential to change the way strawberry growers manage their crops, reducing the need for manual labor and enabling more efficient yield forecasting. The findings from this research may also be applicable to the monitoring and detection of other types of fruit crops.

While the study presents promising results, further investigation into the dataset characteristics, generalizability, and practical deployment considerations will be important to fully realize the benefits of this innovative approach to automated crop monitoring.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!


Related Papers


Strawberry detection and counting based on YOLOv7 pruning and information based tracking algorithm

Shiyu Liu, Congliang Zhou, Won Suk Lee

The strawberry industry yields significant economic benefits for Florida, yet the process of monitoring strawberry growth and yield is labor-intensive and costly. Machine learning-based detection and tracking methods have been developed to help automate the monitoring and prediction of strawberry yield. Still, improvement has been limited because previous studies only applied deep learning to flower and fruit detection and did not consider the unique characteristics of image datasets collected by the machine vision system. This study proposed an optimal pruning of the detection heads of the deep learning model (YOLOv7 and its variants) that could achieve fast and precise detection of strawberry flowers, immature fruit, and mature fruit. Thereafter, an enhanced object tracking algorithm, called the Information Based Tracking Algorithm (IBTA), utilized the best detection result, removed the Kalman Filter, and integrated moving direction, velocity, and spatial information to improve the precision of strawberry flower and fruit tracking. Among the pruned YOLOv7 variants, Pruning-YOLOv7-tiny with detection head 3 and Pruning-YOLOv7-tiny with heads 2 and 3 achieved the best inference speed (163.9 frames per second) and detection accuracy (89.1%), respectively. The effect of IBTA was demonstrated by comparing it with the centroid tracking algorithm (CTA): the Multiple Object Tracking Accuracy (MOTA) and Multiple Object Tracking Precision (MOTP) of IBTA were 12.3% and 6.0% higher than those of CTA, respectively. In addition, other object-tracking evaluation metrics, including IDF1, IDR, IDP, MT, and IDs, show that IBTA performed better than CTA in strawberry flower and fruit tracking.

7/18/2024

Performance Evaluation of YOLOv8 Model Configurations, for Instance Segmentation of Strawberry Fruit Development Stages in an Open Field Environment

Abdul-Razak Alhassan Gamani, Ibrahim Arhin, Adrena Kyeremateng Asamoah

Accurate identification of strawberries during their maturing stages is crucial for optimizing yield management and pest control and for making informed decisions related to harvest and post-harvest logistics. This study evaluates the performance of YOLOv8 model configurations for instance segmentation of strawberries into ripe and unripe stages in an open field environment. The YOLOv8n model demonstrated superior segmentation accuracy with a mean Average Precision (mAP) of 80.9%, outperforming other YOLOv8 configurations. In terms of inference speed, YOLOv8n processed images at 12.9 milliseconds, while YOLOv8s, the least-performing model, processed at 22.2 milliseconds. Over 86 test images with 348 ground truth labels, YOLOv8n detected 235 ripe fruit classes and 51 unripe fruit classes out of 251 ground truth ripe fruits and 97 unripe ground truth labels, respectively. In comparison, YOLOv8s detected 204 ripe fruits and 37 unripe fruits. Overall, YOLOv8n achieved the fastest inference speed of 24.2 milliseconds, outperforming YOLOv8s, YOLOv8m, YOLOv8l, and YOLOv8x, which processed images at 33.0 milliseconds, 44.3 milliseconds, 53.6 milliseconds, and 62.5 milliseconds, respectively. These results underscore the potential of advanced object segmentation algorithms to address complex visual recognition tasks in open-field agriculture effectively.

8/14/2024


Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems

Jiang Ziyue, Yin Bo, Lu Boyun

The advancement of agricultural robotics holds immense promise for transforming fruit harvesting practices, particularly within the apple industry. The accurate detection and localization of fruits are pivotal for the successful implementation of robotic harvesting systems. In this paper, we propose a novel approach to apple detection and position estimation utilizing an object detection model, YOLOv5. Our primary objective is to develop a robust system capable of identifying apples in complex orchard environments and providing precise location information. To achieve this, we curated an autonomously labeled dataset comprising diverse apple tree images, which was utilized for both training and evaluation purposes. Through rigorous experimentation, we compared the performance of our YOLOv5-based system with other popular object detection models, including SSD. Our results demonstrate that the YOLOv5 model outperforms its counterparts, achieving an impressive apple detection accuracy of approximately 85%. We believe that our proposed system's accurate apple detection and position estimation capabilities represent a significant advancement in agricultural robotics, laying the groundwork for more efficient and sustainable fruit harvesting practices.

5/13/2024


Comprehensive Performance Evaluation of YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

Ranjan Sapkota, Zhichao Meng, Martin Churuvija, Xiaoqiang Du, Zenghong Ma, Manoj Karkee

This study performed an extensive evaluation of the performances of all configurations of YOLOv8, YOLOv9, and YOLOv10 object detection algorithms for fruitlet (of green fruit) detection in commercial orchards. Additionally, this research performed and validated in-field counting of fruitlets using an iPhone and machine vision sensors in 5 different apple varieties (Scifresh, Scilate, Honeycrisp, Cosmic crisp & Golden delicious). This comprehensive investigation of a total of 17 different configurations (5 for YOLOv8, 6 for YOLOv9 and 6 for YOLOv10) revealed that YOLOv9 outperforms YOLOv10 and YOLOv8 in terms of mAP@50, while YOLOv10x outperformed all 17 configurations tested in terms of precision and recall. Specifically, YOLOv9 Gelan-e achieved the highest mAP@50 of 0.935, outperforming YOLOv10n's 0.921 and YOLOv8s's 0.924. In terms of precision, YOLOv10x achieved the highest precision of 0.908, indicating superior object identification accuracy compared to other configurations tested (e.g. YOLOv9 Gelan-c with a precision of 0.903 and YOLOv8m with 0.897). In terms of recall, YOLOv10s achieved the highest in its series (0.872), while YOLOv9 Gelan-m performed the best among YOLOv9 configurations (0.899), and YOLOv8n performed the best among the YOLOv8 configurations (0.883). Meanwhile, three configurations of YOLOv10: YOLOv10b, YOLOv10l, and YOLOv10x achieved superior post-processing speeds of 1.5 milliseconds, outperforming all other configurations within the YOLOv9 and YOLOv8 families. Specifically, YOLOv9 Gelan-e recorded a post-processing speed of 1.9 milliseconds, and YOLOv8m achieved 2.1 milliseconds. Furthermore, YOLOv8n exhibited the highest inference speed among all configurations tested, achieving a processing time of 4.1 milliseconds, while YOLOv9 Gelan-t and YOLOv10n demonstrated comparatively slower inference speeds of 9.3 ms and 5.5 ms, respectively.

8/28/2024