Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments

Ranjan Sapkota·Zhichao Meng·Martin Churuvija·Xiaoqiang Du·Zenghong Ma·Manoj Karkee·2024

Abstract

This study evaluated all configurations of the You Only Look Once (YOLO) object detection algorithms YOLOv8, YOLOv9, YOLOv10, YOLO11, and YOLOv12 under real-world orchard conditions. Models were assessed on precision, recall, mean Average Precision at 50% Intersection over Union (mAP@50), and computational efficiency across the pre-processing, inference, and post-processing stages when detecting immature green fruitlets in commercial orchards. Field-level fruitlet counting was also validated using images captured with Intel RealSense and iPhone 14 Pro Max sensors. YOLOv12l achieved the highest recall (0.900), while YOLOv10x and YOLOv9 GELAN-c reported the top precision scores of 0.908 and 0.903, respectively. YOLOv9 GELAN-base and GELAN-e achieved the highest mAP@50 (0.935), followed by YOLO11s (0.933) and YOLOv12l (0.931). In counting validation, YOLO11n was the most accurate, with RMSE values of 4.51-4.96 and MAE values of 3.85-7.73 across four apple varieties. Sensor-specific training on Intel RealSense images further improved detection performance. YOLO11n also recorded the fastest inference speed (2.4 ms), outperforming YOLOv8n, YOLOv9 GELAN-s, YOLOv10n, and YOLOv12n, which supports its suitability for real-time orchard applications.
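For readers unfamiliar with the metrics named above, the following is a minimal Python sketch of how they are conventionally defined. It is not the authors' evaluation code: the function names, the (x1, y1, x2, y2) box format, and the sample numbers are assumptions chosen for illustration only.

```python
import math

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) tuples (illustrative format).
    """
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall

def count_errors(predicted, actual):
    """RMSE and MAE between predicted and ground-truth fruitlet counts per image."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae

# Hypothetical usage: under mAP@50, a detection counts as a true positive
# when its IoU with a ground-truth box is at least 0.5.
is_match = iou((0, 0, 2, 2), (0, 0, 2, 1.5)) >= 0.5
```

Under the mAP@50 convention, precision and recall are computed from matches made at the 0.5 IoU threshold, and average precision summarizes the resulting precision-recall curve; the counting metrics compare per-image predicted counts against manual ground-truth counts.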
