Abstract:
Unmanned aerial vehicle (UAV) images are now widely used in many applications. However, object detection in these images is challenged by densely distributed small objects, large variations in object scale, and indistinct edge features, which lead to missed and false detections. To address these issues, this paper proposes RLD-YOLO, a lightweight detector built on an improved YOLOv11n model. The algorithm adopts RepConv structural reparameterisation, which retains multi-branch feature expression during training and converts automatically to an efficient single-branch structure for inference. It designs the LKAConv large-kernel attention module, which strengthens feature capture for small targets through a 7$\times$7 depthwise separable convolution and a spatial attention mechanism. It also introduces the DASI dynamic adaptive fusion module, which optimises multi-scale feature interaction through learnable weight allocation. Experimental results on the VisDrone2019-DET dataset show that RLD-YOLO, which integrates LKAConv, RepConv, and DASI, increases mAP50 and mAP50-95 by 2.02 % and 1.17 %, respectively. Post-processing speed improves by 9.09 %. Although pre-processing time increases because of the feature enhancement operations, the critical inference stage still runs in real time at 1.7 ms/frame. These results indicate that RLD-YOLO is well suited to small-object detection in UAV images.
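
The structural reparameterisation idea mentioned above can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single-channel block with a 3x3 branch, a 1x1 branch, and an identity branch (as in common RepVGG-style designs), omits batch normalisation, and uses hypothetical helper names (`conv2d_same`, `multi_branch`, `fuse_branches`). It only shows why the multi-branch training form and the fused single-branch inference form can produce the same output.

```python
# Illustrative sketch only: single channel, no BatchNorm, hypothetical helper names.
# The branch layout (3x3 + 1x1 + identity) is an assumption, not taken from the paper.

def conv2d_same(image, kernel):
    """Cross-correlate a 2D list with an odd-sized square kernel, using zero padding."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def multi_branch(image, k3, k1):
    """Training-time form: sum of a 3x3 branch, a 1x1 branch, and an identity branch."""
    a = conv2d_same(image, k3)
    b = conv2d_same(image, [[k1]])
    h, w = len(image), len(image[0])
    return [[a[i][j] + b[i][j] + image[i][j] for j in range(w)] for i in range(h)]


def fuse_branches(k3, k1):
    """Inference-time form: fold the 1x1 and identity branches into the 3x3 centre tap."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0  # 1x1 weight plus the identity's implicit weight of 1
    return fused


# Arbitrary example values chosen for illustration.
image = [
    [1.0, 2.0, 0.0, -1.0],
    [0.5, -2.0, 3.0, 1.0],
    [4.0, 0.0, -1.5, 2.0],
    [-3.0, 1.0, 0.5, 0.0],
]
k3 = [
    [0.25, -0.5, 0.0],
    [1.0, 0.5, -1.0],
    [0.0, 0.75, 0.25],
]
k1 = 0.5

train_out = multi_branch(image, k3, k1)
infer_out = conv2d_same(image, fuse_branches(k3, k1))

# Largest absolute difference between the two forms; it should be negligible.
max_diff = max(
    abs(train_out[i][j] - infer_out[i][j])
    for i in range(len(image))
    for j in range(len(image[0]))
)
print("max difference between multi-branch and fused outputs:", max_diff)
```

Because convolution is linear, the three branches collapse into one kernel, so inference needs a single convolution instead of three. A full block would also fold each branch's batch normalisation statistics into the fused weights and bias, which this sketch leaves out.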