Abstract:
To address the challenges of fire point detection in remote sensing imagery, namely small target size, a high risk of confusion, and complex backgrounds, this study proposes VRFNet, a deep learning model for identifying forest fire ignition points. VRFNet extracts multi-scale features by combining the wavelet transform with large-kernel convolution decomposition, which expands the model's receptive field while reducing the number of parameters. The model applies average pooling and max pooling to these multi-scale features and uses an attention mechanism to extract key information, strengthening the expressive power of the features. An adaptive receptive field selection mechanism then fuses features from different receptive fields by weighted integration, accommodating the scale diversity of fire point targets. VRFNet achieves accuracies of 86.9% and 73.4% on the DOTA v1.0 and GF-4 datasets, respectively, an improvement of 0.026 over state-of-the-art (SOTA) models, which supports its effectiveness for target detection in remote sensing images.
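
The pooling, attention, and weighted-fusion steps described above can be illustrated with a minimal sketch. It is not the VRFNet implementation: the function name `select_receptive_field`, the per-branch scalars `w_avg`, `w_max`, and `bias` (standing in for a learned layer), the use of softmax, and the single-channel toy maps are all assumptions made for illustration. It only shows how average- and max-pooled descriptors could drive per-pixel weights that fuse branches with different receptive fields.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_receptive_field(branches, w_avg, w_max, bias):
    """Fuse K same-sized 2-D feature maps with per-pixel softmax weights.

    branches: list of K maps, each a list of rows of floats, one map per
        receptive field.
    w_avg, w_max, bias: hypothetical per-branch scalars standing in for the
        learned layer that would turn pooled descriptors into scores.
    Returns the fused map and the per-pixel branch weights.
    """
    k = len(branches)
    height, width = len(branches[0]), len(branches[0][0])
    fused = [[0.0] * width for _ in range(height)]
    weights = [[None] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            values = [branch[i][j] for branch in branches]
            avg_desc = sum(values) / k  # average pooling across branches
            max_desc = max(values)      # max pooling across branches
            scores = [
                w_avg[b] * avg_desc + w_max[b] * max_desc + bias[b]
                for b in range(k)
            ]
            attn = softmax(scores)      # one weight per receptive field
            weights[i][j] = attn
            fused[i][j] = sum(a * v for a, v in zip(attn, values))
    return fused, weights


# Toy inputs (made up): responses from a small and a large receptive field.
small_rf = [[1.0, 0.0], [0.0, 2.0]]
large_rf = [[0.5, 0.5], [1.0, 1.0]]
fused, weights = select_receptive_field(
    [small_rf, large_rf], w_avg=[0.3, -0.3], w_max=[0.8, -0.8], bias=[0.0, 0.0]
)
```

Because the weights at each pixel come from a softmax, they sum to one, so every fused value stays between the smallest and largest branch response at that position. In a trained network the scalars would be replaced by learned convolution weights.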