Abstract:
Existing RGB-D salient object detection methods mainly perform multi-scale and multi-modal fusion with local features, which cannot capture long-range dependencies, so the overall representational power of the features is insufficient. To address this problem, this paper proposes a global-local feature fusion network. In the low-level feature extraction stage, the features of the two branches are fused directly. In the high-level feature extraction stage, the fused features are fed into a Transformer encoder to capture global feature dependencies and then passed to the backbone network to extract global-local fused features. In addition, a bidirectional attention module is used to strengthen the fusion of the two branch features. Experiments on five public datasets show that the proposed network achieves good performance on four evaluation metrics.
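
The following is a minimal sketch of the fusion idea summarized above, assuming a PyTorch-style implementation; the module names (GlobalLocalFusion, BidirectionalAttention), channel sizes, and layer counts are hypothetical illustrations and are not taken from the paper.

# Hypothetical sketch: two-way attention between RGB and depth branches,
# local fusion by convolution, and a Transformer encoder for global dependencies.
import torch
import torch.nn as nn


class BidirectionalAttention(nn.Module):
    """Two-way attention: each branch is reweighted by a gate computed from the other."""
    def __init__(self, channels):
        super().__init__()
        self.rgb_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.depth_gate = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, rgb, depth):
        # Cross-modal gating: depth features modulate RGB features and vice versa.
        return rgb * self.depth_gate(depth), depth * self.rgb_gate(rgb)


class GlobalLocalFusion(nn.Module):
    """Fuse the two branches locally, then model long-range dependencies globally."""
    def __init__(self, channels=64, num_layers=2, num_heads=4):
        super().__init__()
        self.attn = BidirectionalAttention(channels)
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, rgb_feat, depth_feat):
        rgb_feat, depth_feat = self.attn(rgb_feat, depth_feat)
        fused = self.fuse(torch.cat([rgb_feat, depth_feat], dim=1))   # local fusion
        b, c, h, w = fused.shape
        tokens = fused.flatten(2).transpose(1, 2)                     # (B, H*W, C) tokens
        global_feat = self.encoder(tokens)                            # global dependencies
        return global_feat.transpose(1, 2).reshape(b, c, h, w) + fused


# Usage: high-level RGB and depth feature maps of matching shape.
rgb = torch.randn(1, 64, 14, 14)
depth = torch.randn(1, 64, 14, 14)
out = GlobalLocalFusion()(rgb, depth)
print(out.shape)  # torch.Size([1, 64, 14, 14])

The residual addition of the Transformer output to the convolutionally fused map is one plausible way to combine the global and local paths; the paper's actual combination scheme may differ.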