IMPLICIT TEXT-IMAGE FINE-GRAINED MATCHING VIA VISUAL CONTRASTIVE ATTENTION

Yin Yajue; Wang Jingjing

doi:10.3969/j.issn.1000-386x.2025.07.020

Yin Yajue, Wang Jingjing. IMPLICIT TEXT-IMAGE FINE-GRAINED MATCHING VIA VISUAL CONTRASTIVE ATTENTION[J]. Computer Applications and Software, 2025, 42(7): 148-154, 160. DOI: 10.3969/j.issn.1000-386x.2025.07.020

Abstract

The text-image fine-grained matching task aims to align fine-grained entities in images and text (e.g., aligning target objects in an image with the phrases in the text that refer to them). Unlike previous studies, this paper proposes a novel text-image fine-grained matching task oriented to implicit scenes, which focuses on fine-grained matching relationships that can be identified only by relying on context or additional external knowledge. For this new task, the paper formulates a corresponding corpus annotation specification and annotates a text-image fine-grained matching dataset for implicit scenes. On this basis, it proposes a method based on visual contrastive attention to alleviate the sparsity of semantic matching information in the new task. Experimental results show that the proposed visual contrastive attention method achieves a significant performance improvement on the implicit matching task.
