Xue Tianlang, Yue Yutao. TRAFFIC TARGET RECOGNITION AND RETRIEVAL BASED ON TEXT-VISUAL MULTIMODAL LEARNING[J]. Computer Applications and Software.

TRAFFIC TARGET RECOGNITION AND RETRIEVAL BASED ON TEXT-VISUAL MULTIMODAL LEARNING

Abstract: A lightweight text-visual multimodal retrieval method is proposed to address the low accuracy of existing CLIP-based approaches to traffic object recognition and retrieval. The method first leverages the zero-shot learning capability of the CLIP framework to recognize out-of-distribution objects that fall outside the training samples in real-world scenarios. It then constructs a video-image retrieval system that takes user text descriptions as queries and combines a real-time object detection model with an object clustering algorithm to recognize and retrieve objects from text in real traffic scenes. Experimental results on the Open-Transmind dataset show that, compared with CLIP-based transfer learning models, the method approximately doubles the F1-score, improves inference speed by 40%, and uses less GPU memory, which validates the effectiveness of the proposed approach.
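The text-query retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the embeddings below are hand-made placeholders for the vectors that CLIP's text and image encoders would produce for a query and for detected object crops, and the names `cosine_similarity` and `rank_crops` are hypothetical. The sketch only shows the ranking idea, namely scoring each detected crop against the query by cosine similarity and returning the best matches.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_crops(query_embedding, crop_embeddings, top_k=3):
    """Return the top_k (crop_id, score) pairs, best match first.

    crop_embeddings maps a crop identifier to its embedding. In a real
    system these would come from an image encoder applied to detector
    outputs; here they are assumed to be given.
    """
    scored = [
        (crop_id, cosine_similarity(query_embedding, vec))
        for crop_id, vec in crop_embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy placeholder embeddings (assumption: 3-dimensional for readability).
query = [1.0, 0.0, 0.0]
crops = {
    "crop_a": [0.9, 0.1, 0.0],
    "crop_b": [0.0, 1.0, 0.0],
    "crop_c": [0.5, 0.5, 0.0],
}
results = rank_crops(query, crops, top_k=2)
for crop_id, score in results:
    print(crop_id, round(score, 4))
```

In this toy setup the crop whose embedding points most nearly in the query's direction ranks first. A real pipeline would replace the placeholder vectors with encoder outputs and could apply clustering to the crops before ranking.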