Zhou Yuanxia, Yu Jin. Design of image question and answer system based on deep learning [J]. Computer Applications and Software, 2018, 35(12): 199-208.
Chinese title
基于深度学习的图片问答系统设计研究
Section
Artificial Intelligence and Recognition
English title
DESIGN OF IMAGE QUESTION AND ANSWER SYSTEM BASED ON DEEP LEARNING
Authors
Zhou Yuanxia (周远侠), Yu Jin (于津)
Affiliation
Department of Computer Science and Technology, College of Engineering, Shantou University, Shantou 515000, Guangdong, China (汕头大学工学院计算机科学与技术系)
Keywords
Visual question answering; Dialogue system; Natural language processing; Convolutional neural network; Recurrent neural network
Author information
Zhou Yuanxia: master's degree; main research areas: machine learning and deep learning. Yu Jin: associate professor.
Abstract
We carry out a statistical analysis of the VQA (Visual Question Answering) dataset and, based on the resulting statistics, propose a data preprocessing method called imitation clustering. Image features extracted by VGGNet are locally modified, concatenated with question features obtained from an LSTM, and passed through a multi-layer perceptron followed by a softmax classifier over K possible outputs; together these form the model LcVMS. After preprocessing with low-frequency rejection and imitation clustering, the accuracy of LcVMS on the dataset increases from 43.21% to 44.45%. Experiments show that an image question answering system using LcVMS as its response logic can distinguish objects, quantities, colors, and positions reasonably well. To a certain extent its ability is comparable to the intelligence of a young child, and it has some practical value.
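The fusion step described in the abstract (concatenate image and question features, apply a multi-layer perceptron, then a softmax over K candidate answers) can be illustrated with a minimal sketch. This is not the authors' LcVMS implementation: the feature vectors are random stand-ins for VGGNet and LSTM outputs, and the layer sizes, the tanh activation, the single hidden layer, and the helper names (`dense`, `random_layer`, `answer_distribution`) are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def random_layer(n_out, n_in, rng):
    """Hypothetical small random initialisation, for illustration only."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def answer_distribution(image_feat, question_feat, hidden, output):
    """Concatenate both feature vectors, apply one tanh hidden layer,
    then a softmax over the K candidate answers."""
    fused = list(image_feat) + list(question_feat)
    h = [math.tanh(v) for v in dense(fused, *hidden)]
    return softmax(dense(h, *output))


rng = random.Random(0)
IMAGE_DIM, QUESTION_DIM, HIDDEN_DIM, K = 8, 6, 5, 4  # toy sizes, not the paper's

image_feat = [rng.random() for _ in range(IMAGE_DIM)]        # stand-in for VGGNet features
question_feat = [rng.random() for _ in range(QUESTION_DIM)]  # stand-in for the LSTM state
hidden = random_layer(HIDDEN_DIM, IMAGE_DIM + QUESTION_DIM, rng)
output = random_layer(K, HIDDEN_DIM, rng)

probs = answer_distribution(image_feat, question_feat, hidden, output)
best_answer = max(range(K), key=probs.__getitem__)  # index of the most likely answer
print(len(probs), round(sum(probs), 6), best_answer)
```

In a trained system the index `best_answer` would be mapped back to one of the K answer strings retained after preprocessing; the abstract does not specify how K is chosen, so the value used here is arbitrary.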