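Query result: Lei Shujie, Xing Fukun, Wang Wenhui. Domain-specific entity recognition with the support of multi-type features [J]. Computer Applications and Software (计算机应用与软件), 2019, 36(11): 210-217.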
Chinese title
融合多类型特征的特定领域实体识别研究
Column
Artificial Intelligence and Recognition
Abstract views
333
English title
DOMAIN-SPECIFIC ENTITY RECOGNITION WITH THE SUPPORT OF MULTI-TYPE FEATURES
Authors
Lei Shujie (雷树杰), Xing Fukun (邢富坤), Wang Wenhui (王闻慧)
Author affiliations
Luoyang Campus, Information Engineering University of PLA Strategic Support Forces, Luoyang 471003, Henan, China
School of Foreign Languages, Qingdao University, Qingdao 266000, Shandong, China
Keywords
English military equipment names; Bi-LSTM+CRF; multi-type features; feature analysis
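Author information
Lei Shujie, master's student; main research area: natural language processing. Xing Fukun, professor. Wang Wenhui, master's student.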
Abstract
Domain-specific entities are sparsely distributed, limited in type, and strongly tied to their domain, which sets them apart from ordinary named entities. Building a neural recognition model for them is difficult because the training corpus is small and labeled entities are sparse. Taking the recognition of military equipment names as an example, we study how part-of-speech, syntactic, and domain-knowledge features can be integrated into a neural network model within a deep learning framework, and what effect each has. The experimental results show that integrating part-of-speech features and domain knowledge raises the F value of military equipment name recognition by 0.97% and 9.5%, respectively. In addition, by running experiments at different corpus sizes and quantitatively analyzing how the different feature types are distributed, we give a preliminary account of why different types of features support the deep learning model to different degrees.