Traditional Chinese Medicine Named Entity Recognition Based on Domain Knowledge Graph Enhancement and Lattice-LSTM

  • Abstract: Existing approaches to Traditional Chinese Medicine (TCM) named entity recognition that enhance the recognition model with a constructed lexicon suffer from three problems: domain-specific terms are hard to discover, lexicon construction is inefficient, and recognition accuracy is insufficient. To address these problems, we propose a domain named entity recognition model based on domain knowledge graph enhancement and Lattice-LSTM. An embedding algorithm is applied to an already-constructed domain knowledge graph to convert it quickly and efficiently into a domain lexicon, and a Lattice-LSTM, which fuses multi-granularity lexical information, encodes the specialized terms in the lexicon into the model input, improving performance on domain entity recognition. Experiments on a TCM dataset show that the proposed model achieves a higher F1 score than traditional entity recognition models, verifying its effectiveness.
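To make the role of the lexicon concrete, the following is a minimal sketch of the lexicon-matching step that a Lattice-LSTM depends on: every character span of the input that matches a lexicon entry becomes a candidate word attached to the character sequence. It is an illustration under stated assumptions, not the authors' implementation. The function name `build_lattice`, the `max_len` parameter, and the four lexicon entries are hypothetical, and the lexicon is written by hand here because the abstract does not specify how it is derived from the knowledge graph.

```python
from typing import List, Set, Tuple


def build_lattice(sentence: str, lexicon: Set[str], max_len: int = 6) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every multi-character lexicon word
    found in the sentence. `end` is exclusive, as in Python slicing."""
    spans = []
    for start in range(len(sentence)):
        # Only spans of at least two characters are considered, since
        # single characters are already covered by the character sequence.
        last_end = min(len(sentence), start + max_len)
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                spans.append((start, end, word))
    return spans


# Hypothetical lexicon entries, standing in for a lexicon derived from a
# domain knowledge graph.
lexicon = {"黄芪", "当归", "当归补血汤", "补血"}
sentence = "当归补血汤含黄芪"

spans = build_lattice(sentence, lexicon)
for start, end, word in spans:
    print(start, end, word)
```

Overlapping matches such as "当归" and "当归补血汤" are all kept; in a lattice model it is the network's gating, rather than this matching step, that weighs the competing words.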

     
