Abstract
Existing entity and relation extraction techniques based on semantic role labeling perform poorly and, in particular, fail to extract multiple objects correctly. To address this, this paper proposes a method that combines semantic role labeling with dependency parsing: 1) extract the subject and the core predicate through semantic role labeling; 2) taking the core predicate as the entry point, analyze the coordinate (COO) and verb-object (VOB) structures of the sentence through dependency parsing and extract the object entities they contain; 3) combine the subject, core predicate, and objects into [entity, relation, entity] triples. The proposed algorithm is compared experimentally with an algorithm that relies on semantic role labeling alone. It achieves better precision, recall, and F1 score, which indicates that the method is feasible and effective and that its advantage is most pronounced when extracting multiple objects.
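The three steps above can be illustrated with a minimal sketch. It is not the paper's implementation: the `Token` class, the `srl` dictionary format, and the hand-written parse of an English-glossed toy sentence are all assumptions made for illustration. The dependency labels SBV, VOB, COO, LAD, and HED follow the LTP-style tag set, and only coordination among objects is handled here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Token:
    idx: int     # 1-based position in the sentence
    word: str
    head: int    # index of the syntactic head, 0 for the root
    deprel: str  # dependency label, e.g. "SBV", "VOB", "COO"


def children(tokens: List[Token], head_idx: int, deprel: str) -> List[Token]:
    """Return the dependents of head_idx that carry the given label."""
    return [t for t in tokens if t.head == head_idx and t.deprel == deprel]


def collect_objects(tokens: List[Token], pred_idx: int) -> List[Token]:
    """Step 2: VOB dependents of the predicate plus anything coordinated (COO) with them."""
    objects: List[Token] = []
    stack = children(tokens, pred_idx, "VOB")
    while stack:
        tok = stack.pop()
        objects.append(tok)
        stack.extend(children(tokens, tok.idx, "COO"))
    return sorted(objects, key=lambda t: t.idx)


def extract_triples(
    tokens: List[Token],
    srl: Dict[int, Dict[str, Tuple[int, int]]],
) -> List[Tuple[str, str, str]]:
    """Steps 1 and 3: read subject (A0) and predicate from SRL, then build triples.

    srl maps a predicate index to its roles; each role is an inclusive
    (start, end) token span. This format is a hypothetical simplification.
    """
    triples = []
    for pred_idx, roles in srl.items():
        if "A0" not in roles:
            continue  # no subject role, so no triple can be formed
        start, end = roles["A0"]
        subject = " ".join(t.word for t in tokens if start <= t.idx <= end)
        predicate = tokens[pred_idx - 1].word
        for obj in collect_objects(tokens, pred_idx):
            triples.append((subject, predicate, obj.word))
    return triples


# Hand-written toy parse (assumed, not produced by a real parser).
EXAMPLE_TOKENS = [
    Token(1, "Xiaoming", 2, "SBV"),
    Token(2, "likes", 0, "HED"),
    Token(3, "apples", 2, "VOB"),
    Token(4, "and", 5, "LAD"),
    Token(5, "bananas", 3, "COO"),
]
EXAMPLE_SRL = {2: {"A0": (1, 1)}}

print(extract_triples(EXAMPLE_TOKENS, EXAMPLE_SRL))
# [('Xiaoming', 'likes', 'apples'), ('Xiaoming', 'likes', 'bananas')]
```

Following COO arcs from the first object is what recovers the second triple; reading only the object role of the predicate would stop at "apples".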