WANG Xuemei,TAO Hongcai.Research on Chinese Named Entity Recognition based on Deep Learning[J].Journal of Chengdu University of Information Technology,2020,35(03):264-270.[doi:10.16836/j.cnki.jcuit.2020.03.003]
Research on Chinese Named Entity Recognition based on Deep Learning
- Title:
- Research on Chinese Named Entity Recognition based on Deep Learning
- Article ID:
- 2096-1618(2020)03-0264-07
- Keywords:
- Chinese named entity recognition; BERT; BiGRU; Attention; CRF
- CLC number:
- TP391
- Document code:
- A
- Abstract:
- Aiming at the problems of the classic BiLSTM-CRF named entity recognition model, namely long training time, inability to resolve polysemy, and insufficient learning of contextual semantic information, a Chinese named entity recognition model based on BERT-BiGRU-Attention-CRF is proposed. First, the BERT language model is used to pre-train word vectors, compensating for the inability of traditional word-vector models to resolve polysemy. Second, a bi-directional gated recurrent unit (BiGRU) neural network layer extracts deep features from the text and computes a predicted score for each label, yielding the hidden state sequence of the sentence. Third, an attention layer weights the word representations and mines the associations between words, producing new predicted scores and a new state sequence. Finally, a conditional random field (CRF) computes the globally optimal solution over the new prediction scores, giving the model's final prediction of the entity labels. Experiments on the MSRA corpus demonstrate the effectiveness of the proposed model.
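The final CRF step described in the abstract, computing the globally optimal label sequence from per-token prediction scores, is conventionally done with Viterbi decoding. The following is a minimal plain-Python sketch of that decoding step, not the authors' implementation; the emission scores, transition scores, and label set are hypothetical toy values chosen only for illustration.

```python
def viterbi_decode(emissions, transitions):
    """Find the highest-scoring label sequence under a linear-chain CRF.

    emissions: per-token label scores, shape [seq_len][num_labels]
    transitions: transitions[i][j] = score of moving from label i to label j
    Returns (best_score, best_path).
    """
    num_labels = len(emissions[0])
    # score[j] = best score of any path ending in label j at the current token
    score = list(emissions[0])
    backpointers = []

    for emit in emissions[1:]:
        new_score = []
        bp = []
        for j in range(num_labels):
            # choose the best previous label for label j
            best_prev = max(range(num_labels),
                            key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_prev] + transitions[best_prev][j] + emit[j])
            bp.append(best_prev)
        score = new_score
        backpointers.append(bp)

    # backtrack from the best final label
    best_last = max(range(num_labels), key=lambda j: score[j])
    path = [best_last]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()
    return score[best_last], path


# Toy example: 3 tokens, labels {0: "O", 1: "B-PER", 2: "I-PER"}
emissions = [[2.0, 1.0, 0.0],
             [0.0, 2.0, 1.0],
             [0.0, 0.5, 2.0]]
transitions = [[0.5, 0.5, -2.0],   # O -> I is penalized (invalid in BIO tagging)
               [0.0, -1.0, 1.0],   # B -> I is encouraged
               [0.0, 0.0, 0.5]]
score, path = viterbi_decode(emissions, transitions)
print(path)  # -> [0, 1, 2], i.e. O B-PER I-PER
```

The transition scores let the decoder reject locally plausible but globally invalid sequences (e.g. an I-PER tag with no preceding B-PER), which is why a CRF layer is placed after the attention layer rather than taking each token's arg-max label independently.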
Memo
Received: 2020-01-12. Supported by the National Natural Science Foundation of China (61806170).