SHU Shanshan, ZHENG Xiaoxu, WEN Chengyu. Improved Chinese Handwritten Text Line Recognition based on CRNN[J]. Journal of Chengdu University of Information Technology, 2023, 38(04): 422-428. [doi:10.16836/j.cnki.jcuit.2023.04.008]
Improved Chinese Handwritten Text Line Recognition Based on CRNN
- Article ID:
- 2096-1618(2023)04-0422-07
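- Keywords:
- handwritten Chinese text recognition; convolutional recurrent neural network (CRNN); convolutional block attention module (CBAM); bidirectional long short-term memory network (BiLSTM); connectionist temporal classification (CTC)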
- Classification number:
- TP39
- Document code:
- A
- Abstract:
- Chinese handwritten text line recognition converts handwritten content on paper into editable electronic text. To address the highly variable writing styles of handwriting, the large number of Chinese character classes, and the low recognition accuracy of methods based on character segmentation, this paper proposes an improved end-to-end Chinese handwriting recognition method based on the convolutional recurrent neural network (CRNN). First, the image is fed into a feature extraction network built on an improved Inception structure: the network first modifies the GoogLeNet model and then adds a combined structure of the convolutional block attention module (CBAM) and Inception, so that the model extracts effective image features better. Next, the extracted features are passed to the recurrent layer, a two-layer bidirectional long short-term memory network (BiLSTM), for prediction. Finally, the predicted sequence is passed to the transcription layer, where connectionist temporal classification (CTC) produces the transcribed output. Experiments on the CASIA-HWDB2 dataset show that the method achieves a recognition accuracy of 95.12%, which demonstrates its feasibility.
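The transcription step described in the abstract can be illustrated with a minimal sketch of CTC best-path (greedy) decoding. The abstract does not say which decoding strategy the authors use, so greedy decoding is an assumption here; the function name `ctc_greedy_decode`, the blank index 0, and the three-entry toy character set are likewise hypothetical and chosen only for illustration.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (not stated in the source)


def ctc_greedy_decode(frame_scores, charset, blank=BLANK):
    """Collapse per-frame predictions into a character string.

    frame_scores: list of per-time-step score lists, one score per label,
                  as a recurrent layer such as a BiLSTM might emit.
    charset:      maps label index -> character; entry `blank` is a placeholder.
    """
    # 1. Pick the best-scoring label at every time step.
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    # 2. Merge runs of identical consecutive labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # 3. Drop blanks and map the remaining labels to characters.
    return "".join(charset[label] for label in collapsed if label != blank)


# Toy example: five time steps over the labels (blank, "手", "写").
toy_charset = ["", "手", "写"]
toy_scores = [
    [0.10, 0.80, 0.10],  # "手"
    [0.20, 0.70, 0.10],  # "手" again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank, separates the two "手" characters
    [0.10, 0.80, 0.10],  # "手"
    [0.10, 0.10, 0.80],  # "写"
]
print(ctc_greedy_decode(toy_scores, toy_charset))  # 手手写
```

The blank frame is what lets a repeated character survive the merge step: without it, the two runs of "手" would collapse into one.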
Memo
Received: 2022-08-19