ZHENG Xiaoxu,SHU Shanshan,WEN Chengyu.Handwritten Text Recognition based on Attentional Multi-branch Convolution and Transformer[J].Journal of Chengdu University of Information Technology,2023,38(06):649-655.[doi:10.16836/j.cnki.jcuit.2023.06.005]
Handwritten Text Recognition Based on Attentional Multi-branch Convolution and Transformer
- Title:
- Handwritten Text Recognition based on Attentional Multi-branch Convolution and Transformer
- Article ID:
- 2096-1618(2023)06-0649-07
- Keywords:
- handwritten text recognition; Transformer; attention mechanism; connectionist temporal classification
- Classification number:
- TP391
- Document code:
- A
- Abstract:
- Handwriting recognition has been widely studied as a key component of automatic exam grading. To address the complexity of handwriting in Chinese text, a handwritten Chinese text recognition method that combines text localization and recognition is proposed. Using the text localization information, a perspective transformation corrects skewed text. In the feature extraction stage, an attentional multi-branch convolutional layer extracts key-region features of the text image and fuses multi-scale features. In the semantic extraction stage, a temporal convolutional network and a Transformer encoder build sequence information and model contextual semantics. Finally, a connectionist temporal classification (CTC) function aligns the sequence features with the character sequence labels. Experiments on the public dataset CASIA-HWDB show that the attentional multi-branch convolutional layer and the semantic extraction layer effectively improve recognition performance, which demonstrates the feasibility of the proposed method.
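The abstract names connectionist temporal classification (CTC) as the step that aligns frame-wise sequence features with character labels. As a rough illustration of that idea only, the sketch below shows greedy CTC decoding at inference time: take the best class per time step, merge consecutive repeats, then drop the blank symbol. It is not the authors' implementation and does not cover the CTC training loss. The function name `ctc_greedy_decode`, the blank index `0`, and the toy score matrix are all assumptions made for this example.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank symbol (hypothetical choice)


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Collapse per-frame class scores into a label sequence (greedy CTC)."""
    # Step 1: pick the highest-scoring class index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Step 2: merge runs of identical consecutive predictions.
    collapsed = [label for label, _ in groupby(best_path)]
    # Step 3: remove blanks; a blank between two equal labels keeps both.
    return [label for label in collapsed if label != blank]


# Toy input: 5 time steps, 3 classes (index 0 is the blank).
frame_scores = [
    [0.10, 0.80, 0.10],  # class 1
    [0.20, 0.70, 0.10],  # class 1 again, merged with the previous frame
    [0.90, 0.05, 0.05],  # blank
    [0.10, 0.80, 0.10],  # class 1, kept because a blank separates it
    [0.10, 0.20, 0.70],  # class 2
]
decoded = ctc_greedy_decode(frame_scores)
print(decoded)
```

In a full recognizer the integer labels would be mapped back to characters through the model's vocabulary, and a beam search could replace the greedy step; both are outside what the abstract specifies.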
Memo
Received: 2022-12-09