HOU Xiangjun, CHEN Yajun. Human Pose Estimation Algorithm Based on Improved YOLOv7-Pose[J]. Journal of Chengdu University of Information Technology, 2025, 40(04): 441-445. [doi:10.16836/j.cnki.jcuit.2025.04.005]
Human Pose Estimation Algorithm Based on Improved YOLOv7-Pose
- Article ID:
- 2096-1618(2025)04-0441-05
- Keywords:
- keypoint detection; YOLOv7-Pose; DSConv; CBAM; AIFI
- CLC number:
- TP391
- Document code:
- A
- Abstract:
- To address the limited keypoint accuracy and high computational cost of current human pose estimation models, this paper proposes AD-YOLOPose, an improved algorithm based on YOLOv7-Pose. DSConv replaces the original 3×3 convolutions, preserving model accuracy while discarding minor information and reducing the amount of computation. The CBAM attention mechanism is introduced to strengthen the model's feature representation and reduce interference from complex environments. The original SPPCSPC layer is replaced by the AIFI module, which lowers the parameter count and computational cost without degrading model performance. Experimental results on the COCO2017 skeleton dataset show that the improved model's F1 score increases by 2 percentage points, mAP@0.5 by 1.9%, and mAP@0.5:0.95 by 3.9%, while GFLOPs decrease by approximately 47.8%.
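As a rough illustration of the attention step named in the abstract, the sketch below implements only the channel-attention half of CBAM, following the formulation in the original CBAM paper: average-pooled and max-pooled channel descriptors pass through a shared two-layer MLP, the outputs are summed, and a sigmoid yields one weight per channel. It is plain Python for readability and is not the authors' AD-YOLOPose code. The function names, the toy feature map, and the MLP weights are all hypothetical, and the spatial-attention half of CBAM is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP with ReLU. w1 has shape (hidden, C); w2 has shape (C, hidden)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feature, w1, w2):
    """Return per-channel weights and the reweighted feature map.

    feature: list of C channels, each an H x W list of lists.
    """
    flat = [[v for row in ch for v in row] for ch in feature]
    avg_desc = [sum(ch) / len(ch) for ch in flat]  # global average pooling
    max_desc = [max(ch) for ch in flat]            # global max pooling
    # The same MLP (shared weights) processes both descriptors.
    summed = [a + m for a, m in zip(shared_mlp(avg_desc, w1, w2),
                                    shared_mlp(max_desc, w1, w2))]
    weights = [sigmoid(s) for s in summed]
    scaled = [[[v * wt for v in row] for row in ch]
              for ch, wt in zip(feature, weights)]
    return weights, scaled


# Hypothetical toy input: 2 channels of size 2 x 2, hidden width 1.
feature = [
    [[1.0, 2.0], [3.0, 4.0]],  # an "informative" channel
    [[0.0, 0.0], [0.0, 0.0]],  # an empty channel
]
w1 = [[1.0, -1.0]]    # (hidden=1, C=2), illustrative values only
w2 = [[1.0], [-1.0]]  # (C=2, hidden=1), illustrative values only

weights, scaled = channel_attention(feature, w1, w2)
print(weights)
```

With these toy weights the informative channel receives a weight close to 1 and the empty channel a weight close to 0, which is the reweighting behavior the attention module is meant to provide. In a real network the MLP weights are learned, and the operation would normally be written with a tensor library.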
Memo
Received: 2024-01-14
Funding: Ministry of Education Industry-University Cooperative Education Program (201802031076)
Corresponding author: CHEN Yajun. E-mail: scnccyj@cwnu.edu.cn
