LIN Dizhong,ZOU Shurong,FU Ying.Attention-based Class Encoding for Few-Shot Object Detection[J].Journal of Chengdu University of Information Technology,2022,37(05):527-532.[doi:10.16836/j.cnki.jcuit.2022.05.007]
Attention-based Class-Specific Encoding for Few-Shot Object Detection
- Title:
- Attention-based Class Encoding for Few-Shot Object Detection
- Article ID:
- 2096-1618(2022)05-0527-06
- Keywords:
- few-shot object detection; class-specific encoding; attention mechanism; base class; new class
- CLC number:
- TP756
- Document code:
- A
- Abstract (translated from Chinese):
- For CNN-based few-shot object detection networks, it has become a trend to inject a small number of novel-class images in the second meta-training stage without mixing in base-class data, which allows new classes to be enrolled into the model efficiently. Under this incremental training scheme, however, because only a few novel-class samples are provided, the model generalizes poorly and tends to misdetect the newly injected classes as classes it has already been trained on. To address this, a new few-shot object detector is designed on the CenterNet framework that performs detection quickly and efficiently. The detector introduces a key component: an attention-based class encoder that extracts class-representation information from effectively augmented images, which markedly improves the network's encoding of new classes and thus strengthens the model's generalization to them. Experimental results show that the method outperforms recently popular few-shot object detection frameworks in some scenarios.
- Abstract:
- Recent research shows that when a two-stage few-shot object detection network enrolls a small number of new-class images, performing the second meta-training stage without base classes can efficiently enroll the new classes into the network. However, under this incremental meta-training scheme, the small number of novel-class samples leaves the model with insufficient generalization performance, so it is prone to misdetecting the newly injected categories as base classes the model was trained on. In this paper, we design a new few-shot object detector within the CenterNet framework, which can detect objects quickly and efficiently. Our detector introduces an important component: an attention class encoder that extracts class representations from augmented images, which plays an important role in improving the detection model's generalization to new classes. Experimental results show that the proposed method outperforms recent state-of-the-art few-shot object detection frameworks in some scenarios.
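The abstract does not specify the internals of the attention class encoder, so the following is only a minimal sketch of the general idea: squeeze-and-excitation-style channel attention reweights a support image's feature map, and the reweighted, pooled features of the (augmented) support images are averaged into a class code. All function names, weight shapes, and the averaging scheme are illustrative assumptions, not the paper's actual design.

```python
import math

def global_avg_pool(fmap):
    # fmap: C x H x W nested lists -> per-channel mean (the "squeeze" step)
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(fmap, w1, w2):
    # SE-style "excite": two tiny fully connected layers (ReLU, then sigmoid
    # gate) produce one scalar weight per channel; w1/w2 are hypothetical
    # learned weights, here just nested lists.
    s = global_avg_pool(fmap)
    hidden = [max(0.0, sum(w * x for w, x in zip(row, s))) for row in w1]
    gate = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # reweight each channel of the feature map by its gate value
    return [[[v * g for v in row] for row in ch] for ch, g in zip(fmap, gate)]

def class_code(support_fmaps, w1, w2):
    # Average the attention-reweighted, pooled features of all (augmented)
    # support images of one class into a single class-specific code vector.
    codes = [global_avg_pool(channel_attention(f, w1, w2)) for f in support_fmaps]
    n = len(codes)
    return [sum(c[i] for c in codes) / n for i in range(len(codes[0]))]
```

In a real detector these codes would condition the detection head so that novel classes injected with few samples are less easily confused with base classes; here the sketch only shows how an attention gate can sharpen a per-class embedding.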
Memo
Received: 2022-01-10
Funding: Major Science and Technology Project of the Sichuan Provincial Department of Science and Technology (2019ZDZX0005)