PU Wenbo, HU Jing. A Deepfake Detection Method based on Frequency Domain Information[J]. Journal of Chengdu University of Information Technology, 2022, 37(05): 508-514. [doi:10.16836/j.cnki.jcuit.2022.05.004]
A Deepfake Detection Algorithm Based on Frequency-Domain Information (original Chinese title: 基于频域信息的深度伪造检测算法)
- Title:
- A Deepfake Detection Method based on Frequency Domain Information
- Article ID:
- 2096-1618(2022)05-0508-07
- Keywords:
- deepfake detection; frequency learning; temporal learning
- CLC number:
- TP391
- Document code:
- A
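- Abstract (translated from the Chinese):
- As one kind of face manipulation technology, deepfake techniques synthesize face-swapped videos that already pose a serious threat to privacy and security. Existing deepfake detection methods typically use conventional convolutional neural networks to extract spatial-domain discontinuities in synthesized videos and decide whether a video is a deepfake. As deepfake techniques iterate, the accuracy of these conventional methods has become hard to improve significantly. Unlike them, this paper applies the discrete cosine transform to synthesized video frames to obtain a frequency-domain representation of each frame image, uses a residual convolutional network to learn frequency-domain features, and extracts inter-frame discontinuity information with a bidirectional LSTM to detect whether video frames are forged. In addition, a new data augmentation method for deepfake data, Xray-blur, is proposed; it reduces the spatial-domain discontinuity of face-swapped videos, which makes training harder and strengthens the model's ability to capture discontinuity information. Experiments show that the method achieves excellent accuracy (ACC) and area under the ROC curve (AUC) on the public datasets Celeb-DF and FaceForensics++, and is more robust on low-quality videos.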
- Abstract:
- Deepfake is a face manipulation technology whose widespread use poses serious risks to privacy and security. Existing Deepfake detection methods typically rely on conventional convolutional neural networks that extract spatial-domain discontinuities in synthetic videos to judge whether a video is a deepfake. As Deepfake techniques iterate, the accuracy of these conventional detection methods has become hard to improve significantly. Unlike these methods, this paper applies the discrete cosine transform (DCT) to Deepfake frames to obtain a frequency-domain representation, uses a modified residual convolutional network to learn frequency-domain features, and extracts temporal information between Deepfake frames with a bidirectional LSTM. In addition, this paper proposes a new data augmentation method for Deepfake data, called Xray-blur, which reduces the spatial discontinuity of Deepfake data and thereby strengthens the model's ability to capture discontinuity information. Experiments show that the method achieves excellent accuracy (ACC) and AUC on the public datasets Celeb-DF and FaceForensics++, and is more robust on low-quality videos.
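
The frequency-domain step named in the abstract can be illustrated with a minimal sketch of a 2-D DCT applied to a small grayscale block. The abstract does not state the block size, the DCT normalization, or any pre- or post-processing, so the orthonormal DCT-II, the 8x8 block, and the helper names `dct_1d` and `dct_2d` below are assumptions made for illustration only, not the authors' implementation. The sketch uses only the Python standard library; in practice a library routine such as `scipy.fft.dctn` would normally be used instead.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence (hypothetical helper)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            v * math.cos(math.pi * (i + 0.5) * k / n)
            for i, v in enumerate(values)
        )
        # Orthonormal scaling keeps the signal energy unchanged.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def dct_2d(block):
    """Separable 2-D DCT-II: transform every row, then every column."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed as [vertical][horizontal].
    return [list(row) for row in zip(*cols)]


# Assumed 8x8 grayscale block with constant intensity 128.
flat_block = [[128.0] * 8 for _ in range(8)]
flat_coeffs = dct_2d(flat_block)
# A constant block has all its energy in the DC term: 8 * 128 = 1024.
print(round(flat_coeffs[0][0], 6))

# A horizontal ramp only produces coefficients in the first row,
# because nothing changes in the vertical direction.
ramp_block = [[float(x) for x in range(8)] for _ in range(8)]
ramp_coeffs = dct_2d(ramp_block)
```

The resulting coefficient map is the kind of frequency-domain representation the abstract says is fed to the residual network, in place of the raw pixels; how the authors arrange or scale the coefficients is not specified here.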
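Memo
Received: 2022-03-08
Funding: Key Program of the National Natural Science Foundation of China (42130608); National Natural Science Foundation of China (61602065); Key R&D Program of the Sichuan Provincial Department of Science and Technology (2021YFG0038)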