On the effectiveness of handcrafted features for deepfake video detection
Journal
Journal of Electronic Imaging
Date Issued
2023
Author(s)
Kaddar, Bachir
Fezza, Sid Ahmed
Hamidouche, Wassim
Akhtar, Zahid
Abstract
Recent developments in advanced generative deep learning techniques have led to considerable progress in deepfake technology. CNN-based deepfake detection approaches have demonstrated superior performance. The ability to learn meaningful representations generated by convolutional multilayer nonlinear structures is the key to success. However, the black-box nature of such approaches has been a major concern for exploring hidden and complex characteristics as well as potential limitations of CNN-based models. To gain insights into the scope of the deepfake detection task, we investigate the effectiveness of handcrafted feature-based methods for deepfake video detection. First, we experiment with six top-performing handcrafted descriptors to extract the discriminating image features and then train SVMs on the extracted features to learn a suitable model. We also study the effect of selecting specific facial components on the detection performance. Specifically, we consider features extracted from the left eye, right eye, mouth, and entire face. Moreover, we propose a combination of these features and highlight the importance of this combination in terms of detection performance. Experimental results show that the SIFT feature descriptor achieves the best performance on deepfake videos generated by the neural texture technique, with a detection accuracy of 83.50%, which is better than deep learning-based methods. This is in contrast to the conventional understanding that deep learning methods systematically outperform handcrafted feature-based approaches. In addition, the obtained results on the FaceForensics++ dataset highlight the benefit of using some facial components to further boost the detection performance. 
Moreover, motivated by the effectiveness of LBPTOP and SIFT in the deepfake detection task, we combine the two descriptors to better characterize the spatiotemporal inconsistencies commonly found in fake videos and thereby boost detection performance. Finally, we discuss the strengths and weaknesses of handcrafted feature-based methods for deepfake detection and provide directions for future research.
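The component-wise feature combination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a basic 8-neighbour local binary pattern (LBP) histogram as a stand-in descriptor (the paper evaluates six descriptors, including SIFT and LBPTOP), a hypothetical 12x12 grayscale face crop, and made-up bounding boxes for the left eye, right eye, and mouth. The function names `lbp_histogram`, `crop`, and `combined_descriptor` are illustrative only.

```python
# Illustrative sketch only: basic spatial LBP, not the paper's LBPTOP or SIFT pipeline.

def lbp_histogram(gray):
    """Return a normalized 256-bin LBP histogram for a 2-D list of intensities."""
    # Clockwise 8-neighbourhood, starting at the top-left neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    hist = [0] * 256
    rows, cols = len(gray), len(gray[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            center = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the center.
                if gray[r + dr][c + dc] >= center:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def crop(gray, top, left, height, width):
    """Cut a rectangular region out of a 2-D list."""
    return [row[left:left + width] for row in gray[top:top + height]]


def combined_descriptor(face, regions):
    """Concatenate the whole-face histogram with one histogram per facial region."""
    feature = lbp_histogram(face)
    for name in sorted(regions):  # fixed order keeps the feature layout stable
        top, left, height, width = regions[name]
        feature.extend(lbp_histogram(crop(face, top, left, height, width)))
    return feature


# Hypothetical data: a synthetic 12x12 "face" and made-up region boxes
# given as (top, left, height, width).
face = [[(r * 7 + c * 13) % 256 for c in range(12)] for r in range(12)]
regions = {
    "left_eye": (1, 1, 4, 4),
    "right_eye": (1, 7, 4, 4),
    "mouth": (7, 3, 4, 6),
}
feature = combined_descriptor(face, regions)
print(len(feature))
```

In a pipeline of the kind the abstract outlines, one such vector per frame or video would then be passed to an SVM classifier (for example scikit-learn's `SVC`, which is an assumption here and not confirmed by this record).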
Scopus© citations
0
Acquisition Date
Sep 11, 2024