DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Read original: arXiv:2405.19707 - Published 8/23/2024 by Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li

Overview

This paper introduces GenVideo, a benchmark of over one million real and AI-generated videos, and DeMamba (Detail Mamba), a plug-and-play module for detecting AI-generated videos.
The research aims to address the growing challenge of AI-generated videos, which can be used to spread misinformation and impersonate real people.
The authors report that DeMamba shows better generalizability and robustness on GenVideo than existing detectors.

Plain English Explanation

The paper focuses on the problem of detecting AI-generated videos, also known as "deepfakes." Deepfakes are videos that have been manipulated using AI to make it look like someone said or did something they didn't. This can be used to spread misinformation or impersonate real people, which is a growing concern.

The researchers developed a new AI system called DeMamba that is very good at detecting these fake videos. They tested it on a huge dataset of over a million videos, both real and fake, and found that DeMamba significantly outperformed other leading methods for spotting deepfakes.

The key innovation in DeMamba is a module that can be added to existing detectors to help them spot inconsistencies across space and time, the subtle telltale signs that a video has been artificially generated. This makes DeMamba a useful tool for combating the spread of misinformation and protecting people's online identities.

Technical Explanation

DeMamba (Detail Mamba) is a plug-and-play module designed to enhance existing video detectors. It builds on Mamba, which is a selective state space model rather than an attention mechanism, and analyzes inconsistencies in the temporal and spatial dimensions that indicate synthetic video generation.
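
Mamba belongs to the family of state space models, which process a sequence by carrying a hidden state from one step to the next. The sketch below illustrates only that basic recurrence on per-frame feature vectors; it is not the authors' DeMamba module. The function name `ssm_scan` and its parameters are hypothetical, the parameters are fixed rather than input-dependent as in Mamba, and the toy inputs are made up for illustration.

```python
from typing import List, Sequence


def ssm_scan(
    frames: Sequence[Sequence[float]],
    decay: Sequence[float],
    b: Sequence[float],
    c: Sequence[float],
) -> List[float]:
    """Run a diagonal linear state space recurrence over frame features.

    Per channel i:  h_t[i] = decay[i] * h_{t-1}[i] + b[i] * x_t[i]
    Output:         y_t    = sum_i c[i] * h_t[i]

    Assumption: parameters are fixed. Mamba makes them depend on the input,
    which this sketch deliberately omits.
    """
    state = [0.0] * len(decay)
    outputs = []
    for frame in frames:
        # Update each channel of the hidden state from the current frame.
        state = [a * h + bi * x for a, h, bi, x in zip(decay, state, b, frame)]
        # Read out one scalar per time step.
        outputs.append(sum(ci * h for ci, h in zip(c, state)))
    return outputs


# Toy usage: two frames with two feature channels each (made-up numbers).
toy_outputs = ssm_scan(
    frames=[[1.0, 2.0], [0.0, 0.0]],
    decay=[0.5, 0.25],
    b=[1.0, 1.0],
    c=[1.0, 1.0],
)
```

Because the state at each step depends on all earlier frames, a readout like this can in principle react to changes over time, which is the kind of temporal signal a detector might use.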

DeMamba is evaluated on GenVideo, a dataset of over one million real and AI-generated videos that covers a broad spectrum of video categories and generation techniques. The authors propose two evaluation tasks: cross-generator video classification, which assesses how well a trained detector generalizes across generators, and degraded video classification, which assesses robustness to videos whose quality has degraded during dissemination. The authors report that DeMamba shows superior generalizability and robustness compared to existing detectors on this benchmark.
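
One way to picture a cross-generator evaluation, in which a detector is scored on videos from generators it never saw during training, is the minimal sketch below. It is an illustration under stated assumptions, not the GenVideo protocol: the `Sample` type, the helper names, and the generator names `gen_a` and `gen_b` are all hypothetical, and the "detector" is a stand-in function.

```python
from collections import defaultdict
from typing import Callable, Dict, List, NamedTuple, Set, Tuple


class Sample(NamedTuple):
    video_id: str
    source: str  # hypothetical generator name, or "real"
    label: int   # 1 = AI-generated, 0 = real


def split_by_generator(
    samples: List[Sample], held_out: Set[str]
) -> Tuple[List[Sample], List[Sample]]:
    """Keep every video from the held-out sources out of the training set."""
    train = [s for s in samples if s.source not in held_out]
    test = [s for s in samples if s.source in held_out]
    return train, test


def per_source_accuracy(
    test: List[Sample], predict: Callable[[Sample], int]
) -> Dict[str, float]:
    """Report accuracy separately for each held-out source."""
    hits: Dict[str, int] = defaultdict(int)
    totals: Dict[str, int] = defaultdict(int)
    for s in test:
        totals[s.source] += 1
        hits[s.source] += int(predict(s) == s.label)
    return {src: hits[src] / totals[src] for src in totals}


# Toy usage with made-up samples and a stand-in detector.
samples = [
    Sample("v1", "real", 0),
    Sample("v2", "gen_a", 1),
    Sample("v3", "gen_b", 1),
    Sample("v4", "gen_b", 1),
]
train, test = split_by_generator(samples, held_out={"gen_b"})
scores = per_source_accuracy(test, lambda s: 1 if s.video_id == "v3" else 0)
```

Reporting accuracy per held-out source, rather than one pooled number, makes it easier to see which unseen generators a detector fails on.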

Critical Analysis

The paper provides a thorough evaluation of DeMamba on a large-scale, diverse dataset, which is a strength. However, the authors acknowledge that their approach may still struggle with highly realistic deepfakes generated by the latest AI models. Further research is needed to address this evolving challenge.

Additionally, the paper does not explore potential biases or failure modes of the DeMamba system, such as its performance on underrepresented demographic groups or edge cases. Future work should investigate these aspects to ensure the fairness and robustness of deepfake detection systems.

Overall, the DeMamba approach represents an important step forward in the ongoing battle against AI-generated misinformation. However, detection methods will need to keep improving as video generation techniques continue to advance.

Conclusion

This paper introduces DeMamba, a novel AI-based system for detecting AI-generated "deepfake" videos on a large-scale benchmark dataset. The researchers' innovative deep learning approach significantly outperforms existing state-of-the-art methods, highlighting its potential as a powerful tool for combating the spread of misinformation and protecting online identities.

While the paper demonstrates the effectiveness of DeMamba, it also acknowledges the need for further research to address the evolving challenges posed by increasingly realistic deepfakes. Continued advancements in both deepfake generation and detection will be crucial to ensure the integrity of digital media in the years to come.

Related Papers

DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark

Haoxing Chen, Yan Hong, Zizheng Huang, Zhuoer Xu, Zhangxuan Gu, Yaohui Li, Jun Lan, Huijia Zhu, Jianfu Zhang, Weiqiang Wang, Huaxiong Li

Recently, video generation techniques have advanced rapidly. Given the popularity of video content on social media platforms, these models intensify concerns about the spread of fake information. Therefore, there is a growing demand for detectors capable of distinguishing between fake AI-generated videos and mitigating the potential harm caused by fake information. However, the lack of large-scale datasets from the most advanced video generators poses a barrier to the development of such detectors. To address this gap, we introduce the first AI-generated video detection dataset, GenVideo. It features the following characteristics: (1) a large volume of videos, including over one million AI-generated and real videos collected; (2) a rich diversity of generated content and methodologies, covering a broad spectrum of video categories and generation techniques. We conducted extensive studies of the dataset and proposed two evaluation methods tailored for real-world-like scenarios to assess the detectors' performance: the cross-generator video classification task assesses the generalizability of trained detectors on generators; the degraded video classification task evaluates the robustness of detectors to handle videos that have degraded in quality during dissemination. Moreover, we introduced a plug-and-play module, named Detail Mamba (DeMamba), designed to enhance the detectors by identifying AI-generated videos through the analysis of inconsistencies in temporal and spatial dimensions. Our extensive experiments demonstrate DeMamba's superior generalizability and robustness on GenVideo compared to existing detectors. We believe that the GenVideo dataset and the DeMamba module will significantly advance the field of AI-generated video detection. Our code and dataset will be available at https://github.com/chenhaoxing/DeMamba.

8/23/2024

Distinguish Any Fake Videos: Unleashing the Power of Large-scale Data and Motion Features

Lichuan Ji, Yingqi Lin, Zhenhua Huang, Yan Han, Xiaogang Xu, Jiafei Wu, Chong Wang, Zhe Liu

The development of AI-Generated Content (AIGC) has empowered the creation of remarkably realistic AI-generated videos, such as those involving Sora. However, the widespread adoption of these models raises concerns regarding potential misuse, including face video scams and copyright disputes. Addressing these concerns requires the development of robust tools capable of accurately determining video authenticity. The main challenges lie in the dataset and neural classifier for training. Current datasets lack a varied and comprehensive repository of real and generated content for effective discrimination. In this paper, we first introduce an extensive video dataset designed specifically for AI-Generated Video Detection (GenVidDet). It includes over 2.66 M instances of both real and generated videos, varying in categories, frames per second, resolutions, and lengths. The comprehensiveness of GenVidDet enables the training of a generalizable video detector. We also present the Dual-Branch 3D Transformer (DuB3D), an innovative and effective method for distinguishing between real and generated videos, enhanced by incorporating motion information alongside visual appearance. DuB3D utilizes a dual-branch architecture that adaptively leverages and fuses raw spatio-temporal data and optical flow. We systematically explore the critical factors affecting detection performance, achieving the optimal configuration for DuB3D. Trained on GenVidDet, DuB3D can distinguish between real and generated video content with 96.77% accuracy, and strong generalization capability even for unseen types.

5/27/2024

Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

The generative model has made significant advancements in the creation of realistic videos, which causes security issues. However, this emerging risk has not been adequately addressed due to the absence of a benchmark dataset for AI-generated videos. In this paper, we first construct a video dataset using advanced diffusion-based video generation algorithms with various semantic contents. Besides, typical video lossy operations over network transmission are adopted to generate degraded samples. Then, by analyzing local and global temporal defects of current AI-generated videos, a novel detection framework by adaptively learning local motion information and global appearance variation is constructed to expose fake videos. Finally, experiments are conducted to evaluate the generalization and robustness of different spatial and temporal domain detection methods, where the results can serve as the baseline and demonstrate the research challenge for future studies.

5/8/2024

A Large-scale Universal Evaluation Benchmark For Face Forgery Detection

Yijun Bei, Hengrui Lou, Jinsong Geng, Erteng Liu, Lechao Cheng, Jie Song, Mingli Song, Zunlei Feng

With the rapid development of AI-generated content (AIGC) technology, the production of realistic fake facial images and videos that deceive human visual perception has become possible. Consequently, various face forgery detection techniques have been proposed to identify such fake facial content. However, evaluating the effectiveness and generalizability of these detection techniques remains a significant challenge. To address this, we have constructed a large-scale evaluation benchmark called DeepFaceGen, aimed at quantitatively assessing the effectiveness of face forgery detection and facilitating the iterative development of forgery detection technology. DeepFaceGen consists of 776,990 real face image/video samples and 773,812 face forgery image/video samples, generated using 34 mainstream face generation techniques. During the construction process, we carefully consider important factors such as content diversity, fairness across ethnicities, and availability of comprehensive labels, in order to ensure the versatility and convenience of DeepFaceGen. Subsequently, DeepFaceGen is employed in this study to evaluate and analyze the performance of 13 mainstream face forgery detection techniques from various perspectives. Through extensive experimental analysis, we derive significant findings and propose potential directions for future research. The code and dataset for DeepFaceGen are available at https://github.com/HengruiLou/DeepFaceGen.

6/17/2024