PUBLICATION · IJIGSP · 2019

Three-dimensional Region Forgery Detection and Localization in Videos

X. H. Nguyen¹,², Y. Hu¹, M. A. Amin¹, K. G. Hayat¹, V. T. Le², D. T. Truong³
¹ Research Centre of Multimedia Information Security Detection and Intelligent Processing, School of Electronics and Information Engineering, South China University of Technology, Guangzhou 510640, P.R. China
² Faculty of Electronics and Informatics Engineering, Mien Trung Industrial and Trade College, Phu Yen 620000, Vietnam
³ Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh 700000, Vietnam
№ P1

ABSTRACT

What is the problem and what did we do?

With the extensive use of cameras and rapidly developing video editing software, millions of videos are uploaded daily and forgery has become easier than ever. Although forged-video detection has gained research attention, few studies address videos with duplicated three-dimensional (3-D) regions, a popular tampering method used to hide or duplicate objects and actions across consecutive frames.

We propose a new method for detecting and locating 3-D duplicated regions in videos based on phase-correlation of 3-D frame residuals. The algorithm uses frame-difference residuals to eliminate linear effects (e.g., brightness changes), then applies the Fourier transform and phase-correlation between 3-D regions to identify duplicated content. A two-threshold decision mechanism (Max_Threshold = 0.55, Min_Threshold = 0.3) with expanded-area verification handles ambiguous cases, followed by block-level correlation for precise localization.

Evaluations on two realistic datasets, VFDD-3D (high-resolution, 100 videos, 10 cameras) and REWIND-3D (low-resolution, 40 videos), show that the method is efficient and robust for detecting both small 3-D region duplication and full frame-sequence duplication, with localization accuracy exceeding 99% on both datasets.

№ P2

METHOD

Phase-correlation of 3-D residuals for detection and localization.

01

Residual Computation & 3-D Region Division

The video is converted to grayscale, and the frame residual R_t is computed as the pixel-wise difference between adjacent frames: R_t = v_{t+1} − v_t. Using residuals instead of raw frames eliminates linear effects such as brightness changes and enhances pixel-to-pixel differences for Fourier analysis. The residual video is then divided into overlapping 3-D regions of size H × W × T (e.g., 80 × 128 × 30 for small-region detection, or full frame × 30 for sequence duplication).
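The residual and region-division step can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names, the (T, H, W) array layout, and the stride values are assumptions, since the source states only that regions overlap, not by how much.

```python
import numpy as np


def frame_residuals(video):
    """Return R_t = v_{t+1} - v_t for a grayscale video of shape (T, H, W)."""
    video = np.asarray(video, dtype=np.float64)
    return video[1:] - video[:-1]


def iter_regions(residuals, size=(80, 128, 30), stride=(40, 64, 15)):
    """Yield ((y, x, t), region) for overlapping H x W x T regions.

    `size` is given as (H, W, T) to match the text; `stride` is a
    hypothetical choice (half the region size along each axis).
    Each yielded region is an array of shape (T, H, W).
    """
    h, w, t = size
    sy, sx, st = stride
    n_t, n_h, n_w = residuals.shape
    for t0 in range(0, n_t - t + 1, st):
        for y0 in range(0, n_h - h + 1, sy):
            for x0 in range(0, n_w - w + 1, sx):
                yield (y0, x0, t0), residuals[t0:t0 + t, y0:y0 + h, x0:x0 + w]
```

A constant brightness offset applied to every frame cancels in the difference, which is the simplest case of the effect the residual is meant to remove.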

02

Phase-Correlation & Dual-Threshold Detection

For each pair of 3-D regions B_n and B_m, the maximum phase-correlation K is computed as the peak of the inverse FFT of the normalized cross-power spectrum. A dual-threshold decision is then applied: if K > Max_Threshold (0.55), the regions are suspected duplicates; if K ≤ Min_Threshold (0.3), they are not. For ambiguous cases (0.3 < K ≤ 0.55), phase-correlation is recomputed on expanded areas (with a stride of 3 pixels, extended by half the region size), and the pair is confirmed as a duplicate if Max(K) > Max_Threshold or Max(K)/Average_All(K) > C_Ratio (20).
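The decision logic above can be sketched as follows. This is an illustrative reading rather than the paper's code: the function names and the small `eps` regularizer are assumptions, and Average_All(K) is interpreted here as the mean of the peak values collected over all expanded-area positions.

```python
import numpy as np

MAX_THRESHOLD = 0.55  # values taken from the text
MIN_THRESHOLD = 0.30
C_RATIO = 20.0


def phase_correlation_peak(a, b, eps=1e-12):
    """Peak K of the 3-D phase-correlation surface between two regions."""
    fa = np.fft.fftn(a)
    fb = np.fft.fftn(b)
    cross = fa * np.conj(fb)
    cross /= np.abs(cross) + eps  # keep phase only (normalized cross-power spectrum)
    surface = np.real(np.fft.ifftn(cross))
    return float(surface.max())


def classify_pair(k):
    """Dual-threshold decision on a single peak value K."""
    if k > MAX_THRESHOLD:
        return "duplicate"
    if k <= MIN_THRESHOLD:
        return "distinct"
    return "ambiguous"


def confirm_ambiguous(expanded_peaks):
    """Confirm an ambiguous pair from peaks recomputed on expanded areas.

    Assumes `expanded_peaks` holds one K per expanded-area position.
    """
    peaks = np.asarray(expanded_peaks, dtype=float)
    best = peaks.max()
    average = max(peaks.mean(), 1e-12)  # guard against division by zero
    return bool(best > MAX_THRESHOLD or best / average > C_RATIO)
```

Because only the phase of the spectrum is kept, a region and a shifted copy of it produce a sharp peak near 1, while unrelated regions produce a flat, low surface.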

03

Block-Level Localization

Each suspected duplicate pair is tiled into overlapping blocks of size 8 × 8 × 10. Correlation coefficients between corresponding block pairs are computed, and blocks with correlation > 0.9 are marked as duplicated. Four-connectivity is applied to remove isolated false positives, yielding precise localization of the forged 3-D region. For small-region duplication, this achieves 99.95% accuracy on VFDD-3D and 99.29% on REWIND-3D.
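A simplified sketch of the block-level step is given below. Several choices are assumptions made for brevity and are not confirmed by the source: the blocks are tiled without overlap, the arrays use a (T, H, W) layout, and the four-connectivity filter is read as dropping any flagged block that has no flagged neighbour above, below, left, or right within the same temporal slice. The function names are hypothetical.

```python
import numpy as np


def block_correlation_mask(region_a, region_b, block=(10, 8, 8), threshold=0.9):
    """Flag blocks whose correlation coefficient exceeds `threshold`.

    `block` is (T, H, W), i.e. 8 x 8 x 10 in the H x W x T order of the text.
    Non-overlapping tiling is used here; the source uses overlapping blocks.
    """
    bt, bh, bw = block
    nt, nh, nw = (s // b for s, b in zip(region_a.shape, block))
    mask = np.zeros((nt, nh, nw), dtype=bool)
    for i in range(nt):
        for j in range(nh):
            for k in range(nw):
                sl = (slice(i * bt, (i + 1) * bt),
                      slice(j * bh, (j + 1) * bh),
                      slice(k * bw, (k + 1) * bw))
                a = region_a[sl].ravel()
                b = region_b[sl].ravel()
                if a.std() == 0 or b.std() == 0:
                    continue  # correlation is undefined for constant blocks
                mask[i, j, k] = np.corrcoef(a, b)[0, 1] > threshold
    return mask


def drop_isolated(mask):
    """Keep only flagged blocks that have a flagged 4-neighbour in their slice."""
    padded = np.pad(mask, ((0, 0), (1, 1), (1, 1)), mode="constant")
    neighbours = (padded[:, :-2, 1:-1] | padded[:, 2:, 1:-1]
                  | padded[:, 1:-1, :-2] | padded[:, 1:-1, 2:])
    return mask & neighbours
```

Filtering on neighbours reflects the expectation that a genuinely duplicated region covers several adjacent blocks, whereas a chance high correlation tends to occur in a single block.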

№ P3

RESULTS

High accuracy across resolutions and forgery types.

Small 3-D Region Duplication (VFDD-3D)

On the high-resolution VFDD-3D dataset (404×720 to 1080×1920, 10 cameras, MP4 format), the optimal 3-D region size of 80 × 128 × 30 achieves the highest AUC of 0.928, outperforming Wang & Farid [6], Singh & Singh [8], and Bestagini et al. [2]. Region size selection is critical: a region that is too small increases false positives, while one that is too large increases false negatives.

Frame Sequence Duplication (VFDD-3D)

For full-frame sequence duplication, larger spatial regions perform better. Using a region of full frame size × 30 achieves an AUC of 0.975, surpassing methods based on correlation coefficients of gray values [5], spatial-temporal correlation [6], and DCT mean correlation [8]. The phase-correlation approach robustly handles the larger temporal extent of frame-sequence forgeries.

Low-Resolution Robustness (REWIND-3D)

On the low-resolution REWIND-3D dataset (320×240, 3 cameras), the optimal region size of 48 × 64 × 30 achieves an AUC of 0.995 for both small-region duplication and frame-sequence duplication. The method maintains high accuracy even at significantly lower video quality, demonstrating strong cross-resolution generalization.

Forgery Localization

Localization accuracy is 99.95% on VFDD-3D (small regions), 99.35% on VFDD-3D (frame sequences), 99.29% on REWIND-3D (small regions), and 99.56% on REWIND-3D (frame sequences). The block-level correlation with connectivity filtering precisely identifies duplicated voxels, enabling pixel-accurate forgery maps.

№ P4

BIBTEX

Cite this paper.

@article{nguyen2019three,
  author    = {Nguyen, Xuan Hau and Hu, Yongjian and Amin, Muhammad Ahmad and Hayat, Khan Gohar and Le, Van Thinh and Truong, Dinh Tu},
  title     = {Three-dimensional Region Forgery Detection and Localization in Videos},
  journal   = {International Journal of Image, Graphics and Signal Processing},
  year      = {2019},
  volume    = {11},
  number    = {12},
  pages     = {1--13},
  publisher = {MECS Press},
  doi       = {10.5815/ijigsp.2019.12.01}
}