Title: Multi-Scale Wavelet Transformer for Face Forgery Detection

Abstract: Many current face forgery detection methods aggregate spatial and frequency features to improve generalization, and they achieve promising performance in the cross-dataset scenario. However, these methods leverage only a single level of frequency information, which limits their expressive ability. To overcome this limitation, we propose a multi-scale wavelet transformer framework for face forgery detection. Specifically, to take full advantage of the multi-scale and multi-frequency wavelet representation, we gradually aggregate the multi-scale wavelet representation at different stages of the backbone network. To better fuse the frequency features with the spatial features, frequency-based spatial attention is designed to guide the spatial feature extractor to concentrate more on forgery traces. Meanwhile, cross-modality attention is proposed to fuse the frequency features with the spatial features. These two attention modules are computed through a unified transformer block for efficiency. A wide variety of experiments demonstrate that the proposed method is efficient and effective in both within-dataset and cross-dataset settings.
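
As an illustration of what a multi-scale wavelet representation looks like, the following is a minimal NumPy sketch of a repeated 2D Haar decomposition. It is not the authors' implementation: the abstract does not name a wavelet basis, so the choice of Haar, the function names `haar_dwt2` and `multiscale_haar`, and the default of three levels are all assumptions made for this example. Each level halves the resolution and yields one low-frequency sub-band plus three detail sub-bands, which is the kind of per-scale frequency feature that could be fed to successive backbone stages.

```python
import numpy as np

def haar_dwt2(x):
    """One level of an orthonormal 2D Haar transform (assumed basis).

    Expects a 2D array with even height and width.
    Returns the low-frequency sub-band and a tuple of three detail sub-bands.
    """
    s = np.sqrt(2.0)
    # Sums and differences of adjacent rows (axis 0).
    lo = (x[0::2, :] + x[1::2, :]) / s
    hi = (x[0::2, :] - x[1::2, :]) / s
    # Sums and differences of adjacent columns (axis 1).
    ll = (lo[:, 0::2] + lo[:, 1::2]) / s
    d1 = (lo[:, 0::2] - lo[:, 1::2]) / s
    d2 = (hi[:, 0::2] + hi[:, 1::2]) / s
    d3 = (hi[:, 0::2] - hi[:, 1::2]) / s
    return ll, (d1, d2, d3)

def multiscale_haar(x, levels=3):
    """Apply haar_dwt2 repeatedly to the low-frequency sub-band.

    Returns a list with one (low_band, details) pair per scale,
    ordered from finest to coarsest.
    """
    pyramid = []
    ll = np.asarray(x, dtype=np.float64)
    for _ in range(levels):
        ll, details = haar_dwt2(ll)
        pyramid.append((ll, details))
    return pyramid

if __name__ == "__main__":
    # Hypothetical 256x256 grayscale face crop, here filled with random values.
    image = np.random.default_rng(0).random((256, 256))
    for level, (ll, details) in enumerate(multiscale_haar(image), start=1):
        print(level, ll.shape, [d.shape for d in details])
```

In this sketch the input height and width must be divisible by 2 raised to the number of levels, so a 256x256 crop supports up to eight levels.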
Comments: The first two authors contributed equally to this work. Accepted to ACCV 2022 as an oral presentation.
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2210.03899 [cs.CV]
  (or arXiv:2210.03899v1 [cs.CV] for this version)

Submission history

From: Jingjing Wang
[v1] Sat, 8 Oct 2022 03:39:36 GMT (3228kb,D)