Title: Self-Supervised Training with Autoencoders for Visual Anomaly Detection

Abstract: Recently, deep autoencoders have been used for anomaly detection in the visual domain. The common belief is that a network optimised for reconstruction error on anomaly-free examples should fail to accurately reconstruct anomalous regions in the application phase. This goal is typically addressed by controlling the capacity of the network, either by reducing the size of the bottleneck layer or by enforcing sparsity constraints on its activations. However, neither technique explicitly penalises the reconstruction of anomalous signals, which often results in poor detection. We tackle this problem by adapting a self-supervised learning regime that allows the use of discriminative information during training but focuses on the data manifold of normal examples. Specifically, we investigate two different training objectives inspired by the task of neural image inpainting. Our main objective regularises the model to produce locally consistent reconstructions while replacing irregularities, thereby acting as a filter that removes anomalous patterns. Our formal analysis shows that, under mild conditions, the corresponding model resembles a non-linear orthogonal projection of partially corrupted images onto the manifold of uncorrupted (defect-free) examples. This insight makes the reconstruction error a natural choice for the anomaly score of a sample: it measures the distance between the sample and its projection onto the data manifold. We emphasise that our approach is very efficient during both training and prediction, requiring a single forward pass for each input image. Our experiments on the MVTec AD dataset demonstrate high detection and localisation performance. On the texture subset, in particular, our approach consistently outperforms recent anomaly detection methods by a significant margin.
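
The idea of scoring a sample by its reconstruction error can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_reconstruct` is a hypothetical stand-in for a trained model (it fills the image with its median value), and taking the maximum of the pixel-wise error map as the image-level score is an assumption made here for illustration.

```python
import numpy as np


def anomaly_map(image, reconstruction):
    """Pixel-wise squared reconstruction error (averaged over channels, if any)."""
    diff = (image.astype(np.float64) - reconstruction.astype(np.float64)) ** 2
    return diff.mean(axis=-1) if diff.ndim == 3 else diff


def anomaly_score(image, reconstruction):
    """Image-level score: maximum of the pixel-wise map (an illustrative choice)."""
    return float(anomaly_map(image, reconstruction).max())


def toy_reconstruct(image):
    """Hypothetical placeholder for a trained autoencoder.

    It replaces every pixel with the image median, which mimics a model
    that outputs a defect-free version of a uniform texture.
    """
    return np.full_like(image, np.median(image))


# A uniform "defect-free" image and a copy with a small bright defect.
clean = np.full((8, 8), 0.5)
defective = clean.copy()
defective[2:4, 2:4] = 1.0

clean_score = anomaly_score(clean, toy_reconstruct(clean))
defect_score = anomaly_score(defective, toy_reconstruct(defective))
defect_map = anomaly_map(defective, toy_reconstruct(defective))
```

In this toy setting the error map is non-zero only inside the defect, so the same quantity serves both for detection (the image-level score) and for localisation (the pixel-wise map), each obtained from a single call to the reconstruction function.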
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as: arXiv:2206.11723 [cs.CV]
  (or arXiv:2206.11723v7 [cs.CV] for this version)

Submission history

From: Alexander Bauer
[v1] Thu, 23 Jun 2022 14:16:30 GMT (30281kb,D)
[v2] Tue, 28 Jun 2022 10:56:48 GMT (30125kb,D)
[v3] Wed, 14 Jun 2023 23:33:53 GMT (36906kb,D)
[v4] Thu, 24 Aug 2023 11:35:01 GMT (38546kb,D)
[v5] Fri, 22 Sep 2023 14:57:23 GMT (38702kb,D)
[v6] Mon, 9 Oct 2023 15:18:32 GMT (38702kb,D)
[v7] Wed, 24 Jan 2024 20:36:29 GMT (38693kb,D)
