Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

Zhang, Zhaoxiang; Deng, Hanqiu; Bao, Jinan; Li, Xingyu

(Submitted on 8 May 2024)

Abstract: Image anomaly detection has long been a challenging task in computer vision. The advent of vision-language models, particularly CLIP-based frameworks, has opened new avenues for zero-shot anomaly detection. Recent studies have explored the use of CLIP by aligning images with normal and abnormal prompt descriptions. However, relying exclusively on textual guidance often falls short, which highlights the importance of additional visual references. In this work, we introduce a Dual-Image Enhanced CLIP approach that leverages a joint vision-language scoring system. Our method processes pairs of images, using each as a visual reference for the other, thereby enriching the inference process with visual context. This dual-image strategy markedly improves both anomaly classification and localization performance. Furthermore, we strengthen our model with a test-time adaptation module that incorporates synthesized anomalies to refine localization. Our approach exploits the potential of joint vision-language anomaly detection and achieves performance comparable to current SOTA methods across various datasets.
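
The dual-image idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the nearest-patch visual score, the two-prompt softmax text score, the `temperature` value, and the `weight` used to mix the two scores are all assumptions made for illustration, and the toy 2-D vectors stand in for real CLIP patch and text embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def visual_reference_scores(query_patches, reference_patches):
    """Score each query patch as 1 minus its best cosine match in the reference image."""
    return [1.0 - max(cosine(q, r) for r in reference_patches) for q in query_patches]

def text_scores(patches, normal_text, abnormal_text, temperature=0.07):
    """Softmax probability of the 'abnormal' prompt for each patch (assumed scoring rule)."""
    scores = []
    for p in patches:
        logit_normal = cosine(p, normal_text) / temperature
        logit_abnormal = cosine(p, abnormal_text) / temperature
        top = max(logit_normal, logit_abnormal)  # subtract the max for numerical stability
        e_normal = math.exp(logit_normal - top)
        e_abnormal = math.exp(logit_abnormal - top)
        scores.append(e_abnormal / (e_normal + e_abnormal))
    return scores

def dual_image_scores(patches_a, patches_b, normal_text, abnormal_text, weight=0.5):
    """Return joint per-patch anomaly scores for both images of a pair.

    Each image serves as the visual reference for the other; `weight` (hypothetical)
    balances the visual-reference score against the text-prompt score.
    """
    def joint(query, reference):
        vis = visual_reference_scores(query, reference)
        txt = text_scores(query, normal_text, abnormal_text)
        return [weight * v + (1.0 - weight) * t for v, t in zip(vis, txt)]
    return joint(patches_a, patches_b), joint(patches_b, patches_a)

# Toy embeddings (hypothetical): axis 0 ~ "normal" direction, axis 1 ~ "abnormal" direction.
normal_text = [1.0, 0.0]
abnormal_text = [0.0, 1.0]
patches_a = [[1.0, 0.0], [1.0, 0.1]]   # image A: all patches look normal
patches_b = [[1.0, 0.0], [0.0, 1.0]]   # image B: second patch is anomalous

map_a, map_b = dual_image_scores(patches_a, patches_b, normal_text, abnormal_text)
image_score_b = max(map_b)  # one simple way to get an image-level score from a patch map
```

In this toy pair, the second patch of image B has no close match in image A and also aligns with the abnormal prompt, so it receives the highest joint score. A real system would replace the toy vectors with patch and text features from a CLIP encoder.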

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2405.04782 [cs.CV]
	(or arXiv:2405.04782v1 [cs.CV] for this version)

Submission history

From: Zhaoxiang Zhang
[v1] Wed, 8 May 2024 03:13:20 GMT (45784kb,D)
