Title: From Coarse to Fine-grained Concept based Discrimination for Phrase Detection

Abstract: Phrase detection requires methods to identify whether a phrase is relevant to an image and, if so, to localize it. A key challenge in training more discriminative detection models is sampling negatives. Sampling techniques from prior work focus primarily on hard, often noisy, negatives, disregarding the broader distribution of negative samples. Our proposed CFCD-Net addresses this through two novel methods. First, we generate groups of semantically similar words that we call concepts (e.g., {dog, cat, horse} and {car, truck, SUV}), and then train CFCD-Net to discriminate between a region of interest and its unrelated concepts. Second, for phrases containing fine-grained, mutually exclusive words (e.g., colors), we force the model to select only one applicable phrase for each region using our novel fine-grained module (FGM). We evaluate our approach on Flickr30K Entities and RefCOCO+, where we improve mAP over the state of the art by 1.5-2 points. When considering only the phrases affected by our FGM, we improve by 3-4 points on both datasets.
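The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how concepts are built or how the FGM enforces a single choice. The concept table reuses the abstract's two example groups under made-up names. The helpers `unrelated_concepts` and `pick_exclusive` are hypothetical, and using a softmax over mutually exclusive candidates is an assumption made only for illustration.

```python
import math

# Hypothetical concept groups: the members come from the abstract's examples,
# the group names ("animal", "vehicle") are invented for this sketch.
CONCEPTS = {
    "animal": {"dog", "cat", "horse"},
    "vehicle": {"car", "truck", "SUV"},
}


def unrelated_concepts(phrase, concepts=CONCEPTS):
    """Return the names of concepts that share no word with the phrase.

    These would serve as coarse negatives for a region matched to the phrase.
    """
    words = set(phrase.split())
    return sorted(name for name, members in concepts.items()
                  if not (words & members))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_exclusive(region_scores):
    """Select exactly one phrase among mutually exclusive candidates.

    region_scores maps each candidate phrase to a raw score for one region.
    Assumption: a softmax makes the candidates compete, so only one can win.
    """
    phrases = list(region_scores)
    probs = softmax([region_scores[p] for p in phrases])
    best = max(range(len(phrases)), key=probs.__getitem__)
    return phrases[best], probs[best]


negatives = unrelated_concepts("a brown dog")
choice, confidence = pick_exclusive(
    {"red shirt": 2.0, "blue shirt": 0.5, "green shirt": -1.0}
)
```

Here `negatives` holds the concepts unrelated to "a brown dog", and `choice` is the single color phrase kept for the region. A real system would score regions with a learned model rather than fixed numbers.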
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2112.03237 [cs.CV]
  (or arXiv:2112.03237v5 [cs.CV] for this version)

Submission history

From: Maan Qraitem
[v1] Mon, 6 Dec 2021 18:46:20 GMT (36348kb,D)
[v2] Thu, 29 Sep 2022 05:10:55 GMT (7725kb,D)
[v3] Fri, 30 Sep 2022 17:20:07 GMT (7725kb,D)
[v4] Mon, 3 Oct 2022 14:30:12 GMT (7725kb,D)
[v5] Tue, 15 Nov 2022 04:27:09 GMT (7588kb,D)