We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.CV

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Computer Vision and Pattern Recognition

Title: Shape-Guided Diffusion with Inside-Outside Attention

Abstract: We introduce precise object silhouette as a new form of user control in text-to-image diffusion models, which we dub Shape-Guided Diffusion. Our training-free method uses an Inside-Outside Attention mechanism during the inversion and generation process to apply a shape constraint to the cross- and self-attention maps. Our mechanism designates which spatial region is the object (inside) vs. background (outside) then associates edits to the correct region. We demonstrate the efficacy of our method on the shape-guided editing task, where the model must replace an object according to a text prompt and object mask. We curate a new ShapePrompts benchmark derived from MS-COCO and achieve SOTA results in shape faithfulness without a degradation in text alignment or image realism according to both automatic metrics and annotator ratings. Our data and code will be made available at this https URL
Comments: WACV 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as: arXiv:2212.00210 [cs.CV]
  (or arXiv:2212.00210v3 [cs.CV] for this version)

Submission history

From: Grace Luo [view email]
[v1] Thu, 1 Dec 2022 01:39:28 GMT (44457kb,D)
[v2] Wed, 22 Mar 2023 08:58:15 GMT (45378kb,D)
[v3] Mon, 1 Apr 2024 17:19:02 GMT (22505kb,D)

Link back to: arXiv, form interface, contact.