Shape-Guided Diffusion with Inside-Outside Attention

Park, Dong Huk; Luo, Grace; Toste, Clayton; Azadi, Samaneh; Liu, Xihui; Karalashvili, Maka; Rohrbach, Anna; Darrell, Trevor

(Submitted on 1 Dec 2022 (v1), last revised 1 Apr 2024 (this version, v3))

Abstract: We introduce precise object silhouette as a new form of user control in text-to-image diffusion models, which we dub Shape-Guided Diffusion. Our training-free method uses an Inside-Outside Attention mechanism during the inversion and generation process to apply a shape constraint to the cross- and self-attention maps. Our mechanism designates which spatial region is the object (inside) vs. background (outside) then associates edits to the correct region. We demonstrate the efficacy of our method on the shape-guided editing task, where the model must replace an object according to a text prompt and object mask. We curate a new ShapePrompts benchmark derived from MS-COCO and achieve SOTA results in shape faithfulness without a degradation in text alignment or image realism according to both automatic metrics and annotator ratings. Our data and code will be made available at this https URL
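The shape constraint on attention maps described in the abstract can be illustrated with a small NumPy sketch. This is only one plausible reading of the abstract, not the authors' implementation: the function names, the flattened array shapes, the boolean `inside_tokens` vector, and the choice to zero out disallowed entries without renormalizing are all assumptions made for illustration.

```python
import numpy as np


def constrain_cross_attention(attn, mask, inside_tokens):
    """Hypothetical helper: restrict cross-attention to matching regions.

    attn:          (num_pixels, num_tokens) attention weights.
    mask:          (num_pixels,) bool, True for pixels inside the object.
    inside_tokens: (num_tokens,) bool, True for tokens describing the object.

    Assumption: weights linking a pixel and a token on opposite sides of
    the shape boundary are set to zero; no renormalization is applied.
    """
    mask = np.asarray(mask, dtype=bool)
    inside_tokens = np.asarray(inside_tokens, dtype=bool)
    # allowed[p, t] is True when pixel p and token t are both "inside"
    # or both "outside".
    allowed = mask[:, None] == inside_tokens[None, :]
    return np.asarray(attn) * allowed


def constrain_self_attention(attn, mask):
    """Hypothetical helper: restrict self-attention to same-region pixels.

    attn: (num_pixels, num_pixels) attention weights.
    mask: (num_pixels,) bool, True for pixels inside the object.
    """
    mask = np.asarray(mask, dtype=bool)
    # allowed[p, q] is True when pixels p and q lie on the same side.
    allowed = mask[:, None] == mask[None, :]
    return np.asarray(attn) * allowed


# Toy example: 4 flattened pixels (first two inside the object) and
# 3 prompt tokens (only the first describes the object).
mask = np.array([True, True, False, False])
inside_tokens = np.array([True, False, False])
cross = constrain_cross_attention(np.full((4, 3), 1.0 / 3.0), mask, inside_tokens)
self_attn = constrain_self_attention(np.full((4, 4), 0.25), mask)
```

In this toy example, the object token keeps weight only on the two inside pixels, and each inside pixel attends only to the other inside pixels. A real diffusion model would apply such a constraint per attention layer and per head, which this sketch does not attempt to model.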

Comments:	WACV 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2212.00210 [cs.CV]
	(or arXiv:2212.00210v3 [cs.CV] for this version)

Submission history

From: Grace Luo
[v1] Thu, 1 Dec 2022 01:39:28 GMT (44457kb,D)
[v2] Wed, 22 Mar 2023 08:58:15 GMT (45378kb,D)
[v3] Mon, 1 Apr 2024 17:19:02 GMT (22505kb,D)
