G-Refine: A General Quality Refiner for Text-to-Image Generation

Li, Chunyi; Wu, Haoning; Hao, Hongkun; Zhang, Zicheng; Kou, Tengchaun; Chen, Chaofeng; Bai, Lei; Liu, Xiaohong; Lin, Weisi; Zhai, Guangtao

(Submitted on 29 Apr 2024)

Abstract: With the evolution of Text-to-Image (T2I) models, the quality defects of AI-Generated Images (AIGIs) pose a significant barrier to their widespread adoption. In terms of both perception and alignment, existing models cannot always guarantee high-quality results. To mitigate this limitation, we introduce G-Refine, a general image quality refiner designed to enhance low-quality images without compromising the integrity of high-quality ones. The model is composed of three interconnected modules: a perception quality indicator, an alignment quality indicator, and a general quality enhancement module. Based on the mechanisms of the Human Visual System (HVS) and syntax trees, the first two indicators identify perception and alignment deficiencies, respectively, and the last module applies targeted quality enhancement accordingly. Extensive experiments show that, compared with alternative optimization methods, AIGIs refined by G-Refine score better on 10+ quality metrics across 4 databases. This improvement contributes significantly to the practical application of contemporary T2I models, paving the way for their broader adoption. The code will be released on this https URL
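
The three-module structure described in the abstract can be pictured as a gated pipeline: two indicators score an image, and enhancement runs only where a score falls short, so images that already score well pass through untouched. The following is a minimal sketch of that control flow only, not the authors' implementation. Every name in it (`RefinerSketch`, the toy indicator and enhancement functions, the `threshold` value, and the dictionary stand-in for an image) is a hypothetical placeholder, and the abstract does not say whether G-Refine uses a hard threshold at all.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy stand-in for an image (assumption): a sharpness value and a set of depicted objects.
Image = Dict[str, object]


@dataclass
class RefinerSketch:
    """Hypothetical gate: enhance only the aspects whose indicator score is low."""
    perception_indicator: Callable[[Image], float]
    alignment_indicator: Callable[[Image, str], float]
    enhance_perception: Callable[[Image], Image]
    enhance_alignment: Callable[[Image, str], Image]
    threshold: float = 0.6  # assumed cut-off, not taken from the paper

    def refine(self, image: Image, prompt: str) -> Tuple[Image, List[str]]:
        applied: List[str] = []
        out = image
        # Targeted enhancement: each step runs only if its indicator flags a deficiency.
        if self.perception_indicator(out) < self.threshold:
            out = self.enhance_perception(out)
            applied.append("perception")
        if self.alignment_indicator(out, prompt) < self.threshold:
            out = self.enhance_alignment(out, prompt)
            applied.append("alignment")
        return out, applied


# Toy indicators and enhancers, purely illustrative.
def toy_perception(image: Image) -> float:
    return float(image["sharpness"])


def toy_alignment(image: Image, prompt: str) -> float:
    words = prompt.split()
    if not words:
        return 1.0
    return sum(w in image["objects"] for w in words) / len(words)


def toy_enhance_perception(image: Image) -> Image:
    return {**image, "sharpness": 1.0}  # returns a copy, input is left unchanged


def toy_enhance_alignment(image: Image, prompt: str) -> Image:
    return {**image, "objects": set(image["objects"]) | set(prompt.split())}


demo_refiner = RefinerSketch(
    toy_perception, toy_alignment, toy_enhance_perception, toy_enhance_alignment
)

good = {"sharpness": 0.9, "objects": {"cat", "hat"}}
bad = {"sharpness": 0.2, "objects": {"cat"}}
print(demo_refiner.refine(good, "cat hat")[1])
print(demo_refiner.refine(bad, "cat hat")[1])
```

In this toy run the first image passes both checks and is returned as the same object, while the second triggers both enhancement steps. That mirrors the stated goal of improving low-quality images without altering high-quality ones.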

Subjects:	Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2404.18343 [cs.MM]
	(or arXiv:2404.18343v1 [cs.MM] for this version)

Submission history

From: Chunyi Li
[v1] Mon, 29 Apr 2024 00:54:38 GMT (38548kb,D)
