Vit-GAN: Image-to-image Translation with Vision Transformes and Conditional GANS

Gündüç, Yiğit

(Submitted on 11 Oct 2021)

Abstract: In this paper, we develop a general-purpose architecture, Vit-GAN, capable of performing most image-to-image translation tasks, from semantic image segmentation to single-image depth perception. This paper is a follow-up to, and an extension of, the generator-based model [1], whose results were very promising and opened the possibility of further improvement with an adversarial architecture. We use a unique vision-transformer-based generator architecture and conditional GANs (cGANs) with a Markovian discriminator (PatchGAN). In the present work, we use images as conditioning arguments. The results obtained are observed to be more realistic than those of commonly used architectures.
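
The abstract does not state the training objective, so the following is only a minimal sketch of how a conditional GAN with a PatchGAN-style discriminator is commonly scored, not the author's implementation. It assumes the discriminator sees the conditioning image together with either the real target or the generated output and returns a grid of per-patch probabilities, and that the generator uses the non-saturating adversarial loss. The function names (`patch_bce`, `discriminator_loss`, `generator_adversarial_loss`) are hypothetical.

```python
import math


def patch_bce(patch_probs, target):
    """Mean binary cross-entropy over a 2D grid of per-patch probabilities.

    patch_probs: list of rows, each a list of probabilities in (0, 1),
                 one per image patch judged by the discriminator.
    target:      1.0 if the patches should be judged real, 0.0 if fake.
    """
    eps = 1e-12  # clamp to avoid log(0)
    total, count = 0.0, 0
    for row in patch_probs:
        for p in row:
            p = min(max(p, eps), 1.0 - eps)
            total -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
            count += 1
    return total / count


def discriminator_loss(real_patch_probs, fake_patch_probs):
    """Assumed discriminator objective: real pairs -> 1, generated pairs -> 0."""
    return patch_bce(real_patch_probs, 1.0) + patch_bce(fake_patch_probs, 0.0)


def generator_adversarial_loss(fake_patch_probs):
    """Assumed non-saturating generator loss: push generated patches toward 1."""
    return patch_bce(fake_patch_probs, 1.0)


if __name__ == "__main__":
    # Hypothetical 2x2 patch grids for one (condition, target) pair
    # and one (condition, generated) pair.
    real = [[0.9, 0.8], [0.7, 0.95]]
    fake = [[0.2, 0.1], [0.3, 0.25]]
    print("discriminator loss:", discriminator_loss(real, fake))
    print("generator adversarial loss:", generator_adversarial_loss(fake))
```

Averaging over the patch grid is what makes the discriminator "Markovian": each output judges only a local region of the image pair rather than the whole image. When every patch probability is 0.5, each cross-entropy term equals ln 2, which is a convenient sanity check for the sketch.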

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2110.09305 [eess.IV]
	(or arXiv:2110.09305v1 [eess.IV] for this version)

Submission history

From: Yigit Gunduc
[v1] Mon, 11 Oct 2021 18:09:16 GMT (12688kb)
