WITT: A Wireless Image Transmission Transformer for Semantic Communications

Yang, Ke; Wang, Sixian; Dai, Jincheng; Tan, Kailin; Niu, Kai; Zhang, Ping

(Submitted on 2 Nov 2022)

Abstract: In this paper, we redesign the vision Transformer (ViT) as a new backbone for semantic image transmission, termed the wireless image transmission transformer (WITT). Previous works build upon convolutional neural networks (CNNs), which are inefficient at capturing global dependencies, resulting in degraded end-to-end transmission performance, especially for high-resolution images. To tackle this, the proposed WITT employs Swin Transformers as a more capable backbone to extract long-range information. Unlike ViTs used for image classification, WITT is optimized for image transmission and accounts for the effect of the wireless channel. Specifically, we propose a spatial modulation module that scales the latent representations according to channel state information, which enhances the ability of a single model to deal with various channel conditions. Extensive experiments verify that WITT attains better performance across different image resolutions, distortion metrics, and channel conditions. The code is available at this https URL
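The abstract's idea of scaling latent representations according to channel state information can be illustrated with a small sketch. This is not the authors' module: it assumes the channel state is a single SNR value in dB, and it swaps in a hypothetical per-channel sigmoid gate (`weights`, `biases`) for whatever learned mapping WITT actually uses. The function names `channel_scales`, `modulate`, and `awgn` are illustrative only.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_scales(snr_db, weights, biases):
    """Map a scalar SNR (dB) to one scale in (0, 1) per latent channel.

    Assumption: a linear-plus-sigmoid gate stands in for a learned network.
    """
    return [sigmoid(w * snr_db + b) for w, b in zip(weights, biases)]


def modulate(latent, snr_db, weights, biases):
    """Scale every value of each latent channel by its SNR-dependent factor.

    `latent` is a list of channels, each a list of floats.
    """
    scales = channel_scales(snr_db, weights, biases)
    return [[s * v for v in channel] for s, channel in zip(scales, latent)]


def awgn(symbols, snr_db, rng):
    """Add real Gaussian noise, assuming unit average signal power."""
    sigma = math.sqrt(10 ** (-snr_db / 10))
    return [[v + rng.gauss(0.0, sigma) for v in ch] for ch in symbols]


if __name__ == "__main__":
    latent = [[1.0, 2.0], [3.0, 4.0]]
    weights, biases = [0.1, -0.1], [0.0, 0.0]  # hypothetical gate parameters
    for snr_db in (0.0, 10.0):
        sent = modulate(latent, snr_db, weights, biases)
        received = awgn(sent, snr_db, random.Random(0))
        print(snr_db, sent, received)
```

Because the scales depend on the SNR supplied at run time, the same parameters produce different emphasis per channel as conditions change, which is the sense in which a single model can serve several channel conditions.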

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
Cite as:	arXiv:2211.00937 [cs.CV]
	(or arXiv:2211.00937v1 [cs.CV] for this version)

Submission history

From: Jincheng Dai
[v1] Wed, 2 Nov 2022 07:50:27 GMT (7636kb,D)
