Transformer-based Image Compression

Lu, Ming; Guo, Peiyao; Shi, Huiqing; Cao, Chuntong; Ma, Zhan

(Submitted on 12 Nov 2021)

Abstract: A Transformer-based Image Compression (TIC) approach is developed that reuses the canonical variational autoencoder (VAE) architecture with paired main and hyper encoder-decoders. Both the main and hyper encoders consist of a sequence of neural transformation units (NTUs) that analyse and aggregate important information for a more compact representation of the input image, while the decoders mirror the encoder-side operations to reconstruct the pixel-domain image from the compressed bitstream. Each NTU consists of a Swin Transformer Block (STB) and a convolutional layer (Conv) to best embed both long-range and short-range information. Meanwhile, a causal attention module (CAM) is devised for adaptive context modeling of latent features, utilizing both hyper and autoregressive priors. TIC rivals state-of-the-art approaches, including learnt image coding (LIC) methods based on deep convolutional neural networks (CNNs) and the handcrafted, rules-based intra profile of the recently approved Versatile Video Coding (VVC) standard, while requiring far fewer model parameters, e.g., up to 45% fewer than the leading-performance LIC.
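
The autoregressive side of the context modeling mentioned in the abstract can be illustrated with a toy attention routine in which each latent position only sees positions decoded before it. This is a minimal sketch and not the authors' CAM: the function name `causal_attention`, the scalar features, the raster-scan ordering, and the zero fallback for the first position are all assumptions made for illustration.

```python
import math


def causal_attention(queries, keys, values):
    """Toy causal attention over a flattened window of scalar latents.

    Hypothetical illustration: position i attends only to positions
    j < i (assumed raster-scan decoding order), so its output never
    depends on latents that have not been decoded yet.
    """
    n = len(queries)
    out = []
    for i in range(n):
        if i == 0:
            # No decoded context exists yet; a real codec would rely on
            # the hyperprior here. Zero is a placeholder assumption.
            out.append(0.0)
            continue
        # Dot-product scores against previously decoded positions only.
        scores = [queries[i] * keys[j] for j in range(i)]
        peak = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        out.append(sum(e / total * values[j] for j, e in enumerate(exps)))
    return out


if __name__ == "__main__":
    q = [0.1, 0.2, 0.3, 0.4]
    k = [0.5, -0.1, 0.2, 0.3]
    v = [1.0, 2.0, 3.0, 4.0]
    print(causal_attention(q, k, v))
```

Changing a later entry of `values` leaves every earlier output untouched, which is the property an autoregressive prior needs so that the decoder can reproduce the same context the encoder used.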

Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2111.06707 [eess.IV]
	(or arXiv:2111.06707v1 [eess.IV] for this version)

Submission history

From: Ming Lu
[v1] Fri, 12 Nov 2021 13:13:20 GMT (4571kb,D)
