One Model to Synthesize Them All: Multi-contrast Multi-scale Transformer for Missing Data Imputation

Liu, Jiang; Pasumarthi, Srivathsa; Duffy, Ben; Gong, Enhao; Datta, Keshav; Zaharchuk, Greg

doi:10.1109/TMI.2023.3261707

(Submitted on 28 Apr 2022 (v1), last revised 29 Mar 2023 (this version, v3))

Abstract: Multi-contrast magnetic resonance imaging (MRI) is widely used in clinical practice because each contrast provides complementary information. However, the availability of each imaging contrast may vary among patients, which poses challenges to radiologists and automated image analysis algorithms. A general approach for tackling this problem is missing data imputation, which aims to synthesize the missing contrasts from existing ones. While several convolutional neural network (CNN)-based algorithms have been proposed, they suffer from the fundamental limitations of CNN models, such as the requirement for fixed numbers of input and output channels, the inability to capture long-range dependencies, and the lack of interpretability. In this work, we formulate missing data imputation as a sequence-to-sequence learning problem and propose a multi-contrast multi-scale Transformer (MMT), which can take any subset of input contrasts and synthesize those that are missing. MMT consists of a multi-scale Transformer encoder that builds hierarchical representations of inputs, combined with a multi-scale Transformer decoder that generates the outputs in a coarse-to-fine fashion. The proposed multi-contrast Swin Transformer blocks can efficiently capture intra- and inter-contrast dependencies for accurate image synthesis. Moreover, MMT is inherently interpretable, as it allows us to understand the importance of each input contrast in different regions by analyzing the built-in attention maps of Transformer blocks in the decoder. Extensive experiments on two large-scale multi-contrast MRI datasets demonstrate that MMT outperforms state-of-the-art methods quantitatively and qualitatively.
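
Illustrative sketch (not from the paper): the abstract's two central claims, that a sequence-to-sequence formulation accepts any subset of input contrasts and that attention maps indicate the importance of each input contrast, can be seen in a toy single-query attention step. This is not the authors' MMT implementation. The contrast names (T1, T2, FLAIR), the feature vectors, the query, and the function names are all hypothetical, and the sketch omits the learned projections, the multi-scale hierarchy, and the Swin windowing that the abstract describes.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, tokens):
    """Scaled dot-product attention of one query over a variable-length token set.

    tokens: dict mapping a contrast name to its feature vector (list of floats).
    Any non-empty subset of contrasts is accepted, because attention does not
    depend on a fixed number of inputs.
    For brevity, each token serves as both key and value (no learned projections).
    Returns (output vector, dict of per-contrast attention weights).
    """
    if not tokens:
        raise ValueError("at least one input contrast is required")
    names = list(tokens)
    d = len(query)
    scores = [
        sum(q * k for q, k in zip(query, tokens[n])) / math.sqrt(d)
        for n in names
    ]
    weights = softmax(scores)
    output = [
        sum(w * tokens[n][i] for w, n in zip(weights, names))
        for i in range(d)
    ]
    return output, dict(zip(names, weights))


# Hypothetical toy features for a single image patch (assumed values).
available = {"T1": [1.0, 0.0], "T2": [0.0, 1.0], "FLAIR": [1.0, 1.0]}
# Hypothetical query standing in for the contrast to be synthesized.
query_missing = [1.0, 0.0]

# The same function handles three inputs or two without any change.
out_all, w_all = attend(query_missing, available)
out_sub, w_sub = attend(query_missing, {k: available[k] for k in ("T1", "T2")})
```

The returned weight dictionaries (`w_all`, `w_sub`) are the toy analogue of an attention map: each value says how much one available contrast contributed to the output for this patch, and the weights always sum to one regardless of how many contrasts were supplied.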

Comments:	IEEE TMI accepted final version
Subjects:	Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
DOI:	10.1109/TMI.2023.3261707
Cite as:	arXiv:2204.13738 [eess.IV]
	(or arXiv:2204.13738v3 [eess.IV] for this version)

Submission history

From: Jiang Liu
[v1] Thu, 28 Apr 2022 18:49:27 GMT (4414kb,D)
[v2] Wed, 22 Mar 2023 18:30:11 GMT (7657kb,D)
[v3] Wed, 29 Mar 2023 19:05:39 GMT (7657kb,D)
