We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.LG

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Machine Learning

Title: Multimodal Fusion on Low-quality Data: A Comprehensive Survey

Abstract: Multimodal fusion focuses on integrating information from multiple modalities with the goal of more accurate prediction, which has achieved remarkable progress in a wide range of scenarios, including autonomous driving and medical diagnosis. However, the reliability of multimodal fusion remains largely unexplored especially under low-quality data settings. This paper surveys the common challenges and recent advances of multimodal fusion in the wild and presents them in a comprehensive taxonomy. From a data-centric view, we identify four main challenges that are faced by multimodal fusion on low-quality data, namely (1) noisy multimodal data that are contaminated with heterogeneous noises, (2) incomplete multimodal data that some modalities are missing, (3) imbalanced multimodal data that the qualities or properties of different modalities are significantly different and (4) quality-varying multimodal data that the quality of each modality dynamically changes with respect to different samples. This new taxonomy will enable researchers to understand the state of the field and identify several potential directions. We also provide discussion for the open problems in this field together with interesting future research directions.
Comments: Feel free to comment on our manuscript: qingyangzhang@tju.edu.cn
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as: arXiv:2404.18947 [cs.LG]
  (or arXiv:2404.18947v2 [cs.LG] for this version)

Submission history

From: Qingyang Zhang [view email]
[v1] Sat, 27 Apr 2024 07:22:28 GMT (3023kb,D)
[v2] Sun, 5 May 2024 08:29:35 GMT (3023kb,D)

Link back to: arXiv, form interface, contact.