Title: Borrowing Human Senses: Comment-Aware Self-Training for Social Media Multimodal Classification

Authors: Chunpu Xu, Jing Li
Abstract: Social media generates massive amounts of multimedia content with paired images and text every day, creating a pressing need to automate vision-and-language understanding for various multimodal classification tasks. Compared to commonly studied vision-language data, social media posts tend to exhibit more implicit image-text relations. To better bridge the cross-modal semantics, we capture hinting features from user comments, which are retrieved by jointly leveraging visual and lingual similarity. The classification tasks are then explored via self-training in a teacher-student framework, motivated by the usually limited scale of labeled data in existing benchmarks. Extensive experiments are conducted on four multimodal social media benchmarks for image-text relation classification, sarcasm detection, sentiment classification, and hate speech detection. The results show that our method further advances the performance of previous state-of-the-art models, which do not employ comment modeling or self-training.
Comments: accepted to EMNLP 2022
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Multimedia (cs.MM)
Cite as: arXiv:2303.15016 [cs.CL]
  (or arXiv:2303.15016v1 [cs.CL] for this version)
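
The abstract mentions retrieving user comments by jointly leveraging visual and lingual similarity. The sketch below illustrates one plausible reading of that idea and is not the authors' implementation: the weighted sum of two cosine similarities, the `alpha` weight, the `top_k` cutoff, the function names, and the dictionary layout of the pool are all assumptions made for illustration. The embeddings are assumed to be precomputed by some image and text encoders.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_comments(query_img, query_txt, pool, alpha=0.5, top_k=1):
    """Return comments of the top_k pool posts most similar to the query post.

    Hypothetical scoring rule (an assumption, not taken from the paper):
        score = alpha * cos(image, image) + (1 - alpha) * cos(text, text)
    Each pool item is a dict with keys "img", "txt", and "comments".
    """
    scored = []
    for item in pool:
        score = (alpha * cosine(query_img, item["img"])
                 + (1 - alpha) * cosine(query_txt, item["txt"]))
        scored.append((score, item["comments"]))
    # Highest joint similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    retrieved = []
    for _, comments in scored[:top_k]:
        retrieved.extend(comments)
    return retrieved
```

Setting `alpha` closer to 1 makes retrieval lean on image similarity, and closer to 0 on text similarity; how the paper actually balances the two modalities is not stated in the abstract.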

Submission history

From: Chunpu Xu
[v1] Mon, 27 Mar 2023 08:59:55 GMT (2356kb,D)