Title: A Self-Explainable Stylish Image Captioning Framework via Multi-References

Abstract: In this paper, we propose building a stylish image captioning model through a Multi-style Multi modality mechanism (2M). We demonstrate that 2M yields an effective stylish captioner, and that the multi-references produced by the model can also help explain the model by identifying erroneous input features on faulty examples. We show how the 2M mechanism can be used to build stylish captioning models and how these models can in turn provide explanations of likely errors.
Comments: arXiv admin note: substantial text overlap with arXiv:2103.11186. This paper is under consideration at Computer Vision and Image Understanding.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2110.10704 [cs.CL]
  (or arXiv:2110.10704v2 [cs.CL] for this version)

Submission history

From: Brent Harrison
[v1] Wed, 20 Oct 2021 18:00:40 GMT (10328kb,D)
[v2] Thu, 18 Nov 2021 18:39:15 GMT (10328kb,D)