Title: Controllable Video Captioning with an Exemplar Sentence

Abstract: In this paper, we investigate a novel and challenging task: controllable video captioning with an exemplar sentence. Formally, given a video and a syntactically valid exemplar sentence, the task is to generate a caption that not only describes the semantic content of the video but also follows the syntactic form of the exemplar sentence. To tackle this exemplar-based video captioning task, we propose a novel Syntax Modulated Caption Generator (SMCG) incorporated into an encoder-decoder-reconstructor architecture. The proposed SMCG takes the video's semantic representation as input and conditionally modulates the gates and cells of a long short-term memory (LSTM) network with respect to the encoded syntactic information of the exemplar sentence. SMCG is therefore able to control the states used for word prediction and achieve syntax-customized caption generation. We conduct experiments by collecting auxiliary exemplar sentences for two public video captioning datasets. Extensive experimental results demonstrate the effectiveness of our approach in generating syntax-controllable and semantics-preserving video captions. Given different exemplar sentences, our approach can produce different captions with various syntactic structures, indicating a promising way to increase the diversity of video captioning.
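
The abstract says that SMCG modulates the gates and cells of an LSTM according to the encoded syntax of the exemplar sentence, but it does not give the formulation. The sketch below shows one plausible reading, under stated assumptions: a syntax vector produces a per-unit scale and shift that are applied to each gate pre-activation and to the cell state. The class name `SyntaxModulatedLSTMCell`, the scale-and-shift form, and all dimensions are hypothetical illustrations and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, bias, vec):
    """Return weights @ vec + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def init_linear(rng, out_dim, in_dim, scale=0.1):
    """Small random weights and a zero bias (illustrative initialisation)."""
    weights = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


class SyntaxModulatedLSTMCell:
    """Hypothetical LSTM cell whose gates and cell are modulated by syntax."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, input_dim, hidden_dim, syntax_dim, seed=0):
        rng = random.Random(seed)
        # Standard LSTM projections over the concatenation [x; h].
        self.gate_params = {
            name: init_linear(rng, hidden_dim, input_dim + hidden_dim)
            for name in self.GATES
        }
        # Assumed modulation: scale and shift per gate, plus one pair ("c")
        # for the cell state, each predicted from the syntax vector.
        targets = self.GATES + ("c",)
        self.scale_params = {
            name: init_linear(rng, hidden_dim, syntax_dim) for name in targets
        }
        self.shift_params = {
            name: init_linear(rng, hidden_dim, syntax_dim) for name in targets
        }

    def modulate(self, name, values, syntax):
        """Apply (1 + scale) * value + shift, with scale/shift from syntax."""
        scale = linear(*self.scale_params[name], syntax)
        shift = linear(*self.shift_params[name], syntax)
        return [(1.0 + s) * v + b for s, v, b in zip(scale, values, shift)]

    def step(self, x, h, c, syntax):
        """One decoding step: returns the new hidden state and cell state."""
        xh = list(x) + list(h)
        pre = {
            name: self.modulate(name, linear(*self.gate_params[name], xh), syntax)
            for name in self.GATES
        }
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        c_mod = self.modulate("c", c_new, syntax)  # syntax-modulated cell
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_mod)]
        return h_new, c_mod


# Usage: the same video feature decoded under two different syntax encodings.
cell = SyntaxModulatedLSTMCell(input_dim=4, hidden_dim=3, syntax_dim=2)
video_feature = [0.5, -0.2, 0.1, 0.9]
h0, c0 = [0.0] * 3, [0.0] * 3
h_a, c_a = cell.step(video_feature, h0, c0, syntax=[1.0, 0.0])
h_b, c_b = cell.step(video_feature, h0, c0, syntax=[0.0, 1.0])
```

In this sketch a zero syntax vector leaves every gate and the cell untouched, so the cell falls back to a plain LSTM step; a nonzero syntax vector shifts the states that a downstream word predictor would read. Whether the authors use this particular scale-and-shift form cannot be determined from the abstract alone.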
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
Journal reference: Proceedings of the 28th ACM International Conference on Multimedia, 2020, pp. 1085-1093
DOI: 10.1145/3394171.3413908
Cite as: arXiv:2112.01073 [cs.CV]
  (or arXiv:2112.01073v1 [cs.CV] for this version)

Submission history

From: Yitian Yuan
[v1] Thu, 2 Dec 2021 09:24:45 GMT (2752kb,D)
