Computer Science > Computation and Language
Title: Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text
(Submitted on 6 Apr 2016 (v1), last revised 29 Nov 2016 (this version, v2))
Abstract: This paper investigates how linguistic knowledge mined from large text corpora can aid the generation of natural language descriptions of videos. Specifically, we integrate both a neural language model and distributional semantics trained on large text corpora into a recent LSTM-based architecture for video description. We evaluate our approach on a collection of YouTube videos as well as two large movie description datasets, showing significant improvements in grammaticality while modestly improving descriptive quality.
Submission history
From: Subhashini Venugopalan
[v1] Wed, 6 Apr 2016 19:01:28 GMT (708kb,D)
[v2] Tue, 29 Nov 2016 20:37:42 GMT (1584kb,D)