Unsupervised Semantic Parsing of Video Collections

Sener, Ozan; Zamir, Amir; Savarese, Silvio; Saxena, Ashutosh

(Submitted on 28 Jun 2015 (v1), revised 10 Aug 2015 (this version, v3), latest version 27 Jan 2016 (v4))

Abstract: Human communication typically has an underlying structure. This is reflected in the fact that in many user-generated videos, a starting point, an ending, and certain objective steps between these two can be identified. In this paper, we propose a method for parsing a video into such semantic steps in an unsupervised way. The proposed method is capable of providing a semantic "storyline" of the video composed of its objective steps. We accomplish this using both visual and language cues in a joint generative model. The proposed method can also provide a textual description for each of the identified semantic steps and video segments. We evaluate this method on a large number of complex YouTube videos and show results of unprecedented quality for this intricate and impactful problem.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1506.08438 [cs.CV]
	(or arXiv:1506.08438v3 [cs.CV] for this version)

Submission history

From: Ozan Sener
[v1] Sun, 28 Jun 2015 19:16:38 GMT (8986kb,D)
[v2] Tue, 7 Jul 2015 23:45:17 GMT (8986kb,D)
[v3] Mon, 10 Aug 2015 23:57:10 GMT (8986kb,D)
[v4] Wed, 27 Jan 2016 12:54:15 GMT (8685kb,D)
