JRDB-Act: A Large-scale Dataset for Spatio-temporal Action, Social Group and Activity Detection

Ehsanpour, Mahsa; Saleh, Fatemeh; Savarese, Silvio; Reid, Ian; Rezatofighi, Hamid

(Submitted on 16 Jun 2021 (v1), last revised 24 Nov 2021 (this version, v2))

Abstract: The availability of large-scale video action understanding datasets has facilitated advances in the interpretation of visual scenes containing people. However, learning to recognise human actions and their social interactions from a stream of sensory data captured by a mobile robot platform remains a significant challenge. The environment is unconstrained and real-world, comprises numerous people, and can have highly unbalanced, long-tailed action label distributions; the challenge is compounded by the lack of a reflective large-scale dataset. In this paper, we introduce JRDB-Act, an extension of the existing JRDB, which is captured by a social mobile manipulator and reflects a real distribution of human daily-life actions in a university campus environment. JRDB-Act is densely annotated with atomic actions and comprises over 2.8M action labels, constituting a large-scale spatio-temporal action detection dataset. Each human bounding box is labeled with one pose-based action label and multiple (optional) interaction-based action labels. Moreover, JRDB-Act provides social group annotation, conducive to the task of grouping individuals based on their interactions in the scene to infer their social activities (common activities in each social group). Each annotated label in JRDB-Act is tagged with the annotators' confidence level, which contributes to the development of reliable evaluation strategies. To demonstrate how one can effectively utilise such annotations, we develop an end-to-end trainable pipeline to learn and infer these tasks, i.e. individual action and social group detection. The data and the evaluation code are publicly available at this https URL

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2106.08827 [cs.CV]
	(or arXiv:2106.08827v2 [cs.CV] for this version)

Submission history

From: Mahsa Ehsanpour [view email]
[v1] Wed, 16 Jun 2021 14:43:46 GMT (465kb,D)
[v2] Wed, 24 Nov 2021 04:40:27 GMT (5322kb,D)
