IntFormer: Predicting pedestrian intention with the aid of the Transformer architecture

Lorenzo, J.; Parra, I.; Sotelo, M. A.

Computer Science > Computer Vision and Pattern Recognition

(Submitted on 18 May 2021)

Abstract: Understanding pedestrian crossing behavior is an essential goal in intelligent vehicle development, since it improves both vehicle security and traffic flow. In this paper, we develop a method called IntFormer, based on the Transformer architecture and a novel convolutional video classification model called RubiksNet. Following the evaluation procedure of a recent benchmark, we show that our model reaches state-of-the-art results with good performance ($\approx 40$ sequences per second) and size ($8\times$ smaller than the best-performing model), making it suitable for real-time usage. We also explore each of the input features, finding that ego-vehicle speed is the most important variable, possibly due to the similarity of crossing cases in the PIE dataset.

Comments:	5 pages, 2 figures
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2105.08647 [cs.CV]
	(or arXiv:2105.08647v1 [cs.CV] for this version)

Submission history

From: Javier Lorenzo Díaz
[v1] Tue, 18 May 2021 16:23:15 GMT (1353kb,D)
