Multimodal Fusion Using Deep Learning Applied to Driver's Referencing of Outside-Vehicle Objects

Aftab, Abdul Rafey; von der Beeck, Michael; Rohrhirsch, Steven; Diotte, Benoit; Feld, Michael

Computer Science > Human-Computer Interaction

(Submitted on 26 Jul 2021)

Abstract: There is growing interest in more intelligent, natural user interaction with the car. Hand gestures and speech are already used for driver-car interaction, and multimodal approaches are showing promise in the automotive industry. In this paper, we use deep learning to build a multimodal fusion network for referencing objects outside the vehicle. We use features from gaze, head pose and finger pointing simultaneously to precisely predict the referenced objects in different car poses. We demonstrate the practical limitations of each modality when used for a natural form of referencing, specifically inside the car. As our results show, adding other modalities overcomes these modality-specific limitations to a large extent. This work highlights the importance of multimodal sensing, especially when moving towards natural user interaction. Furthermore, our user-based analysis shows noteworthy differences in the recognition of user behavior depending on the vehicle pose.
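To make the idea of fusing gaze, head pose and finger pointing features concrete, the following is a minimal sketch of feature-level fusion: the three feature vectors are concatenated and passed through a small feed-forward network. It is not the architecture from the paper, which the abstract does not describe. The feature sizes, layer width, function names (`fuse`, `dense`, `init_layer`, `predict`) and the single scalar output are all assumptions chosen for illustration, and the weights are random and untrained, so the output carries no meaning.

```python
import math
import random

# Hypothetical per-modality feature sizes; the abstract does not specify them.
FEATURE_DIMS = {"gaze": 3, "head_pose": 3, "finger": 3}


def fuse(features):
    """Feature-level fusion: concatenate modality vectors in a fixed order."""
    fused = []
    for name, dim in FEATURE_DIMS.items():
        vec = features[name]
        if len(vec) != dim:
            raise ValueError(f"{name}: expected {dim} values, got {len(vec)}")
        fused.extend(vec)
    return fused


def dense(x, weights, bias, activation=None):
    """One fully connected layer: activation(W x + b)."""
    out = [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]
    return [activation(o) for o in out] if activation else out


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights and zero biases for an n_in -> n_out layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def predict(features, params):
    """Fuse the modalities, apply one hidden layer, and return a scalar output."""
    hidden = dense(fuse(features), *params["hidden"], activation=math.tanh)
    return dense(hidden, *params["out"])[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n_in = sum(FEATURE_DIMS.values())
params = {"hidden": init_layer(n_in, 8, rng), "out": init_layer(8, 1, rng)}

# Made-up input values standing in for real sensor features.
sample = {
    "gaze": [0.1, -0.2, 0.9],
    "head_pose": [0.0, 0.1, 1.0],
    "finger": [0.3, -0.1, 0.8],
}
prediction = predict(sample, params)
print(prediction)
```

Concatenation is only one way to combine modalities; it is used here because it is the simplest to show, not because the source confirms it.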

Subjects:	Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Cite as:	arXiv:2107.12167 [cs.HC]
	(or arXiv:2107.12167v1 [cs.HC] for this version)

Submission history

From: Abdul Rafey Aftab
[v1] Mon, 26 Jul 2021 12:37:06 GMT (1683kb,D)
