Sequential Dynamic Decision Making with Deep Neural Nets on a Test-Time Budget

Zhu, Henghui; Nan, Feng; Paschalidis, Ioannis; Saligrama, Venkatesh

(Submitted on 31 May 2017)

Abstract: Deep neural network (DNN) based approaches hold significant potential for reinforcement learning (RL) and have already shown remarkable gains over state-of-the-art methods in a number of applications. The effectiveness of DNN methods can be attributed to leveraging the abundance of supervised data to learn value functions, Q-functions, and policy function approximations without the need for feature engineering. Nevertheless, deploying DNN-based predictors with very deep architectures can be problematic in a number of applications because of computational and other resource constraints at test time. We propose a novel approach for reducing the average latency by learning a computationally efficient gating function that recognizes states in a sequential decision process for which the policy prescriptions of a shallow network suffice and the deeper layers of the DNN have little marginal utility. The overall system is adaptive in that it dynamically switches control actions based on state estimates in order to reduce average latency without sacrificing terminal performance. We experiment with a number of alternative loss functions to train gating functions and shallow policies, and show that in a number of applications a speed-up of up to almost 5X can be obtained with little loss in performance.
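
The switching idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name GatedPolicy, the toy gate and policies, and the per-call cost figures are all hypothetical, chosen only to show how a cheap gate might route each state to a shallow or a deep policy and how the resulting average-latency speed-up could be measured.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

State = Sequence[float]


@dataclass
class GatedPolicy:
    """Hypothetical wrapper: a cheap gate picks the shallow or deep policy per state."""
    gate: Callable[[State], bool]    # True means the shallow policy is judged sufficient
    shallow: Callable[[State], int]  # inexpensive policy
    deep: Callable[[State], int]     # expensive policy (stand-in for a deep network)
    gate_cost: float = 0.05          # assumed relative costs, not measured values
    shallow_cost: float = 0.15
    deep_cost: float = 1.0
    total_cost: float = 0.0
    steps: int = 0

    def act(self, state: State) -> int:
        """Choose an action and accumulate the cost of the path that was taken."""
        self.steps += 1
        self.total_cost += self.gate_cost
        if self.gate(state):
            self.total_cost += self.shallow_cost
            return self.shallow(state)
        self.total_cost += self.deep_cost
        return self.deep(state)

    def speedup(self) -> float:
        """Cost of always running the deep policy divided by the gated cost."""
        return self.deep_cost * self.steps / self.total_cost


def toy_action(state: State) -> int:
    # Toy decision rule used by both stand-in policies: 1 if positive, else 0.
    return 1 if state[0] > 0 else 0


# Toy gate: states far from the decision boundary are treated as "easy".
policy = GatedPolicy(
    gate=lambda s: abs(s[0]) > 0.5,
    shallow=toy_action,
    deep=toy_action,
)

states = [[0.9], [-0.8], [0.1], [0.7]]
actions = [policy.act(s) for s in states]
print(actions, round(policy.speedup(), 2))
```

In this toy run only the state near the boundary reaches the deep policy, so the gated cost falls below the cost of always running the deep policy. In the setting the abstract describes, the gate and the shallow policy would be learned rather than hand-written.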

Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1705.10924 [stat.ML]
	(or arXiv:1705.10924v1 [stat.ML] for this version)

Submission history

From: Feng Nan
[v1] Wed, 31 May 2017 02:45:55 GMT (1059kb,D)
