
Title: SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

Abstract: Off-policy deep reinforcement learning (RL) has been successful in a range of challenging domains. However, standard off-policy RL algorithms can suffer from several issues, such as instability in Q-learning and balancing exploration and exploitation. To mitigate these issues, we present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy RL algorithms. SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration. By enforcing the diversity between agents using Bootstrap with random initialization, we show that these different ideas are largely orthogonal and can be fruitfully integrated, together further improving the performance of existing off-policy RL algorithms, such as Soft Actor-Critic and Rainbow DQN, for both continuous and discrete control tasks on both low-dimensional and high-dimensional environments. Our training code is available at this https URL
Comments: ICML 2021 camera ready
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:2007.04938 [cs.LG]
  (or arXiv:2007.04938v4 [cs.LG] for this version)
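
The two ingredients named in the abstract can be illustrated with a small sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the upper-confidence score is taken to be the ensemble mean plus a scaled standard deviation, the backup weight is taken to be a sigmoid-shaped function that shrinks as the ensemble's target Q-values disagree, and the names `ucb_action`, `backup_weight`, `lam`, and `temperature` are hypothetical rather than taken from the authors' code.

```python
import math
import statistics


def ucb_action(q_ensemble, lam=1.0):
    """Return the index of the action with the highest mean + lam * std.

    q_ensemble[i][a] is ensemble member i's Q-value estimate for action a.
    The mean-plus-scaled-std score is an assumed form of the bound.
    """
    n_actions = len(q_ensemble[0])
    scores = []
    for a in range(n_actions):
        values = [member[a] for member in q_ensemble]
        scores.append(statistics.fmean(values) + lam * statistics.pstdev(values))
    return max(range(n_actions), key=scores.__getitem__)


def backup_weight(target_q_values, temperature=10.0):
    """Return a weight between 0.5 and 1.0 for one Bellman backup.

    The weight is 1.0 when all ensemble targets agree and falls toward 0.5
    as their standard deviation grows (assumed form: sigmoid(-std * T) + 0.5).
    """
    std = statistics.pstdev(target_q_values)
    z = min(std * temperature, 700.0)  # cap the exponent to avoid overflow
    return 1.0 / (1.0 + math.exp(z)) + 0.5


# Three ensemble members, two actions: the members disagree only on action 1.
q = [[1.0, 0.0], [1.0, 2.0], [1.0, 0.0]]
greedy = ucb_action(q, lam=0.0)      # ignores uncertainty
optimistic = ucb_action(q, lam=1.0)  # rewards uncertainty
```

In this toy input, setting `lam=0.0` reduces the rule to picking the best mean estimate, while a positive `lam` favors the action the ensemble is unsure about, which is the exploration behavior the abstract describes. The weight would multiply each sample's Bellman error so that high-disagreement targets contribute less to the update.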

Submission history

From: Kimin Lee
[v1] Thu, 9 Jul 2020 17:08:44 GMT (1160kb,D)
[v2] Tue, 21 Jul 2020 20:10:34 GMT (1193kb,D)
[v3] Wed, 9 Jun 2021 22:27:09 GMT (2893kb,D)
[v4] Fri, 11 Jun 2021 21:00:13 GMT (2893kb,D)
