
Title: Diversity Actor-Critic: Sample-Aware Entropy Regularization for Sample-Efficient Exploration

Abstract: This paper proposes sample-aware policy entropy regularization, an enhancement of conventional policy entropy regularization for better exploration. Using the sample distribution obtainable from the replay buffer, the proposed regularization maximizes the entropy of a weighted sum of the policy action distribution and the sample action distribution from the replay buffer, which yields sample-efficient exploration. A practical algorithm, diversity actor-critic (DAC), is developed by applying policy iteration to the objective function with this sample-aware entropy regularization. Numerical results show that DAC significantly outperforms recent reinforcement learning algorithms.
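
The regularizer described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a discrete action space, and the mixing weight `alpha`, the function names, and the example distributions are all hypothetical, chosen only to show how the entropy of a weighted sum of the policy action distribution and the replay-buffer action distribution behaves.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def sample_aware_entropy(policy_probs, buffer_probs, alpha=0.5):
    """Entropy of the weighted sum alpha * policy + (1 - alpha) * buffer.

    `alpha` is an assumed mixing weight, not a value taken from the abstract.
    """
    mixture = [
        alpha * p + (1.0 - alpha) * q
        for p, q in zip(policy_probs, buffer_probs)
    ]
    return entropy(mixture)


# Hypothetical action frequencies already stored in the replay buffer.
buffer_probs = [0.7, 0.2, 0.1, 0.0]

# A policy that repeats the buffer's actions vs. one that favors rare actions.
repeat_policy = [0.7, 0.2, 0.1, 0.0]
diverse_policy = [0.0, 0.1, 0.2, 0.7]

h_repeat = sample_aware_entropy(repeat_policy, buffer_probs)
h_diverse = sample_aware_entropy(diverse_policy, buffer_probs)
print(h_repeat < h_diverse)
```

In this toy setting the mixture entropy is larger for the policy that picks actions under-represented in the buffer, which is the intuition behind calling the regularization sample-aware. With `alpha=1.0` the quantity reduces to the ordinary policy entropy.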
Comments: Accepted to Proceedings of the 38th International Conference on Machine Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as: arXiv:2006.01419 [cs.LG]
  (or arXiv:2006.01419v2 [cs.LG] for this version)

Submission history

From: Youngchul Sung
[v1] Tue, 2 Jun 2020 06:51:25 GMT (4,314 KB)
[v2] Wed, 9 Jun 2021 03:05:50 GMT (11,733 KB)
