Computer Science > Machine Learning
Title: Robust Multi-Agent Multi-Armed Bandits
(Submitted on 7 Jul 2020 (v1), last revised 10 Oct 2021 (this version, v3))
Abstract: Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret. However, these works assume that each agent always recommends their individual best-arm estimates to other agents, which is unrealistic in envisioned applications (machine faults in distributed computing or spam in social recommendation systems). Hence, we generalize the setting to include $n$ honest and $m$ malicious agents who recommend best-arm estimates and arbitrary arms, respectively. We first show that even with a single malicious agent, existing collaboration-based algorithms fail to improve regret guarantees over a single-agent baseline. We propose a scheme where honest agents learn who is malicious and dynamically reduce communication with (i.e., "block") them. We show that collaboration indeed decreases regret for this algorithm, assuming $m$ is small compared to $K$ but without assumptions on malicious agents' behavior, thus ensuring that our algorithm is robust against any malicious recommendation strategy.
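The blocking idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the abstract does not give the actual blocking rule, so the rule below (block a peer for a fixed number of phases when its recommended arm looks clearly worse than the agent's own best empirical arm) is an assumption made purely for illustration. The names `HonestAgent`, `review`, `tolerance`, and `block_len` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HonestAgent:
    """Toy honest agent that tracks which peers it currently blocks."""

    num_arms: int
    # blocked_until[peer] = first phase at which the peer is trusted again
    blocked_until: dict = field(default_factory=dict)

    def is_blocked(self, peer: int, phase: int) -> bool:
        """Return True if recommendations from `peer` are ignored in `phase`."""
        return phase < self.blocked_until.get(peer, 0)

    def review(self, peer, recommended_arm, empirical_means, phase,
               tolerance=0.1, block_len=2):
        """Hypothetical rule (an assumption, not taken from the paper):
        block `peer` for `block_len` phases after the current one if the arm
        it recommended has an empirical mean more than `tolerance` below the
        agent's best empirical mean."""
        best = max(empirical_means)
        if best - empirical_means[recommended_arm] > tolerance:
            self.blocked_until[peer] = phase + 1 + block_len
            return True
        return False


# Example: peer 7 recommends a clearly poor arm, peer 5 a near-best arm.
agent = HonestAgent(num_arms=3)
means = [0.2, 0.8, 0.75]
blocked_7 = agent.review(peer=7, recommended_arm=0, empirical_means=means, phase=1)
blocked_5 = agent.review(peer=5, recommended_arm=2, empirical_means=means, phase=1)
```

In this sketch a blocked peer is trusted again once the block expires, so an honest agent that was blocked by mistake is not excluded permanently. Whether and how the paper's scheme lengthens or shortens blocks is not stated in the abstract.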
Submission history
From: Daniel Vial
[v1] Tue, 7 Jul 2020 22:27:30 GMT (911kb,D)
[v2] Fri, 18 Dec 2020 22:31:09 GMT (1508kb,D)
[v3] Sun, 10 Oct 2021 14:25:23 GMT (1509kb,D)