Recommender system as an exploration coordinator: a bounded O(1) regret algorithm for large platforms

Kang, Hyunwook; Kumar, P. R.

(Submitted on 29 Jan 2023 (this version), latest version 29 Jun 2023 (v2))

Abstract: On typical modern platforms, users are only able to try a small fraction of the available items. This makes it difficult to model the exploration behavior of platform users as that of typical online learners who explore all the items. To address this issue, we propose to interpret a recommender system as a bandit exploration coordinator that provides counterfactual information updates. In particular, we introduce a novel algorithm called Counterfactual UCB (CFUCB), which guarantees user exploration coordination with bounded regret in the presence of linear representations. Our results show that sharing information is a Subgame Perfect Nash Equilibrium for agents in terms of regret, leading to each agent achieving bounded regret. This approach has potential applications in personalized recommender systems and adaptive experimentation.

Subjects:	Machine Learning (cs.LG); Computers and Society (cs.CY); Information Retrieval (cs.IR); General Economics (econ.GN)
Cite as:	arXiv:2301.12571 [cs.LG]
	(or arXiv:2301.12571v1 [cs.LG] for this version)

Submission history

From: Enoch Hyunwook Kang [view email]
[v1] Sun, 29 Jan 2023 22:39:50 GMT (63kb,D)
[v2] Thu, 29 Jun 2023 22:13:51 GMT (1647kb,D)
