Learning to Bid in Contextual First Price Auctions

Badanidiyuru, Ashwinkumar; Feng, Zhe; Guruganesh, Guru

(Submitted on 7 Sep 2021 (v1), last revised 10 Nov 2021 (this version, v2))

Abstract: In this paper, we investigate the problem of how to bid in repeated contextual first price auctions. We consider a single bidder (learner) who repeatedly bids in first price auctions: at each time $t$, the learner observes a context $x_t\in \mathbb{R}^d$ and decides the bid based on historical information and $x_t$. We assume a structured linear model of the maximum bid of all the others, $m_t = \alpha_0\cdot x_t + z_t$, where $\alpha_0\in \mathbb{R}^d$ is unknown to the learner and $z_t$ is randomly sampled from a noise distribution $\mathcal{F}$ with log-concave density function $f$. We consider both \emph{binary feedback} (the learner can only observe whether she wins or not) and \emph{full information feedback} (the learner can observe $m_t$) at the end of each time $t$. For binary feedback, when the noise distribution $\mathcal{F}$ is known, we propose a bidding algorithm that uses maximum likelihood estimation (MLE) to achieve at most $\widetilde{O}(\sqrt{\log(d) T})$ regret. Moreover, we generalize this algorithm to the binary feedback setting in which the noise distribution is unknown but belongs to a parametrized family of distributions. For full information feedback with \emph{unknown} noise distribution, we provide an algorithm that achieves regret at most $\widetilde{O}(\sqrt{dT})$. Our approach combines an estimator for log-concave density functions with the MLE method to learn the noise distribution $\mathcal{F}$ and the linear weight $\alpha_0$ simultaneously. We also provide a lower bound showing that any bidding policy in a broad class must incur regret at least $\Omega(\sqrt{T})$, even when the learner receives full information feedback and $\mathcal{F}$ is known.
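
To make the binary feedback setting concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the known noise distribution $\mathcal{F}$ is the standard logistic distribution (one example of a log-concave density), that bids are drawn uniformly at random for exploration, and that the learner's utility is the usual first price form (value minus bid when she wins). The names `fit_alpha_binary_mle` and `best_bid`, and all numeric settings, are hypothetical choices made for illustration. Under the model $m_t = \alpha_0\cdot x_t + z_t$, the learner wins with probability $F(b_t - \alpha_0\cdot x_t)$, so the win/lose outcomes give a likelihood in $\alpha$ that can be maximized by gradient ascent.

```python
import numpy as np

def sigmoid(u):
    # CDF of the standard logistic distribution (assumed noise model).
    return 1.0 / (1.0 + np.exp(-u))

def fit_alpha_binary_mle(X, bids, wins, steps=500, lr=1.0):
    """Estimate alpha from win/lose feedback by gradient ascent on the
    mean log-likelihood, assuming z_t is standard logistic."""
    n, d = X.shape
    alpha = np.zeros(d)
    for _ in range(steps):
        p_win = sigmoid(bids - X @ alpha)   # P(m_t <= b_t) under current alpha
        grad = X.T @ (p_win - wins) / n     # gradient of the mean log-likelihood
        alpha += lr * grad
    return alpha

def best_bid(value, x, alpha, grid_size=2001):
    """Grid search for the bid maximizing (value - b) * P(win | b, x).
    The utility form is an assumption; the abstract does not spell it out."""
    grid = np.linspace(value - 10.0, value, grid_size)
    utility = (value - grid) * sigmoid(grid - x @ alpha)
    return grid[np.argmax(utility)]

# Synthetic data from the linear model m_t = alpha_0 . x_t + z_t.
rng = np.random.default_rng(0)
n, d = 20000, 3
alpha_true = np.array([0.5, -0.3, 0.8])      # hypothetical ground truth
X = rng.normal(size=(n, d))                  # contexts x_t
m = X @ alpha_true + rng.logistic(size=n)    # highest competing bid m_t
bids = rng.uniform(-3.0, 3.0, size=n)        # exploratory bids (assumption)
wins = (bids >= m).astype(float)             # binary feedback only

alpha_hat = fit_alpha_binary_mle(X, bids, wins)
print("estimated alpha:", alpha_hat)
print("bid for value 1.0:", best_bid(1.0, np.array([0.2, -0.1, 0.4]), alpha_hat))
```

With logistic noise the log-likelihood is concave in $\alpha$, which is why plain gradient ascent suffices here; other log-concave noise distributions would need their own CDF in place of `sigmoid`.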

Subjects:	Machine Learning (cs.LG); Computer Science and Game Theory (cs.GT)
Cite as:	arXiv:2109.03173 [cs.LG]
	(or arXiv:2109.03173v2 [cs.LG] for this version)

Submission history

From: Zhe Feng
[v1] Tue, 7 Sep 2021 16:11:18 GMT (32kb)
[v2] Wed, 10 Nov 2021 05:05:43 GMT (423kb)
