A Bandit Learning Algorithm and Applications to Auction Design

Authors: Kim Thang Nguyen

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we introduce a new notion of (λ, µ)-concave functions and present a bandit learning algorithm whose performance guarantee is characterized as a function of the concavity parameters λ and µ. The algorithm is based on mirror descent, in which the update directions follow the gradient of the multilinear extensions of the reward functions. The regret bound induced by our algorithm is Õ(√T), which is nearly optimal. We apply our algorithm to auction design, specifically to welfare maximization, revenue maximization, and no-envy learning in auctions; the resulting guarantees are stated as Informal Theorems 3 and 4.
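The abstract describes update directions that follow the gradient of the multilinear extension of the reward function. A minimal Monte Carlo sketch of that extension and its gradient is below; the set function `f`, sample counts, and function names are illustrative assumptions, not taken from the paper:

```python
import numpy as np

def multilinear_extension(f, x, n_samples=20000, seed=0):
    """Monte Carlo estimate of F(x) = E[f(S)], where each element i
    is included in the random set S independently with probability x[i]."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    total = 0.0
    for _ in range(n_samples):
        S = frozenset(np.flatnonzero(rng.random(x.size) < x))
        total += f(S)
    return total / n_samples

def multilinear_gradient(f, x, n_samples=2000, seed=0):
    """Estimate dF/dx_i = E[f(S ∪ {i}) - f(S \\ {i})] by sampling S."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    grad = np.zeros(x.size)
    for _ in range(n_samples):
        S = set(np.flatnonzero(rng.random(x.size) < x))
        for i in range(x.size):
            grad[i] += f(frozenset(S | {i})) - f(frozenset(S - {i}))
    return grad / n_samples
```

As a sanity check, for the modular function f(S) = |S| the extension is F(x) = Σᵢ xᵢ and every partial derivative equals 1.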
Researcher Affiliation | Academia | Nguyễn Kim Thắng, IBISC, Univ. Evry, University Paris-Saclay, France
Pseudocode | Yes | Algorithm 1: Algorithm in the bandit setting.
1: Let Φ be a ν-self-concordant function over [0, 1]^{n(M+1)}.
2: Let z^1 ∈ int([0, 1]^{n(M+1)}) such that ∇Φ(z^1) = 0.
3: for t = 1 to T do
4:   Let A_t = (∇²Φ(z^t))^{-1/2}.
5:   Pick u_t ∈ S_n uniformly at random and set y^t = z^t + A_t u_t.
6:   Round y^t to a random point 1_{S_t} ∈ {0, 1}^{n(M+1)} such that element (i, j) appears in S_t with probability y^t_{ij}.
7:   Play x^t = m^{-1}(1_{S_t}) and receive the reward f^t(x^t).
8:   Let g^t = n(M+1) · f^t(x^t) · (A_t)^{-1} u_t and compute the solution z^{t+1} ∈ [0, 1]^{n(M+1)} by applying the mirror descent framework on F^t (Section 2.1). Specifically, z^{t+1} = argmax_{z ∈ [0,1]^{n(M+1)}} ⟨η g^t, z − z^t⟩ − D_Φ(z, z^t).
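Algorithm 1's core mechanism — play a sphere perturbation, observe one reward value, and treat the scaled reward-times-direction as a gradient estimate — can be sketched in simplified form. The sketch below replaces the paper's ν-self-concordant barrier and Bregman projection with a Euclidean step and clipping (a Flaxman-et-al.-style one-point estimator), so the reward function, step sizes, and domain handling are illustrative assumptions rather than the paper's construction:

```python
import numpy as np

def one_point_bandit_ascent(reward, dim, horizon, eta=0.002, delta=0.1, seed=0):
    """Bandit maximization over [0, 1]^dim from single-value feedback:
    perturb the iterate on a sphere of radius delta, observe one reward,
    and use g = (dim / delta) * reward * u as a gradient estimate of a
    smoothed version of the reward."""
    rng = np.random.default_rng(seed)
    z = np.full(dim, 0.5)              # start at the center of the cube
    avg_reward = 0.0
    for t in range(horizon):
        u = rng.normal(size=dim)
        u /= np.linalg.norm(u)         # uniform direction on the unit sphere
        y = np.clip(z + delta * u, 0.0, 1.0)   # exploratory play
        r = reward(y)                  # the single bandit observation
        avg_reward += (r - avg_reward) / (t + 1)
        g = (dim / delta) * r * u      # one-point gradient estimate
        # Euclidean step + clipping stands in for the paper's mirror
        # descent update under a self-concordant barrier.
        z = np.clip(z + eta * g, delta, 1.0 - delta)
    return z, avg_reward
```

For example, with the concave reward r(y) = 1 − mean((y − 0.7)²), the iterate tends to drift toward 0.7 in each coordinate as the horizon grows.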
Open Source Code | No | The paper does not provide any specific repository link or explicit statement about the availability of source code for the methodology described.
Open Datasets | No | This is a theoretical paper focusing on algorithm design and theoretical bounds; it does not involve empirical evaluation on datasets, hence no information on public dataset access is provided.
Dataset Splits | No | This is a theoretical paper focusing on algorithm design and theoretical bounds; it does not involve empirical evaluation on datasets, hence no information on dataset splits is provided.
Hardware Specification | No | This is a theoretical paper and does not describe the hardware used to run experiments.
Software Dependencies | No | This is a theoretical paper and does not list specific software dependencies with version numbers for experimental replication.
Experiment Setup | No | This is a theoretical paper and does not detail an experimental setup with specific hyperparameters or training configurations.