Optimizing the coalition gain in Online Auctions with Greedy Structured Bandits

Authors: Dorian Baudry, Hugo Richard, Maria Cherifa, Vianney Perchet, Clément Calauzènes

NeurIPS 2024

Reproducibility assessment. Each entry lists the reproducibility variable, the result, and the LLM response:
Research Type (Experimental): In Appendix D we present a benchmark of LG, GG, UCB, EXP3, and OSUB on synthetic data in terms of the expected regret R(T). This benchmark illustrates the strong performance of LG relative to the other approaches.
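For context on how such a benchmark quantity is obtained, here is a minimal sketch of estimating the expected regret by averaging over repeated runs. The names `run_algorithm` and `mu_star` are hypothetical placeholders, not the authors' actual interface (their code ships in the supplementary zip).

```python
import numpy as np

def expected_regret(run_algorithm, mu_star, T, n_seeds=20):
    """Estimate R(t) by averaging per-seed pseudo-regret curves.

    `run_algorithm(T, seed)` is a hypothetical callable returning the
    length-T reward sequence of one run; `mu_star` is the optimal expected
    per-round reward. Neither name comes from the paper.
    """
    curves = []
    for seed in range(n_seeds):
        rewards = np.asarray(run_algorithm(T, seed=seed))
        # Pseudo-regret curve: optimal cumulative reward minus collected reward.
        curves.append(mu_star * np.arange(1, T + 1) - np.cumsum(rewards))
    return np.mean(curves, axis=0)  # mean across seeds, as reported in the paper
```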
Researcher Affiliation (Collaboration): Dorian Baudry 1,2, Hugo Richard 2, Maria Cherifa 2, Clément Calauzènes 2, Vianney Perchet 2. 1: Department of Statistics, University of Oxford; 2: Inria Fairplay joint team, CREST, ENSAE Paris, Criteo AI Lab.
Pseudocode (Yes): Algorithm 1, Local Greedy (LG). Input: exploration parameter α, neighborhoods (V(n))_{n ∈ [N]} (Definition 1).
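To make the algorithm's shape concrete, here is an illustrative Python sketch of a local greedy scheme over arm neighborhoods. The neighborhoods V(n) = {n-1, n, n+1} and the use of α as a logarithmic forced-exploration rate are our assumptions for illustration; this is not the paper's exact Algorithm 1.

```python
import numpy as np

def local_greedy(T, means, alpha=1.0, rng=None):
    """Illustrative local-greedy bandit over arms 0..N-1.

    Assumptions (ours, for illustration): neighborhoods V(n) = {n-1, n, n+1}
    clipped to the arm range, Bernoulli rewards, and alpha > 0 scaling a
    logarithmic forced-exploration schedule (alpha > 0 also guarantees every
    neighbor is tried before the greedy ratio is evaluated).
    """
    rng = rng or np.random.default_rng()
    N = len(means)
    counts = np.zeros(N)
    sums = np.zeros(N)
    n = N // 2                      # arbitrary starting arm
    for t in range(1, T + 1):
        neigh = [m for m in (n - 1, n, n + 1) if 0 <= m < N]
        under = [m for m in neigh if counts[m] < alpha * np.log(t + 1)]
        if under:                   # forced exploration inside the neighborhood
            arm = min(under, key=lambda m: counts[m])
        else:                       # greedy step restricted to the neighborhood
            arm = max(neigh, key=lambda m: sums[m] / counts[m])
        counts[arm] += 1
        sums[arm] += float(rng.random() < means[arm])   # Bernoulli draw
        n = arm                     # the played arm becomes the new center
    return sums.sum()               # total reward collected over T rounds
```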
Open Source Code (Yes): The paper provides open access to the code (as a zip folder in the supplementary materials), with all the instructions needed to reproduce the experiments.
Open Datasets (No): The paper uses "synthetic data" generated from specified distributions (e.g., B(0.05), Beta(a, b), Exp truncated to (0, 1)) for its simulations, rather than a pre-existing publicly available dataset. Therefore, no access information for a public dataset is provided or applicable.
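For reference, each of the three distribution families named above can be sampled directly with NumPy/SciPy. The Beta parameters and the unit rate of the truncated exponential are illustrative guesses, as the entry above does not pin them down.

```python
import numpy as np
from scipy.stats import truncexpon

rng = np.random.default_rng(0)
size = 10_000

bernoulli = (rng.random(size) < 0.05).astype(float)  # B(0.05): values in {0, 1}
beta = rng.beta(2.0, 5.0, size=size)                 # Beta(a, b); a=2, b=5 illustrative
# Exponential truncated to (0, 1); with scale=1.0 (rate 1, our assumption),
# truncexpon's shape parameter b is the truncation point in scale units.
trunc_exp = truncexpon.rvs(b=1.0, scale=1.0, size=size, random_state=rng)
```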
Dataset Splits (No): The paper describes generating synthetic data and running simulations to compute regret. It does not mention any explicit training, validation, or test splits, which is typical for bandit-algorithm evaluations where data is simulated online.
Hardware Specification (Yes): All the experiments were conducted on a single standard laptop, with an execution time shorter than 24 hours.
Software Dependencies (No): All the code for the experiments is written in Python, using Matplotlib [18] for plotting, NumPy [15] for numerical computing, and SciPy [31] for scientific and technical computing. While these specific libraries are mentioned and cited, their version numbers are not provided, which is necessary for reproducibility.
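Since the exact versions are the missing piece, anyone re-running the supplementary code could log them alongside the results, for example:

```python
import sys
import matplotlib
import numpy
import scipy

# Record the environment the experiments actually ran in.
print("Python     ", sys.version.split()[0])
print("NumPy      ", numpy.__version__)
print("SciPy      ", scipy.__version__)
print("Matplotlib ", matplotlib.__version__)
```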
Experiment Setup (Yes): In a first simulation, we consider a coalition of size N = 100 and a competition of size p = 4. At each timestep t, the algorithm decides on a number of bidders n_t to send to the auction, and the values of all bidders (coalition and competition), v ∈ {0, 1}^{n_t + p}, are sampled according to B(0.05). With probability n_t / (n_t + p), the reward v_(1) - v_(2) is received and v_(1) is observed, where v_(1) and v_(2) denote the highest and second-highest values. The pseudo-regret at time t is then computed as the gap between the optimal expected cumulative reward and the sum of rewards obtained up to time t. The simulation is repeated 20 times with different random seeds, and the mean value across seeds is reported as the expected regret R(t) in Figure 3.
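To make the environment concrete, the following sketch implements our reading of one round of the simulation described above. It is not the authors' code, and the tie-breaking is folded into the stated n_t / (n_t + p) win probability.

```python
import numpy as np

def auction_round(n_t, p=4, q=0.05, rng=None):
    """One round of the synthetic auction described above (our reading).

    n_t coalition bidders and p competitors draw i.i.d. B(q) values; with
    probability n_t / (n_t + p) the coalition wins, receives v_(1) - v_(2),
    and observes v_(1).
    """
    rng = rng or np.random.default_rng()
    v = (rng.random(n_t + p) < q).astype(float)   # v in {0,1}^(n_t+p), i.i.d. B(q)
    v1, v2 = np.sort(v)[-2:][::-1]                # two largest order statistics
    if rng.random() < n_t / (n_t + p):            # coalition holds the top value
        return v1 - v2, v1                        # (reward, observation)
    return 0.0, None                              # competition wins this round
```

Note that with Bernoulli values, v_(1) - v_(2) equals 1 exactly when a single bidder out of the n_t + p has value 1; sending more bidders raises the win probability but also the chance that two coalition values collide, which is what makes the choice of n_t non-trivial.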