Parametric Graph for Unimodal Ranking Bandit
Authors: Camille-Sovanneary Gauthier, Romaric Gaudel, Elisa Fromont, Boammani Aser Lompo
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments against state-of-the-art learning algorithms which also tackle the PBM setting, show that our method is more efficient while giving regret performance on par with the best known algorithms on simulated and real life datasets. |
| Researcher Affiliation | Collaboration | (1) Louis Vuitton, F-75001 Paris, France; (2) IRISA UMR 6074 / INRIA rba, F-35000 Rennes, France; (3) Univ Rennes, Ensai, CNRS, CREST UMR 9194, F-35000 Rennes, France; (4) Univ. Rennes 1, F-35000 Rennes, France; (5) Institut Universitaire de France, M.E.S.R.I., F-75231 Paris; (6) ENS Rennes, F-35000 Rennes, France. |
| Pseudocode | Yes | Algorithm 1 GRAB: parametric Graph for unimodal RAnking Bandit |
| Open Source Code | Yes | Code and data for replicating our experiments are available at https://github.com/gaudel/ranking_bandits. |
| Open Datasets | Yes | The experiments are conducted on the Yandex dataset (Yandex, 2013) and on purely simulated data. |
| Dataset Splits | No | The paper describes the online learning setting where data is continuously generated or interacted with, rather than using pre-defined train/validation/test dataset splits. It mentions '20 independent runs' for averaging results but not for data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions the 'Pyclick library' but does not provide specific version numbers for any software dependencies or programming languages used. |
| Experiment Setup | Yes | We consider L = 10 items, K = 5 positions, and κ = [1, 0.75, 0.6, 0.3, 0.1]. The range of values for θ is either close to zero ($\theta^- = [10^{-3}, 5 \cdot 10^{-4}, 10^{-4}, 5 \cdot 10^{-5}, 10^{-5}, 10^{-6}, \ldots, 10^{-6}]$), or close to one ($\theta^+ = [0.99, 0.95, 0.9, 0.85, 0.8, 0.75, \ldots, 0.75]$). We also carefully tune the exploration hyper-parameter c of $\varepsilon_n$-greedy taking values ranging exponentially from $10^0$ to $10^6$. For PB-MHB, we use the hyper-parameters recommended in (Gauthier et al., 2021). |
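
The simulated setup in the last row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes the standard position-based model (PBM), in which an item with attractiveness θ_i shown at a position with observation probability κ_k is clicked with probability θ_i · κ_k. It also reads the exponents of the close-to-zero θ vector as negative powers of ten and fills the elided entries with the repeated tail value so that each vector has L = 10 entries. The names `theta_minus`, `theta_plus`, `simulate_clicks`, and `expected_clicks` are hypothetical.

```python
import numpy as np

L, K = 10, 5  # number of items and positions, as stated in the table
kappa = np.array([1.0, 0.75, 0.6, 0.3, 0.1])  # per-position observation probabilities

# Assumption: the elided entries repeat the tail value up to L items.
theta_minus = np.array([1e-3, 5e-4, 1e-4, 5e-5, 1e-5] + [1e-6] * (L - 5))
theta_plus = np.array([0.99, 0.95, 0.9, 0.85, 0.8] + [0.75] * (L - 5))


def expected_clicks(theta, kappa, ranking):
    """Expected number of clicks for a ranking under the assumed PBM."""
    return float(np.sum(theta[np.asarray(ranking)] * kappa))


def simulate_clicks(theta, kappa, ranking, rng):
    """Draw one Boolean click vector: position k is clicked w.p. theta[ranking[k]] * kappa[k]."""
    probs = theta[np.asarray(ranking)] * kappa
    return rng.random(len(kappa)) < probs


rng = np.random.default_rng(0)
# Under PBM with decreasing kappa, the best ranking places the K most attractive items in order.
best_ranking = np.argsort(-theta_plus)[:K]
clicks = simulate_clicks(theta_plus, kappa, best_ranking, rng)
```

Calling `expected_clicks(theta_plus, kappa, best_ranking)` gives the per-round reward of the optimal recommendation, which is the reference point against which a bandit algorithm's regret would be measured in such a simulation.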