Online Learning of Delayed Choices

Authors: Recep Yusuf Bekci

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments that confirm the effectiveness of our algorithm. Section 7, Experiments: We conducted two sets of experiments to evaluate the performance of our algorithms.
Researcher Affiliation | Academia | Recep Yusuf Bekci, University of Waterloo, Waterloo, Canada, recep.bekci@uwaterloo.ca
Pseudocode | Yes | Algorithm 1: Delayed MNL Bandit (DEMBA)
Open Source Code | Yes | The necessary information is provided in the Experiments section and in the appendix as well as the scripts for experiments are provided.
Open Datasets | No | The paper uses a synthetic dataset generated based on defined parameters ($N = 10$, $K = 4$, $p_i = 1$) and attraction parameters, rather than an existing public dataset. No access information is provided for this generated data.
Dataset Splits | No | The paper uses a simulation-based approach with synthetic data and measures cumulative regret over rounds, thus standard training, validation, and test dataset splits are not applicable or mentioned.
Hardware Specification | Yes | The simulations were conducted on a server equipped with 4 Intel Xeon 6248 2.5GHz CPUs and 377 GB of RAM, running CentOS 7.
Software Dependencies | Yes | The simulation code was developed in Python version 3.9.6.
Experiment Setup | Yes | We used $N = 10$, $K = 4$, and $p_i = 1$ for all $i \in \{1, \dots, N\}$. The attraction parameters were set as $v_i = 0.25 + \epsilon$ if $i \in \{1, 2, 9, 10\}$ and $v_i = 0.25$ otherwise, where $\epsilon$ represents the contrast between products. We used geometric delays with $E[d_s] = 100$ and $\mu = 100$ for the first experiment and $E[d_s] = 100$ for the second experiment.
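
The synthetic setup in the Experiment Setup entry can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's script. It assumes the standard MNL choice form with an outside (no-purchase) option of weight 1, and it uses NumPy, which the summary does not mention. The function names, the value `eps=0.1`, and the example assortment are all hypothetical.

```python
import numpy as np

N, K = 10, 4          # number of products and assortment size (from the setup)
MEAN_DELAY = 100      # E[d_s] from the setup


def attraction_params(eps):
    """v_i = 0.25 + eps for i in {1, 2, 9, 10} (1-indexed), 0.25 otherwise."""
    v = np.full(N, 0.25)
    v[[0, 1, 8, 9]] += eps  # 0-indexed positions of products 1, 2, 9, 10
    return v


def mnl_choice(assortment, v, rng):
    """Sample one choice from `assortment` (0-indexed product ids).

    Returns -1 for no purchase. Assumption: standard MNL probabilities
    v_i / (1 + sum of v_j over the assortment).
    """
    weights = np.concatenate(([1.0], v[assortment]))
    idx = rng.choice(len(weights), p=weights / weights.sum())
    return -1 if idx == 0 else int(assortment[idx - 1])


def sample_delay(rng):
    """Geometric delay on {1, 2, ...} with mean MEAN_DELAY."""
    return int(rng.geometric(1.0 / MEAN_DELAY))


rng = np.random.default_rng(0)
v = attraction_params(eps=0.1)        # eps=0.1 is a hypothetical contrast value
assortment = np.array([0, 1, 8, 9])   # hypothetical assortment of size K
choice = mnl_choice(assortment, v, rng)
delay = sample_delay(rng)
```

In a delayed-feedback simulation, each sampled `choice` would only become visible to the learner `delay` rounds after the assortment is offered.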