Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks

Authors: Rong Zhu, Mattia Rigotti

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We confirm empirically our theory by showing that SAU-based exploration outperforms current state-of-the-art deep Bayesian bandit methods on several real-world datasets at modest computation cost."
Researcher Affiliation | Collaboration | Rong Zhu, Institute of Science and Technology for Brain-inspired Intelligence, Fudan University (rongzhu@fudan.edu.cn); Mattia Rigotti, IBM Research AI (mr2666@columbia.edu)
Pseudocode | Yes | Algorithm 1: SAU-UCB and SAU-Sampling for bandit problems (see the illustrative sketch below)
Open Source Code | Yes | "... and make the code to reproduce our results available at https://github.com/ibm/sau-explore."
Open Datasets | Yes | Empirical evaluation on real-world Deep Contextual Bandit problems. Table 1 quantifies the performance of SAU-Sampling and SAU-UCB in comparison to the 4 competing baseline algorithms... These results show that a SAU algorithm is the best algorithm in each of the 7 benchmarks (Mushroom, Statlog, Covertype, Financial, Jester, Adult, Census).
Dataset Splits | No | No explicit split percentages or sample counts for training, validation, and test sets are provided; the paper refers to benchmarks from a previous paper [18] without detailing the splits.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud instance types) used for the experiments are mentioned in the paper.
Software Dependencies | No | No specific software dependencies or library versions (e.g., Python, PyTorch, TensorFlow, or specific solver versions) are provided in the paper.
Experiment Setup | No | No specific experimental setup details such as hyperparameter values (learning rate, batch size, number of epochs) or optimizer settings are explicitly given in the paper's main text.
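To give a concrete picture of what Algorithm 1 (SAU-UCB and SAU-Sampling) computes, below is a minimal, hypothetical Python sketch of sample-average-uncertainty exploration for a K-armed bandit. The class name, the play-each-arm-once initialization, and the choice of the running sum of squared prediction errors divided by the squared pull count as the uncertainty proxy are assumptions made for illustration only; the paper's Algorithm 1 and the released code at https://github.com/ibm/sau-explore are the authoritative references.

```python
import numpy as np


class SAUExplorer:
    """Hypothetical sketch of sample-average-uncertainty (SAU-style) exploration.

    Not the paper's exact Algorithm 1: constants, scaling, and names are
    assumptions for illustration.
    """

    def __init__(self, n_arms: int, mode: str = "sampling", rng=None):
        self.n_arms = n_arms
        self.mode = mode                    # "sampling" or "ucb"
        self.rng = rng or np.random.default_rng()
        self.counts = np.zeros(n_arms)      # n_k: number of pulls per arm
        self.means = np.zeros(n_arms)       # running mean reward per arm
        self.sq_errors = np.zeros(n_arms)   # cumulative squared prediction error per arm

    def select(self) -> int:
        # Play every arm once before relying on the uncertainty estimate.
        untried = np.where(self.counts == 0)[0]
        if untried.size > 0:
            return int(untried[0])
        # Uncertainty proxy: cumulative squared prediction error / n_k^2
        # (a standard-error-of-the-mean style estimate).
        var = self.sq_errors / self.counts ** 2
        if self.mode == "ucb":
            # UCB-style: deterministic exploration bonus.
            scores = self.means + np.sqrt(var)
        else:
            # Sampling-style: draw a value estimate per arm and pick the best.
            scores = self.rng.normal(self.means, np.sqrt(var))
        return int(np.argmax(scores))

    def update(self, arm: int, reward: float) -> None:
        # Prediction error is measured against the mean before the update.
        err = reward - self.means[arm]
        self.counts[arm] += 1
        self.sq_errors[arm] += err ** 2
        self.means[arm] += err / self.counts[arm]   # incremental mean update


if __name__ == "__main__":
    # Toy usage: 3-armed Bernoulli bandit with made-up success probabilities.
    rng = np.random.default_rng(0)
    probs = [0.2, 0.5, 0.7]
    agent = SAUExplorer(n_arms=3, mode="sampling", rng=rng)
    for _ in range(2000):
        arm = agent.select()
        agent.update(arm, float(rng.random() < probs[arm]))
    print("estimated means:", agent.means.round(2))
```

In the deep contextual setting evaluated in the paper, the per-arm running means above would presumably be replaced by the value predictions of a neural network, with the same per-arm bookkeeping of squared prediction errors supplying the exploration signal.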