Contextual Bandits with Online Neural Regression

Authors: Rohan Deb, Yikun Ban, Shiliang Zuo, Jingrui He, Arindam Banerjee

ICLR 2024 | Conference PDF | Archive PDF

Reproducibility assessment: each entry below lists the variable, the result, and the supporting LLM response.
Research Type: Experimental. LLM response: "Finally, our experimental results on various datasets demonstrate that our algorithms, especially the one based on KL loss, persistently outperform existing algorithms." "Finally, in Section 5 we compare our algorithms against baseline algorithms for NeuCBs."
Researcher Affiliation: Academia. LLM response: "Rohan Deb, Yikun Ban, Shiliang Zuo, Jingrui He, & Arindam Banerjee, University of Illinois, Urbana-Champaign, {rd22,yikunb2,szuo3,jingrui,arindamb}@cs.illinois.edu"
Pseudocode: Yes. LLM response: "Algorithm 1 Neural SquareCB (NeuSquareCB); uses square loss. Algorithm 2 Neural FastCB (NeuFastCB); uses KL loss." (A hedged sketch of the two losses is given below the list.)
Open Source Code: No. LLM response: The paper does not provide a specific link or explicit statement about releasing the source code for the proposed methods; a link to a baseline's code is provided, but not their own.
Open Datasets: Yes. LLM response: "We consider a collection of 6 multiclass classification datasets from the openml.org platform: covertype, fashion, MagicTelescope, mushroom, Plants and shuttle." (A sketch of loading these datasets is given below the list.)
Dataset Splits: No. LLM response: The paper mentions the datasets and a 'standard evaluation strategy' but does not give train/validation/test splits, percentages, or sample counts for the experiments.
Hardware Specification: No. LLM response: The paper does not provide any details about the hardware (e.g., GPU models, CPU types, cloud platforms, memory) used to run the experiments.
Software Dependencies: No. LLM response: The paper mentions neural networks and various models but does not specify versions of any software libraries, frameworks (such as PyTorch or TensorFlow), or programming languages.
Experiment Setup: Yes. LLM response: "Both NeuSquareCB and NeuFastCB use a 2-layer ReLU network with 100 hidden neurons. The last layer in NeuSquareCB uses a linear activation while NeuFastCB uses a sigmoid. We perform a grid-search over the regularization parameter λ over (1, 0.1, 0.01) and the exploration parameter ν over (0.1, 0.01, 0.001). Neural Epsilon uses the same neural architecture and the exploration parameter ϵ is searched over (0.1, 0.05, 0.01). For all the algorithms we also do a grid-search for the step-size over (0.01, 0.005, 0.001)." (A sketch of this architecture and grid search is given below the list.)
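For the pseudocode entry, the following is a minimal PyTorch sketch of the two losses suggested by the algorithm names: square loss for NeuSquareCB and a binary KL loss for NeuFastCB. The exact loss definitions in the paper may differ, and the clamping constant eps is an assumption added here for numerical stability.

```python
import torch

def square_loss(pred: torch.Tensor, reward: torch.Tensor) -> torch.Tensor:
    # Square loss, as used by NeuSquareCB: (f(x_t, a_t) - r_t)^2.
    return (pred - reward) ** 2

def kl_loss(pred: torch.Tensor, reward: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # Binary KL divergence between the observed reward r in [0, 1] and the
    # network prediction f(x) in (0, 1), as suggested for NeuFastCB.
    # eps keeps the logarithms finite when r or f(x) is near 0 or 1 (assumption).
    pred = pred.clamp(eps, 1 - eps)
    reward = reward.clamp(eps, 1 - eps)
    return reward * torch.log(reward / pred) + (1 - reward) * torch.log((1 - reward) / (1 - pred))
```

The sigmoid output layer reported for NeuFastCB fits this reading, since the KL loss requires predictions strictly inside (0, 1).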
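For the open-datasets entry, here is a minimal sketch of pulling the listed datasets from openml.org via scikit-learn. The names are taken verbatim from the paper; the corresponding OpenML identifiers may differ (for example, "fashion" versus "Fashion-MNIST"), so treat the list as illustrative.

```python
from sklearn.datasets import fetch_openml

# Dataset names as quoted in the paper; adjust to the exact OpenML identifiers if needed.
DATASET_NAMES = ["covertype", "fashion", "MagicTelescope", "mushroom", "Plants", "shuttle"]

def load_dataset(name: str):
    # Fetch features X and class labels y from openml.org. In the usual
    # bandit-from-classification evaluation, each class is treated as an arm,
    # with reward 1 for choosing the correct class and 0 otherwise.
    data = fetch_openml(name=name, as_frame=False)
    return data.data, data.target
```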
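For the experiment-setup entry, the following is a minimal PyTorch sketch of the reported architecture and hyperparameter grid. The function run_bandit is a hypothetical driver (not from the paper) standing in for a single contextual-bandit run; the layer sizes and search ranges follow the quoted setup.

```python
import itertools
import torch.nn as nn

def make_network(input_dim: int, sigmoid_output: bool) -> nn.Module:
    # 2-layer ReLU network with 100 hidden neurons; NeuSquareCB keeps a linear
    # last layer, while NeuFastCB appends a sigmoid so predictions lie in (0, 1).
    layers = [nn.Linear(input_dim, 100), nn.ReLU(), nn.Linear(100, 1)]
    if sigmoid_output:
        layers.append(nn.Sigmoid())
    return nn.Sequential(*layers)

# Grid-search ranges quoted in the setup.
LAMBDAS = (1, 0.1, 0.01)           # regularization parameter lambda
NUS = (0.1, 0.01, 0.001)           # exploration parameter nu
STEP_SIZES = (0.01, 0.005, 0.001)  # step-size (learning rate)

def grid_search(run_bandit):
    # run_bandit(lam=..., nu=..., lr=...) -> cumulative regret; hypothetical driver.
    best = None
    for lam, nu, lr in itertools.product(LAMBDAS, NUS, STEP_SIZES):
        regret = run_bandit(lam=lam, nu=nu, lr=lr)
        if best is None or regret < best[0]:
            best = (regret, lam, nu, lr)
    return best
```

For the Neural Epsilon baseline, the same make_network architecture would apply, with ϵ searched over (0.1, 0.05, 0.01) in place of ν.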