Infinite Action Contextual Bandits with Reusable Data Exhaust
Authors: Mark Rucker, Yinglun Zhu, Paul Mineiro
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct multiple experiments in this section. In Section 4.1, we empirically compare the performance of Theorem 2 and Algorithm 3. We compare our algorithm Capped IGW with the previous state-of-the-art algorithm Smooth IGW (Zhu & Mineiro, 2022) in terms of both the online performance (Section 4.2) and the offline utility (Section 4.3). We also demonstrate why Smooth IGW lacks offline utility in Section 4.4. Code to reproduce all experiments available at https://github.com/mrucker/onoff_experiments. |
| Researcher Affiliation | Collaboration | Mark Rucker (University of Virginia), Yinglun Zhu (University of Wisconsin-Madison), Paul Mineiro (Microsoft Research NYC) |
| Pseudocode | Yes | Algorithm 1: Smooth IGW (Zhu & Mineiro, 2022); Algorithm 2: Capped IGW; Algorithm 3: Normalization CS to compute β_t; Algorithm 4: Sampling routine |
| Open Source Code | Yes | Code to reproduce all experiments available at https://github.com/mrucker/onoff_experiments. |
| Open Datasets | Yes | We perform the online regret experiment using twenty regression datasets hosted on OpenML (Vanschoren et al., 2014) and released under a CC-BY license. The exact data ids for these datasets are: 150, 422, 1187, 41540, 41540, 42225, 42225, 44025, 44031, 44056, 44059, 44069, 44140, 44142, 44146, 44148, 44963, 44964, 44973, and 44977. (A hedged loading-and-split sketch follows this table.) |
| Dataset Splits | Yes | To train the offline models, the data exhaust is split 80%-10%-10% for training, validation, and testing respectively. |
| Hardware Specification | No | The paper does not specify any particular hardware components such as GPU models, CPU types, or memory used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al., 2019)' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For Smooth IGW we use η := 0.3 and select τ from the set {2, 3.76, 7.05, 13.24, 24.87, 46.7, 87.7, 164.69, 309.27, 580.77, 1090.6, 2048}. For Capped IGW we use η := 0.3 and select τ from {6, 9.57, 15.28, 24.37, 38.89, 62.05, 99.01, 157.98, 252.08, 402.21, 641.77, 1024}. ... To optimize θ we use a mean squared error loss with Adam (Kingma & Ba, 2014) in PyTorch (Paszke et al., 2019). ... The offline experiment uses a more complex form for a(x; θ) than the online learners. Rather than one linear layer with a sigmoid output, the offline learners use a three-layer feedforward neural network with width equal to the number of features in a dataset, ReLU activation functions, and a sigmoid output. (Hedged sketches of the τ grids and the offline network follow this table.) |
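
The paper reports which OpenML datasets were used and the 80%-10%-10% split, but not the fetching mechanism, split seed, or tooling. Below is a minimal, non-authoritative sketch assuming scikit-learn's `fetch_openml` and `train_test_split`; the dataset ids are taken verbatim from the table above (two ids appear twice in the paper's list).

```python
# Hypothetical sketch: fetch one of the listed OpenML regression datasets by id
# and split it 80%/10%/10% into train/validation/test. The fetching library and
# the random seed are assumptions, not stated in the paper.
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split

# Ids as listed in the paper (41540 and 42225 each appear twice in the list).
OPENML_IDS = [150, 422, 1187, 41540, 41540, 42225, 42225, 44025, 44031, 44056,
              44059, 44069, 44140, 44142, 44146, 44148, 44963, 44964, 44973, 44977]

def load_and_split(data_id, seed=0):
    # X holds the features, y the regression target for the given OpenML id.
    X, y = fetch_openml(data_id=data_id, as_frame=True, return_X_y=True)
    # First carve off 80% for training, then split the remainder evenly
    # into 10% validation and 10% test.
    X_tr, X_rest, y_tr, y_rest = train_test_split(X, y, test_size=0.2, random_state=seed)
    X_val, X_te, y_val, y_te = train_test_split(X_rest, y_rest, test_size=0.5, random_state=seed)
    return (X_tr, y_tr), (X_val, y_val), (X_te, y_te)

train, val, test = load_and_split(OPENML_IDS[0])
```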
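The τ candidate sets quoted in the experiment-setup row look like 12 logarithmically spaced values between their endpoints (2 to 2048 for Smooth IGW, 6 to 1024 for Capped IGW). That spacing is an observation, not something the paper states; the sketch below reproduces the listed values up to rounding.

```python
# Sketch: the tau grids appear to be 12 log-spaced points between the endpoints
# quoted in the table; numpy.geomspace matches the listed values after rounding.
import numpy as np

smooth_igw_taus = np.geomspace(2, 2048, num=12)   # 2, 3.76, 7.05, ..., 1090.6, 2048
capped_igw_taus = np.geomspace(6, 1024, num=12)   # 6, 9.57, 15.28, ..., 641.77, 1024

print(np.round(smooth_igw_taus, 2))
print(np.round(capped_igw_taus, 2))
```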
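The offline learners are described only at a high level: a three-layer feedforward network with width equal to the number of dataset features, ReLU activations, a sigmoid output, and mean squared error training with Adam. The following PyTorch sketch is one reading of that description; the exact layer arrangement, learning rate, epoch count, and batching are assumptions.

```python
# Interpretive PyTorch sketch of the offline reward model a(x; theta): three
# linear layers with width equal to the number of input features, ReLU
# activations, and a sigmoid output, trained with MSE loss and Adam.
# Learning rate, epochs, and full-batch training are assumptions.
import torch
import torch.nn as nn

def make_offline_model(num_features: int) -> nn.Module:
    return nn.Sequential(
        nn.Linear(num_features, num_features), nn.ReLU(),
        nn.Linear(num_features, num_features), nn.ReLU(),
        nn.Linear(num_features, 1), nn.Sigmoid(),
    )

def train_offline(model, X_train, y_train, epochs=50, lr=1e-3):
    # Full-batch training for brevity; the paper does not state batching details.
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(model(X_train).squeeze(-1), y_train)
        loss.backward()
        optimizer.step()
    return model
```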