Contextual Bandits with Large Action Spaces: Made Practical
Authors: Yinglun Zhu, Dylan J Foster, John Langford, Paul Mineiro
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a large-scale empirical evaluation, and show that our approach typically enjoys superior performance and efficiency compared to standard baselines. |
| Researcher Affiliation | Collaboration | 1: University of Wisconsin–Madison; 2: Microsoft Research NYC. |
| Pseudocode | Yes | Algorithm 1 Spanner Greedy, Algorithm 2 Spanner IGW, Algorithm 3 Reweighted Spanner, Algorithm 4 IGW-Arg Max, Algorithm 5 Approximate Barycentric Spanner (Awerbuch & Kleinberg, 2008) |
| Open Source Code | Yes | Code to reproduce all results is available at https://github.com/pmineiro/linrepcb. |
| Open Datasets | Yes | We conduct experiments on three datasets, whose details are summarized in Table 1. oneshotwiki (Singh et al., 2012; Vasnetsov, 2018) is a named-entity recognition task... amazon-3m (Bhatia et al., 2016) is an extreme multi-label dataset... |
| Dataset Splits | No | The paper describes an online learning setting and refers to 'progressive-validation reward' but does not specify explicit train/validation/test dataset splits with percentages or sample counts. |
| Hardware Specification | Yes | CPU timings use batch size 1 on an Azure STANDARD D4 V2 machine. GPU timings use batch size 1024 on an Azure STANDARD NC6S V2 (Nvidia P100-based) machine. |
| Software Dependencies | No | The paper mentions software like 'PyTorch' and the 'sentence transformers' package but does not provide specific version numbers for these dependencies. |
| Experiment Setup | No | The paper states that hyperparameters are optimized using random search and mentions the use of Adam optimizer, but it does not provide the specific values for these hyperparameters (e.g., learning rate, batch size) or the parameters of the optimizer. |
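The "Dataset Splits" row notes that the paper reports progressive-validation reward rather than a fixed train/validation/test split. In progressive validation, each example's reward is recorded *before* the learner updates on it, so the running average is an unbiased estimate of online performance without holding out data. A minimal sketch of this evaluation loop (the `policy` and `learn` interfaces here are illustrative, not the paper's actual code):

```python
def progressive_validation(policy, learn, rounds):
    """Average per-round reward under progressive validation.

    `rounds` yields (context, reward_fn) pairs; `policy(context)` picks
    an action and `learn(context, action, reward)` updates the learner.
    The reward is accumulated BEFORE the update, so no separate
    validation split is needed.
    """
    total, n = 0.0, 0
    for context, reward_fn in rounds:
        action = policy(context)
        reward = reward_fn(action)       # evaluate first...
        total += reward
        n += 1
        learn(context, action, reward)   # ...then train on the example
    return total / n if n else 0.0
```

Because every example is scored under the model state that existed before it was seen, the resulting average tracks deployment-time performance of the online learner.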