Follow-ups Also Matter: Improving Contextual Bandits via Post-serving Contexts
Authors: Chaoqi Wang, Ziyu Ye, Zhe Feng, Ashwinkumar Badanidiyuru Varadaraja, Haifeng Xu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Abstract: "Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches." Section 7 (Experiments): "This section presents a comprehensive evaluation of our proposed poLinUCB algorithm on both synthetic and real-world data, demonstrating its effectiveness in incorporating follow-up information and outperforming the LinUCB(ϕ̂) variant." |
| Researcher Affiliation | Collaboration | University of Chicago, Google Research, Google. {chaoqi, ziyuye, haifengxu}@uchicago.edu, {zhef, ashwinkumarbv}@google.com |
| Pseudocode | Yes | Algorithm 1: poLinUCB (Linear UCB with post-serving contexts); an illustrative LinUCB-style sketch is given after the table. |
| Open Source Code | No | No explicit statement about releasing source code or a direct link to a repository for the described methodology was found. |
| Open Datasets | Yes | The evaluation was conducted on a real-world dataset, MovieLens (Harper and Konstan, 2015). |
| Dataset Splits | No | The paper mentions dividing user feature vectors into pre-serving and post-serving contexts, but does not provide specific details on train/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running experiments were found. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'neural network' but does not provide specific software dependencies with version numbers (e.g., 'PyTorch 1.9'). |
| Experiment Setup | Yes | Evaluation Setup. We adopt three different synthetic environments... In each environment, the dimensions of the pre-serving context (dx) and the post-serving context (dz) are 100 and 5, respectively, with K = 10 arms. The evaluation spans T = 1000 or 5000 time steps, and each experiment is repeated with 10 different seeds. We fit the function ϕ(x) using a two-layer neural network with 64 hidden units and ReLU activation. The network was trained using the Adam optimizer with a learning rate of 1e-3. At each iteration, we randomly sampled a user from the dataset... The evaluation spanned T = 500 iterations and was repeated with 10 seeds. (A hedged sketch of this regression setup follows the table.) |
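
The regression component of the quoted setup can be sketched as follows. This is a minimal sketch assuming a PyTorch implementation; the names `phi_hat` and `update_phi` are ours, and only the architecture (two layers, 64 hidden units, ReLU), the optimizer (Adam), and the learning rate (1e-3) come from the paper's description.

```python
import torch
import torch.nn as nn

# Dimensions taken from the quoted setup: d_x = 100 pre-serving features,
# d_z = 5 post-serving features.
d_x, d_z = 100, 5

# Two-layer network with 64 hidden units and ReLU activation, approximating
# the mapping phi: x -> E[z | x] from pre-serving to post-serving context.
phi_hat = nn.Sequential(
    nn.Linear(d_x, 64),
    nn.ReLU(),
    nn.Linear(64, d_z),
)

optimizer = torch.optim.Adam(phi_hat.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def update_phi(x_batch: torch.Tensor, z_batch: torch.Tensor) -> float:
    """One regression step on observed (pre-serving, post-serving) pairs."""
    optimizer.zero_grad()
    loss = loss_fn(phi_hat(x_batch), z_batch)
    loss.backward()
    optimizer.step()
    return loss.item()
```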
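
The pseudocode row refers to the paper's Algorithm 1 (poLinUCB). Below is an illustrative, disjoint-LinUCB-style sketch of the general idea: act on an augmented context [x, ϕ̂(x)] and update with the post-serving context revealed after serving. It is not the paper's algorithm, whose estimators and confidence widths differ; the exploration parameter `alpha` and the helper names are assumptions.

```python
import numpy as np

# Illustrative LinUCB-style loop over an augmented context [x, phi_hat(x)].
K, d_x, d_z = 10, 100, 5     # arms and context dimensions from the setup above
d = d_x + d_z
alpha = 1.0                  # exploration parameter (hypothetical value)

A = [np.eye(d) for _ in range(K)]      # per-arm ridge-regression Gram matrices
b = [np.zeros(d) for _ in range(K)]    # per-arm reward-weighted feature sums

def select_arm(x: np.ndarray, z_pred: np.ndarray) -> int:
    """Pick the arm with the highest upper confidence bound, using the
    predicted post-serving context z_pred (e.g., phi_hat(x))."""
    v = np.concatenate([x, z_pred])
    scores = []
    for a in range(K):
        A_inv = np.linalg.inv(A[a])
        theta = A_inv @ b[a]
        width = alpha * np.sqrt(v @ A_inv @ v)
        scores.append(theta @ v + width)
    return int(np.argmax(scores))

def update_arm(a: int, x: np.ndarray, z: np.ndarray, reward: float) -> None:
    """After serving, the true post-serving context z is revealed and used."""
    v = np.concatenate([x, z])
    A[a] += np.outer(v, v)
    b[a] += reward * v
```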