Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Follow-ups Also Matter: Improving Contextual Bandits via Post-serving Contexts
Authors: Chaoqi Wang, Ziyu Ye, Zhe Feng, Ashwinkumar Badanidiyuru Varadaraja, Haifeng Xu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical tests on both synthetic and real-world datasets demonstrate the significant benefit of utilizing post-serving contexts as well as the superior performance of our algorithm over the state-of-the-art approaches. [...] 7 Experiments: This section presents a comprehensive evaluation of our proposed poLinUCB algorithm on both synthetic and real-world data, demonstrating its effectiveness in incorporating follow-up information and outperforming the LinUCB (ϕ̂) variant. |
| Researcher Affiliation | Collaboration | University of Chicago (1); Google Research (2); Google (3) |
| Pseudocode | Yes | Algorithm 1: poLinUCB (Linear UCB with post-serving contexts) |
| Open Source Code | No | No explicit statement about releasing source code or a direct link to a repository for the described methodology was found. |
| Open Datasets | Yes | The evaluation was conducted on a real-world dataset, MovieLens (Harper and Konstan, 2015). |
| Dataset Splits | No | The paper mentions dividing user feature vectors into pre-serving and post-serving contexts, but does not provide specific details on train/validation/test dataset splits (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running experiments were found. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'neural network' but does not provide specific software dependencies with version numbers (e.g., 'PyTorch 1.9'). |
| Experiment Setup | Yes | Evaluation Setup. We adopt three different synthetic environments... In each environment, the dimensions of the pre-serving context (dx) and the post-serving context (dz) are of 100 and 5, respectively with 10 arms (K). The evaluation spans T = 1000 or 5000 time steps, and each experiment is repeated with 10 different seeds. [...] We fit the function ϕ(x) using a two-layer neural network with 64 hidden units and ReLU activation. The network was trained using the Adam optimizer with a learning rate of 1e-3. At each iteration, we randomly sampled a user from the dataset... The evaluation spanned T = 500 iterations and repeated with 10 seeds. |
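
The values quoted in the Experiment Setup row could be gathered into a small configuration record by anyone attempting a reproduction. The following is a minimal sketch, not the authors' code: the class names, field names, and the `seeded_runs` helper are hypothetical, and the seed values 0 through 9 are an assumption, since the paper excerpt reports only that 10 seeds were used. Anything not reported in the table (hardware, library versions, dataset splits) is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SyntheticSetup:
    """Values quoted for the synthetic environments (field names are hypothetical)."""
    n_environments: int = 3   # "three different synthetic environments"
    d_x: int = 100            # pre-serving context dimension
    d_z: int = 5              # post-serving context dimension
    n_arms: int = 10          # K
    n_seeds: int = 10         # repetitions per experiment


@dataclass(frozen=True)
class MovieLensSetup:
    """Values quoted for the real-world experiment (field names are hypothetical)."""
    hidden_units: int = 64        # two-layer network, 64 hidden units
    activation: str = "relu"
    optimizer: str = "adam"
    learning_rate: float = 1e-3
    horizon: int = 500            # T = 500 iterations
    n_seeds: int = 10


def seeded_runs(horizon, n_seeds):
    """List (seed, horizon) pairs for one experiment.

    Assumption: seeds are 0..n_seeds-1; the source does not state the seed values.
    """
    return [(seed, horizon) for seed in range(n_seeds)]


synthetic = SyntheticSetup()
movielens = MovieLensSetup()

# The source reports T = 1000 or 5000 for the synthetic environments without
# saying which environment uses which, so the horizon is passed explicitly.
short_runs = seeded_runs(1000, synthetic.n_seeds)
real_runs = seeded_runs(movielens.horizon, movielens.n_seeds)
```

Keeping the horizon as an explicit argument avoids guessing which synthetic environment used T = 1000 and which used T = 5000, a detail the quoted text does not settle.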