Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
An Efficient Algorithm for Deep Stochastic Contextual Bandits
Authors: Tan Zhu, Guannan Liang, Chunjiang Zhu, Haining Li, Jinbo Bi
AAAI 2021, pp. 11193-11201
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'Extensive experiments have been performed to demonstrate the effectiveness and efficiency of the proposed algorithm on multiple real-world datasets.' From the Experiments section: 'We have performed extensive experiments to confirm the effectiveness and computational efficiency of the proposed method, SSGD-SCB with a DNN reward function.' |
| Researcher Affiliation | Academia | Tan Zhu, Guannan Liang, Chunjiang Zhu, Haining Li, Jinbo Bi Department of Computer Science and Engineering, University of Connecticut, Storrs, CT, USA EMAIL, EMAIL, EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: SSGD-SCB |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. It mentions 'Vowpal Wabbit' as a baseline system, but not for its own implementation. |
| Open Datasets | Yes | We use the CIFAR-10 dataset (Simonyan and Zisserman 2014), which has been widely used for benchmarking non-convex optimization algorithms. |
| Dataset Splits | No | The paper specifies training and test sets: 'For both CIFAR-10 and CIFAR-10+N data, 50K samples are selected as the training set Dtrain while the rest 10K samples form the test set Dtest.' However, it does not explicitly mention a validation set or its split. |
| Hardware Specification | Yes | We implement all the algorithms in PyTorch and test on a server equipped with Intel Xeon Gold 6150 2.7GHz CPU, 192GB RAM, and an NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'We implement all the algorithms in PyTorch' but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The reward functions of the above algorithms are modeled by a variant of VGG-11 with batch normalization, which contains 9 weight layers and 9.2 million learnable parameters (see Appendix C for the detailed structure and the hyper-parameter settings). |