Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits
Authors: Wonyoung Kim, Kyungbok Lee, Myunghee Cho Paik
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct empirical studies using synthetic data and real examples, demonstrating the effectiveness of our algorithm. |
| Researcher Affiliation | Collaboration | Wonyoung Kim¹, Kyungbok Lee², Myunghee Cho Paik²,³*. ¹ Department of Industrial Engineering and Operations Research, Columbia University; ² Department of Statistics, Seoul National University; ³ Shepherd23 Inc. |
| Pseudocode | Yes | Algorithm 1: Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits (DDRTS-GLM) |
| Open Source Code | No | The paper does not provide any statement about releasing the source code for the methodology or a link to a code repository. |
| Open Datasets | Yes | We use the Forest Cover Type dataset from the UCI Machine Learning repository (Blake, Keogh, and Merz 1999), as used by Filippi et al. (2010)... The Yahoo! Front Page Today Module User Click Log Dataset (Yahoo! Webscope 2009) |
| Dataset Splits | No | The paper describes experiments within a bandit setting, where data is processed sequentially over 'T rounds'. It does not specify distinct training, validation, and testing splits as commonly done in supervised learning contexts. |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments (e.g., GPU models, CPU types, or cloud computing instances with specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | In each algorithm, we choose the best hyperparameter from {0.001, 0.01, 0.1, 1}. The proposed method requires a positive threshold γ for resampling; however, we do not tune γ but fix the value to be 1/(N+1). |
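
The experiment setup quoted in the table can be illustrated with a minimal sketch. It is not the authors' code (none was released): the names `select_hyperparameter`, `resampling_threshold`, and the `evaluate` callback are hypothetical, the selection criterion (lowest score, e.g. cumulative regret) is an assumption, and N is assumed to denote the number of arms.

```python
from typing import Callable, Sequence

# Candidate values quoted in the paper's experiment setup.
HYPERPARAMETER_GRID: Sequence[float] = (0.001, 0.01, 0.1, 1.0)


def resampling_threshold(n_arms: int) -> float:
    """Fixed (untuned) resampling threshold gamma = 1 / (N + 1).

    Assumption: N is the number of arms.
    """
    if n_arms < 1:
        raise ValueError("n_arms must be a positive integer")
    return 1.0 / (n_arms + 1)


def select_hyperparameter(
    evaluate: Callable[[float], float],
    grid: Sequence[float] = HYPERPARAMETER_GRID,
) -> float:
    """Return the grid value with the lowest score.

    `evaluate` is a hypothetical callback that runs one bandit algorithm
    with the given hyperparameter and returns a score where lower is
    better (for example, cumulative regret over T rounds).
    """
    return min(grid, key=evaluate)


if __name__ == "__main__":
    # Toy stand-in for a bandit run: the score is smallest at 0.1.
    best = select_hyperparameter(lambda value: abs(value - 0.1))
    gamma = resampling_threshold(n_arms=9)
    print(best, gamma)
```

The grid search is run separately for each algorithm being compared, while γ is computed once from N and left untouched, which matches the paper's statement that γ is fixed rather than tuned.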