Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits

Authors: Wonyoung Kim, Kyungbok Lee, Myunghee Cho Paik

AAAI 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct empirical studies using synthetic data and real examples, demonstrating the effectiveness of our algorithm. |
| Researcher Affiliation | Collaboration | Wonyoung Kim (1), Kyungbok Lee (2), Myunghee Cho Paik (2, 3): (1) Department of Industrial Engineering and Operations Research, Columbia University; (2) Department of Statistics, Seoul National University; (3) Shepherd23 Inc. |
| Pseudocode | Yes | Algorithm 1: Double Doubly Robust Thompson Sampling for Generalized Linear Contextual Bandits (DDRTS-GLM) |
| Open Source Code | No | The paper does not provide any statement about releasing source code for the methodology, nor a link to a code repository. |
| Open Datasets | Yes | We use the Forest Cover Type dataset from the UCI Machine Learning repository (Blake, Keogh, and Merz 1999), as used by Filippi et al. (2010)... The Yahoo! Front Page Today Module User Click Log Dataset (Yahoo! Webscope 2009) |
| Dataset Splits | No | The paper describes experiments in a bandit setting, where data is processed sequentially over T rounds. It does not specify distinct training, validation, and test splits as is common in supervised learning. |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or cloud instances). |
| Software Dependencies | No | The paper does not provide version numbers for any software dependencies, libraries, or programming languages used in the experiments. |
| Experiment Setup | Yes | In each algorithm, we choose the best hyperparameter from {0.001, 0.01, 0.1, 1}. The proposed method requires a positive threshold γ for resampling; however, we do not tune γ but fix the value to be 1/(N + 1). |
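The tuning protocol in the Experiment Setup row can be sketched as follows. This is a minimal illustration only: `run_bandit`, its toy regret model, and the arm count `N` are hypothetical stand-ins and not the authors' DDRTS-GLM implementation; the only details taken from the paper are the candidate grid {0.001, 0.01, 0.1, 1} and the fixed, untuned threshold γ = 1/(N + 1).

```python
import random

def run_bandit(v, T=200, seed=0):
    """Toy stand-in for one bandit run; returns a cumulative-regret proxy.

    A real evaluation would run the DDRTS-GLM update for T rounds; here the
    regret proxy simply scales with the exploration parameter v (hypothetical).
    """
    rng = random.Random(seed)
    regret = 0.0
    for _ in range(T):
        regret += abs(rng.gauss(0.0, v))
    return regret

N = 20                        # number of arms (illustrative choice)
gamma = 1.0 / (N + 1)         # resampling threshold: fixed, not tuned (per the paper)
grid = [0.001, 0.01, 0.1, 1]  # candidate exploration hyperparameters (per the paper)

# Select the best hyperparameter from the grid by (proxy) cumulative regret.
best_v = min(grid, key=run_bandit)
print(best_v, gamma)
```

The point of the sketch is the asymmetry the paper describes: the exploration hyperparameter is selected from a small grid per algorithm, while γ is deliberately excluded from tuning and pinned to 1/(N + 1).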