PG-TS: Improved Thompson Sampling for Logistic Contextual Bandits

Authors: Bianca Dumitrascu, Karen Feng, Barbara Engelhardt

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate and compare our PG-TS method with Laplace-TS in three scenarios: simulated data sets with parameters sampled from Gaussian and mixed-Gaussian distributions, a toy data set based on the Forest Cover Type data set from the UCI repository, and an offline evaluation method for bandit algorithms that relies on real-world log data (a replay-style sketch of such an evaluator appears after this table).
Researcher Affiliation | Academia | Bianca Dumitrascu (Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, biancad@princeton.edu); Karen Feng (Department of Computer Science, Princeton University, Princeton, NJ 08540, karenfeng@princeton.edu); Barbara E. Engelhardt (Department of Computer Science, Princeton University, Princeton, NJ 08540, bee@princeton.edu)
Pseudocode | Yes | Algorithm 1 (PG-TS); a hedged Python sketch of the algorithm appears after this table.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or provide a link to a code repository for the methodology described.
Open Datasets | Yes | We further compared these methods using the Forest Cover Type data from the UCI Machine Learning repository [8]. (A loading sketch appears after this table.)
Dataset Splits | No | The paper describes the sequential processing of data for online learning in a bandit setting and mentions the number of trials or events used, but it does not specify traditional train/validation/test splits as commonly found in supervised learning.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU or CPU models, memory, or cloud instance types used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming languages, libraries, or specific solvers).
Experiment Setup | Yes | We sample from the PG distribution [24, 27] including M = 100 burn-in steps. This number is empirically tuned... We set the hyperparameters b = 0 and B = I_10 (the 10 × 10 identity matrix). (A simulated-run example with these settings appears after this table.)
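
To make the cited pseudocode concrete, here is a minimal Python sketch of PG-TS as the paper describes it: Thompson sampling for a logistic contextual bandit in which the posterior over the coefficient vector theta is approximated by a Pólya-Gamma-augmented Gibbs sampler run for M burn-in steps per round. The truncated-series PG(1, c) sampler, all function and variable names, and every setting other than M = 100, b = 0, and B = I_10 are illustrative assumptions, not the authors' code (none was released).

```python
import numpy as np

def sample_pg(c, rng, trunc=200):
    """Approximate draw from PG(1, c) via the truncated infinite-sum
    representation omega = (1 / (2 pi^2)) * sum_k g_k / ((k - 1/2)^2
    + c^2 / (4 pi^2)) with g_k ~ Gamma(1, 1). A simplification; the
    paper cites exact samplers [24, 27]."""
    k = np.arange(1, trunc + 1)
    g = rng.gamma(shape=1.0, scale=1.0, size=trunc)
    return np.sum(g / ((k - 0.5) ** 2 + c ** 2 / (4 * np.pi ** 2))) / (2 * np.pi ** 2)

def gibbs_theta(X, y, b, B_inv, rng, burn_in=100):
    """One Polya-Gamma Gibbs chain: alternate omega | theta and
    theta | omega for `burn_in` sweeps (M = 100 in the paper) and
    return the final theta draw."""
    kappa = y - 0.5
    theta = rng.multivariate_normal(b, np.linalg.inv(B_inv))
    for _ in range(burn_in):
        omega = np.array([sample_pg(x @ theta, rng) for x in X])
        V = np.linalg.inv((X.T * omega) @ X + B_inv)   # posterior covariance
        m = V @ (X.T @ kappa + B_inv @ b)              # posterior mean
        theta = rng.multivariate_normal(m, V)
    return theta

def pg_ts(contexts_fn, reward_fn, T, d, rng, burn_in=100):
    """PG-TS loop: each round, sample theta from the Gibbs approximation
    of the posterior and play the arm with the highest x @ theta."""
    b, B_inv = np.zeros(d), np.eye(d)    # prior N(b, B) with b = 0, B = I_d
    X_hist, y_hist = [], []
    for t in range(T):
        arms = contexts_fn(t)            # array of shape (n_arms, d)
        if X_hist:
            theta = gibbs_theta(np.asarray(X_hist), np.asarray(y_hist),
                                b, B_inv, rng, burn_in)
        else:
            theta = rng.multivariate_normal(b, np.linalg.inv(B_inv))
        a = int(np.argmax(arms @ theta))
        X_hist.append(arms[a])
        y_hist.append(reward_fn(t, a))   # Bernoulli reward in {0, 1}
    return np.asarray(y_hist)
```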
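A usage example matching the simulated-Gaussian scenario and the quoted hyperparameters (d = 10 so that B = I_10, b = 0, M = 100 burn-in steps); it reuses the definitions above, and the horizon, arm count, and seed are arbitrary choices for illustration.

```python
rng = np.random.default_rng(0)
d, n_arms, T = 10, 5, 200
theta_true = rng.normal(size=d)                # Gaussian-sampled true parameters
arm_sets = rng.normal(size=(T, n_arms, d))     # fresh Gaussian contexts per round

def contexts_fn(t):
    return arm_sets[t]

def reward_fn(t, a):                           # logistic reward model
    p = 1.0 / (1.0 + np.exp(-arm_sets[t, a] @ theta_true))
    return float(rng.random() < p)

rewards = pg_ts(contexts_fn, reward_fn, T, d, rng, burn_in=100)
print("cumulative reward:", rewards.sum())
```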
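For the Forest Cover Type row, one convenient way to obtain the data is scikit-learn's `fetch_covtype`; using scikit-learn, and the one-arm-per-class reward construction below, are assumptions of this sketch rather than the paper's exact preprocessing.

```python
from sklearn.datasets import fetch_covtype
import numpy as np

cov = fetch_covtype()                  # 581,012 rows, 54 features, 7 classes
X_raw, labels = cov.data, cov.target   # labels take values 1..7

# Illustrative bandit construction: treat each of the 7 cover types as an
# arm and pay reward 1 iff the pulled arm matches the labeled cover type.
def covtype_reward(row, arm):
    return float(labels[row] == arm + 1)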
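Finally, the "offline evaluation method ... that relies on real-world log data" in the first row is, in spirit, a replay-style evaluator for logged bandit data. The sketch below shows that rejection idea under the assumption that logged arms were chosen uniformly at random; the `policy.choose` / `policy.update` interface is hypothetical.

```python
def replay_evaluate(log, policy):
    """Offline replay: walk through logged (contexts, chosen_arm, reward)
    triples; when the candidate policy picks the logged arm, count the
    logged reward and let the policy update, otherwise skip the event.
    Unbiased when the logging policy chose arms uniformly at random."""
    total, matched = 0.0, 0
    for contexts, logged_arm, reward in log:
        if policy.choose(contexts) == logged_arm:
            policy.update(contexts[logged_arm], reward)
            total += reward
            matched += 1
    return total / max(matched, 1)     # average reward over matched events
```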