Analysis and Design of Thompson Sampling for Stochastic Partial Monitoring
Authors: Taira Tsuchiya, Junya Honda, Masashi Sugiyama
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we compare the performance of TSPM with existing algorithms in numerical experiments, and show that TSPM outperforms existing algorithms. |
| Researcher Affiliation | Academia | Taira Tsuchiya (The University of Tokyo, RIKEN AIP; tsuchiya@ms.k.u-tokyo.ac.jp); Junya Honda (The University of Tokyo, RIKEN AIP; honda@edu.k.u-tokyo.ac.jp); Masashi Sugiyama (RIKEN AIP, The University of Tokyo; sugi@k.u-tokyo.ac.jp) |
| Pseudocode | Yes | Algorithm 1: TSPM Algorithm; Algorithm 2: Accept-Reject Sampling; Algorithm 3: Sampling from g_t(p) |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | The paper describes a 'dynamic pricing problem' and 'dp-easy and dp-hard games' as its experimental setup, which are simulated environments rather than publicly available datasets with direct access information or formal citations. |
| Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits in terms of percentages, sample counts, or references to predefined splits. It describes the simulation setup (e.g., time horizon, number of trials) but not data partitioning for a dataset. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | For TSPM, we set λ = 0.001, and R was selected from {0.01, 1.0}. ... sampling from the proposal distribution in Algorithm 3, we used an initialization that takes each action n = 10A times. |
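
The pseudocode row lists an accept-reject sampling routine (Algorithm 2). As a generic illustration of that technique only, and not a reproduction of the paper's algorithm, the sketch below draws samples from an unnormalized density using a uniform proposal. The function names, the toy target `x * (1 - x)`, and the bound `0.25` are all assumptions chosen for this example; none of them come from the paper.

```python
import random


def accept_reject(target, proposal_sample, proposal_density, bound, rng,
                  max_tries=100_000):
    """Draw one sample from the density proportional to `target`.

    Assumes target(x) <= bound * proposal_density(x) for every x in the
    support. All names here are hypothetical, for illustration only.
    """
    for _ in range(max_tries):
        x = proposal_sample(rng)          # candidate from the proposal
        u = rng.random()                  # uniform draw on [0, 1)
        # Accept with probability target(x) / (bound * proposal_density(x)).
        if u * bound * proposal_density(x) <= target(x):
            return x
    raise RuntimeError("no candidate accepted within max_tries")


def unnormalized_target(x):
    # Toy target on [0, 1]: proportional to a Beta(2, 2) density.
    return x * (1.0 - x)


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [
    accept_reject(
        unnormalized_target,
        lambda r: r.random(),   # uniform proposal on [0, 1)
        lambda x: 1.0,          # its density
        0.25,                   # max of x * (1 - x) on [0, 1]
        rng,
    )
    for _ in range(2000)
]
sample_mean = sum(samples) / len(samples)
print(f"sample mean: {sample_mean:.3f}")
```

With this toy target the sample mean should land close to 0.5, the mean of a Beta(2, 2) distribution. The bound only needs to dominate the ratio of target to proposal density; a tighter bound raises the acceptance rate without changing the sampled distribution.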