Transportability for Bandits with Data from Different Environments
Authors: Alexis Bellot, Alan Malek, Silvia Chiappa
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on several synthetic scenarios inspired by the literature on clinical trials and advertising. We compare Thompson sampling with additional data sources (tTS, Alg. 1) with Thompson sampling with uninformative priors (TS) [38], a KL-UCB [9] algorithm with uninformative priors (UCB), and as a baseline also include the algorithm that chooses actions uniformly at random (Uniform). For all algorithms, we measure their regrets $R_T$, averaged over 10 repetitions. |
| Researcher Affiliation | Industry | Alexis Bellot, Alan Malek, Silvia Chiappa; Google DeepMind, London, UK; abellot@google.com |
| Pseudocode | Yes | Algorithm 1: Thompson Sampling with Transportability (tTS). Input: selection diagrams $\{\mathcal{G}^{*,a}, \mathcal{G}^{*,b}, \dots\}$, prior data $v := (v^a, v^b, \dots)$, decision variable $X$, reward variable $Y$, horizon $T$. For rounds $t = 1, 2, \dots, T$ do: approximate $P(\xi, \theta \mid v, v_{x^{(1)}}, \dots, v_{x^{(t-1)}})$; sample $\xi^{(t)}, \theta^{(t)} \sim P(\xi, \theta \mid v, v_{x^{(1)}}, \dots, v_{x^{(t-1)}})$; set $x^{(t)} \leftarrow \arg\max_x \mathbb{E}_P[Y_x \mid \xi^{(t)}, \theta^{(t)}]$; take action $x^{(t)}$ and observe $v_{x^{(t)}}$ in $\pi^*$. End for. |
| Open Source Code | No | No mention of code availability or repository links for the described methodology. |
| Open Datasets | No | We evaluate the proposed approach on several synthetic scenarios inspired by the literature on clinical trials and advertising. |
| Dataset Splits | No | Specifically, with this model, 1000 prior data samples are given from an environment $\pi^a$ that differs in the causal assignment of $Z$ in comparison with the deployment environment $\pi^*$. (This describes the source of the data, not specific training/validation/test splits.) |
| Hardware Specification | No | No specific hardware details are mentioned for running the experiments. |
| Software Dependencies | No | No software names with version numbers are provided. |
| Experiment Setup | No | Details on all data generating mechanisms and a discussion on mis-specification and limitations of the proposed approach can be found in Appendix D and Appendix B, respectively. (These provide context for the experiments but lack specific hyperparameter values for the algorithms themselves). |
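
The sample-then-act loop of Algorithm 1 and the regret comparison described under Research Type can be illustrated with a much simpler stand-in. The sketch below is not the paper's tTS: it assumes a plain Bernoulli bandit with independent Beta posteriors, and it models "prior data" as hypothetical Beta pseudo-counts rather than as a posterior transported through selection diagrams. The function names, arm means, horizon, and pseudo-counts are all illustrative assumptions, not values from the paper.

```python
import numpy as np

def thompson_sampling(true_means, horizon, prior_alpha, prior_beta, rng):
    """Beta-Bernoulli Thompson sampling; returns cumulative pseudo-regret."""
    true_means = np.asarray(true_means, dtype=float)
    alpha = np.array(prior_alpha, dtype=float)  # prior successes per arm
    beta = np.array(prior_beta, dtype=float)    # prior failures per arm
    best = true_means.max()
    regret = 0.0
    for _ in range(horizon):
        theta = rng.beta(alpha, beta)           # sample parameters from the posterior
        arm = int(np.argmax(theta))             # act greedily on the sampled parameters
        reward = float(rng.random() < true_means[arm])
        alpha[arm] += reward                    # conjugate posterior update
        beta[arm] += 1.0 - reward
        regret += best - true_means[arm]
    return regret

def uniform_policy(true_means, horizon, rng):
    """Baseline that picks arms uniformly at random; returns pseudo-regret."""
    true_means = np.asarray(true_means, dtype=float)
    arms = rng.integers(0, len(true_means), size=horizon)
    return float(np.sum(true_means.max() - true_means[arms]))

def average_regret(run, repetitions=10, seed=0):
    """Average the regret returned by run(rng) over independent repetitions."""
    rng = np.random.default_rng(seed)
    return float(np.mean([run(rng) for _ in range(repetitions)]))

# Hypothetical three-armed problem (values chosen for illustration only).
means = [0.3, 0.5, 0.7]
T = 500

# Uninformative Beta(1, 1) priors, analogous in spirit to the TS baseline.
ts_regret = average_regret(
    lambda rng: thompson_sampling(means, T, [1, 1, 1], [1, 1, 1], rng))

# Hypothetical informative pseudo-counts standing in for prior data.
informed_regret = average_regret(
    lambda rng: thompson_sampling(means, T, [3, 5, 7], [7, 5, 3], rng))

uniform_regret = average_regret(lambda rng: uniform_policy(means, T, rng))

print(ts_regret, informed_regret, uniform_regret)
```

Averaging over 10 repetitions mirrors the evaluation protocol quoted in the table. In this toy setting the informative pseudo-counts happen to agree with the true arm means; the paper's contribution concerns when and how data from a different environment can legitimately inform such a posterior, which this sketch does not attempt to model.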