Causal Bandits: Learning Good Interventions via Causal Inference

Authors: Finnian Lattimore, Tor Lattimore, Mark D. Reid

NeurIPS 2016

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments: We compare Algorithms 1 and 2 with the Successive Reject algorithm of Audibert and Bubeck (2010), Thompson Sampling and UCB under a variety of conditions. ... For each experiment, we show the average regret over 10,000 simulations with error bars displaying three standard errors. |
| Researcher Affiliation | Collaboration | Finnian Lattimore, Australian National University and Data61/NICTA, finn.lattimore@gmail.com; Tor Lattimore, Indiana University, Bloomington, tor.lattimore@gmail.com; Mark D. Reid, Australian National University and Data61/NICTA, mark.reid@anu.edu.au |
| Pseudocode | Yes | Algorithm 1 Parallel Bandit Algorithm |
| Open Source Code | Yes | The code is available from <https://github.com/finnhacks42/causal_bandits> |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset was found. The paper describes using a synthetic model for its experiments: 'Throughout we use a model in which Y depends only on a single variable X1 (this is unknown to the algorithms). $Y_t \sim \mathrm{Bernoulli}(\tfrac{1}{2} + \epsilon)$ if $X_1 = 1$ and $Y_t \sim \mathrm{Bernoulli}(\tfrac{1}{2} - \epsilon')$ otherwise, where $\epsilon' = q_1 \epsilon / (1 - q_1)$.' |
| Dataset Splits | No | No specific dataset split information (percentages, sample counts, citations to predefined splits) was provided. The paper describes a sequential decision problem where data is collected iteratively over 'T rounds' rather than using pre-defined dataset splits. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running experiments were mentioned. |
| Software Dependencies | No | No specific ancillary software details (e.g., library or solver names with version numbers) were mentioned. |
| Experiment Setup | Yes | For the first T/2 rounds it chooses do() to collect observational data. ... In Figure 2a we fix the number of variables N and the horizon T and compare the performance of the algorithms as m increases. ... Throughout we use a model in which Y depends only on a single variable X1 (this is unknown to the algorithms). $Y_t \sim \mathrm{Bernoulli}(\tfrac{1}{2} + \epsilon)$ if $X_1 = 1$ and $Y_t \sim \mathrm{Bernoulli}(\tfrac{1}{2} - \epsilon')$ otherwise, where $\epsilon' = q_1 \epsilon / (1 - q_1)$. ... Input: Total rounds T and N. (Algorithm 1) ... Input: $T$, $\eta \in [0, 1]^{A}$, $B \in [0, \infty)^{A}$ (Algorithm 2) |
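
The synthetic reward model and the error-bar convention quoted above can be illustrated with a small simulation. The sketch below is not the authors' code (that is in the linked repository) and does not implement Algorithm 1 or 2. It reads the reward probabilities as 1/2 + ε when X1 = 1 and 1/2 − ε′ otherwise, with ε′ = q1ε/(1 − q1), and it assumes that q1 denotes P(X1 = 1), which this summary does not state. All function names and the example values q1 = 0.2, ε = 0.3 are hypothetical.

```python
import math
import random


def expected_rewards(q1, eps):
    """Expected reward when X1 is set to 1, set to 0, or left alone.

    Assumption: q1 = P(X1 = 1) when X1 is not intervened on.
    """
    eps_prime = q1 * eps / (1 - q1)
    mu_do_1 = 0.5 + eps
    mu_do_0 = 0.5 - eps_prime
    # Leaving X1 at its natural distribution averages the two cases.
    mu_other = q1 * mu_do_1 + (1 - q1) * mu_do_0
    return mu_do_1, mu_do_0, mu_other


def sample_reward(x1, q1, eps, rng):
    """Draw Y ~ Bernoulli(1/2 + eps) if x1 == 1, else Bernoulli(1/2 - eps')."""
    eps_prime = q1 * eps / (1 - q1)
    p = 0.5 + eps if x1 == 1 else 0.5 - eps_prime
    return 1 if rng.random() < p else 0


def observational_estimates(half_rounds, q1, eps, rng):
    """Play the empty intervention do() and estimate E[Y | X1 = x].

    This mimics only the data-collection phase ("first T/2 rounds"),
    not the arm-selection logic of any algorithm in the paper.
    """
    totals = {0: 0, 1: 0}
    counts = {0: 0, 1: 0}
    for _ in range(half_rounds):
        x1 = 1 if rng.random() < q1 else 0  # X1 takes its natural value
        counts[x1] += 1
        totals[x1] += sample_reward(x1, q1, eps, rng)
    return {x: totals[x] / counts[x] for x in (0, 1) if counts[x] > 0}


def mean_and_three_se(values):
    """Return the sample mean and three standard errors (the error-bar width)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)  # unbiased variance
    return mean, 3 * math.sqrt(var / n)


if __name__ == "__main__":
    rng = random.Random(0)
    print(expected_rewards(0.2, 0.3))
    print(observational_estimates(20000, 0.2, 0.3, rng))
    print(mean_and_three_se([sample_reward(1, 0.2, 0.3, rng) for _ in range(10000)]))
```

Under the stated assumption, the choice of ε′ makes the expected reward exactly 1/2 for any action that leaves X1 untouched, so only an intervention on X1 changes the reward.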