Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Structural Causal Bandits: Where to Intervene?
Authors: Sanghack Lee, Elias Bareinboim
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present empirical results demonstrating that the selection of arms based on POMISs makes standard MAB solvers converge faster to an optimal arm. |
| Researcher Affiliation | Academia | Sanghack Lee, Department of Computer Science, Purdue University, EMAIL; Elias Bareinboim, Department of Computer Science, Purdue University, EMAIL |
| Pseudocode | Yes | Algorithm 1: Algorithm enumerating all POMISs with ⟦G, Y⟧ |
| Open Source Code | Yes | All the code is available at https://github.com/sanghack81/SCMMAB-NIPS2018 |
| Open Datasets | No | The paper uses synthetic SCM-MAB instances that are generated and parameterized according to specifications in Appendix D, rather than referring to a named, publicly available dataset with concrete access information or citation. |
| Dataset Splits | No | The paper describes a multi-armed bandit simulation setup and does not refer to traditional train/validation/test dataset splits, which are not applicable in this context. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or processing units used for running the simulations. |
| Software Dependencies | No | The paper mentions using 'kl-UCB' and 'Thompson sampling' algorithms, but it does not specify any software dependencies with version numbers (e.g., Python version, library versions). |
| Experiment Setup | Yes | We set the horizon large enough so as to observe near convergence, and repeat each simulation 300 times. We set the horizon T = 1000 for Task 1 and Task 2, and T = 10000 for Task 3. |
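
The experiment setup quoted in the table (a fixed horizon T, 300 repetitions, and a standard solver such as Thompson sampling) can be illustrated with a minimal sketch. This is not the authors' code, which is in the repository linked above. It assumes a plain Bernoulli bandit, and the arm means in `ALL_ARMS` and `REDUCED_ARMS` are hypothetical values chosen only to show how a smaller arm set affects cumulative regret; they are not derived from any SCM or task in the paper.

```python
import random


def thompson_run(arm_means, horizon, rng):
    """One Thompson sampling run on a Bernoulli bandit; returns cumulative regret."""
    k = len(arm_means)
    successes = [0] * k
    failures = [0] * k
    best = max(arm_means)
    regret = 0.0
    for _ in range(horizon):
        # Draw one sample per arm from its Beta(1 + s, 1 + f) posterior.
        samples = [
            rng.betavariate(1 + successes[a], 1 + failures[a]) for a in range(k)
        ]
        arm = max(range(k), key=samples.__getitem__)
        reward = 1 if rng.random() < arm_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        # Expected (pseudo-)regret of the chosen arm.
        regret += best - arm_means[arm]
    return regret


def mean_regret(arm_means, horizon=1000, repeats=300, seed=0):
    """Average cumulative regret over independent repetitions."""
    rng = random.Random(seed)
    total = sum(thompson_run(arm_means, horizon, rng) for _ in range(repeats))
    return total / repeats


# Hypothetical arm means (assumption, not taken from the paper).
ALL_ARMS = [0.30, 0.40, 0.45, 0.50, 0.35, 0.70]
# Hypothetical reduced arm set that still contains the best arm.
REDUCED_ARMS = [0.50, 0.70]

if __name__ == "__main__":
    print("all arms    :", mean_regret(ALL_ARMS))
    print("reduced arms:", mean_regret(REDUCED_ARMS))
```

Calling `mean_regret` with `horizon=10000` would mirror the longer horizon reported for Task 3. The sketch only mimics the repetition and horizon structure; selecting which arms to keep is the job of the paper's POMIS enumeration and is not reproduced here.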