Characterizing Optimal Mixed Policies: Where to Intervene and What to Observe
Authors: Sanghack Lee, Elias Bareinboim
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we investigate several properties of the class of mixed policies and provide an efficient and effective characterization, including optimality and non-redundancy. Specifically, we introduce a graphical criterion to identify unnecessary contexts for a set of actions, leading to a natural characterization of non-redundancy of mixed policies. We then derive sufficient conditions under which one strategy can dominate the other with respect to their maximum achievable expected rewards (optimality). This characterization leads to a fundamental understanding of the space of mixed policies and a possible refinement of the agent's strategy so that it converges to the optimum faster and more robustly. |
| Researcher Affiliation | Academia | Sanghack Lee, Elias Bareinboim; Causal Artificial Intelligence Laboratory, Columbia University; {sanghacklee,eb}@cs.columbia.edu |
| Pseudocode | Yes | We provide an efficient algorithm for obtaining a unique, maximal, non-redundant MPS (nr-mps, Alg. 2) of a given MPS in Appendix E [28]. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not mention using any specific datasets for training or provide information about their availability. |
| Dataset Splits | No | The paper is theoretical and does not describe any experimental setups involving training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any computational experiments or the hardware used to run them. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers that would be needed to replicate experimental results. |
| Experiment Setup | No | The paper is theoretical and does not describe any experimental setups, hyperparameters, or training configurations. |