Planning and Learning with Stochastic Action Sets

Authors: Craig Boutilier, Alon Cohen, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, we offer a simple empirical demonstration of the importance of accounting for stochastic action availability when computing an MDP policy. Additional discussion and full proofs of all results can be found in a longer version of this paper [Boutilier et al., 2018].
Researcher Affiliation | Industry | Craig Boutilier, Alon Cohen, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov and Dale Schuurmans Google Research {cboutilier,aloncohen,avinatan,mansour,meshi,schuurmans}@google.com
Pseudocode | No | No clearly labeled pseudocode or algorithm blocks were found. Algorithms are described in paragraph form.
Open Source Code | No | The paper does not provide any concrete access to source code for the described methodology. It mentions open-source in the context of existing tools but not for their own implementation.
Open Datasets | No | The paper uses "a real-world road network (Fig. 1) in the San Francisco Bay Area" for its empirical illustration but does not provide access information (link, DOI, citation) for this specific dataset.
Dataset Splits | No | The empirical illustration describes a routing problem without specifying dataset splits (e.g., training, validation, test percentages or sample counts).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments are mentioned.
Software Dependencies | No | No specific software dependencies with version numbers are mentioned.
Experiment Setup | Yes | The optimal policies for different choices p = 0.1, 0.2 and 0.4 are depicted in Fig. 1, where line thickness and color indicate traversal probabilities under the corresponding optimal policies. We see that lower values of p lead to policies with more redundancy (i.e., more alternate routes).
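
The role of p in the experiment setup can be illustrated with a toy routing problem. The sketch below is not the paper's implementation, which this summary does not describe. It assumes that each outgoing road is independently available with probability p at every step, that the agent can always wait at a cost of 1, and that it otherwise takes the cheapest available road. The graph `ROADS` and the function `expected_costs` are hypothetical names introduced only for this illustration.

```python
def expected_costs(edges, goal, p, tol=1e-10, max_iters=100_000):
    """Value iteration for expected travel cost to `goal`.

    Assumption (not from the paper): every outgoing edge is available
    independently with probability p, and waiting always costs 1.
    """
    values = {node: 0.0 for node in edges}
    for _ in range(max_iters):
        delta = 0.0
        for node, out in edges.items():
            if node == goal:
                continue
            wait_q = 1.0 + values[node]  # cost of waiting one step here
            # Q-values of the outgoing edges, cheapest first.
            qs = sorted(cost + values[nxt] for nxt, cost in out)
            new_v = 0.0
            none_better = 1.0  # probability that all cheaper edges are unavailable
            for q in qs:
                # This edge is the cheapest available one with
                # probability none_better * p; waiting is the fallback.
                new_v += none_better * p * min(q, wait_q)
                none_better *= 1.0 - p
            new_v += none_better * wait_q  # no edge available: wait
            delta = max(delta, abs(new_v - values[node]))
            values[node] = new_v
        if delta < tol:
            break
    return values


# Hypothetical four-node network: two routes from "s" to the goal "g".
ROADS = {
    "s": [("a", 1.0), ("b", 2.0)],
    "a": [("g", 1.0)],
    "b": [("g", 1.0)],
    "g": [],
}

if __name__ == "__main__":
    for p in (0.1, 0.2, 0.4):
        print(p, expected_costs(ROADS, "g", p)["s"])
```

In this toy model the longer route through "b" is used only when the road to "a" is unavailable, so alternate routes carry more of the traffic as p decreases. This matches the qualitative trend reported for Fig. 1, though the numbers themselves say nothing about the paper's road network.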