Scalable Initial State Interdiction for Factored MDPs
Authors: Swetasudha Panda, Yevgeniy Vorobeychik
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our approaches. Section 8 (Experiments): We evaluate our MDP interdiction algorithms on several instances of three problem domains from the international planning competition (IPC 2014)... |
| Researcher Affiliation | Academia | Swetasudha Panda and Yevgeniy Vorobeychik, Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN; {swetasudha.panda,yevgeniy.vorobeychik}@vanderbilt.edu |
| Pseudocode | Yes | Algorithm 1: Interdiction using Linear Action-Value Function Learning; Algorithm 2: Non-Linear Value Function Learning and Greedy Local Search; Algorithm 3: Interdiction with Local Linear Approximation |
| Open Source Code | No | The paper does not contain an unambiguous statement where the authors release the code for the work described in this paper, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | We evaluate our MDP interdiction algorithms on several instances of three problem domains from the international planning competition (IPC 2014): a) sysadmin b) academic advising and c) wildfire. |
| Dataset Splits | No | The paper discusses concepts like 'train', 'validation', and 'test' in the context of general machine learning (e.g., 'To incorporate generalization, linear and non-linear function approximation are commonly used'), and mentions 'batch size' for learning. However, it does not provide specific dataset split percentages, sample counts, or references to predefined splits for the experimental data used in their evaluations on the sysadmin, academic advising, and wildfire domains. |
| Hardware Specification | Yes | The experiments were run on a 2.4GHz hyperthreaded 8-core Ubuntu Linux machine with 16 GB RAM |
| Software Dependencies | Yes | CPLEX version 12.51 for MILP instances and TensorFlow for learning algorithms [Abadi et al., 2016]. |
| Experiment Setup | Yes | We train the learning algorithms with $\epsilon_0 = 1$, $\eta = 0.01$ and the RMSProp optimizer for the neural networks. The batch size $\lvert \hat{D} \rvert$ increases from 40 to 400 with problem size. |
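
For anyone attempting a reproduction, the values quoted in the Experiment Setup row can be gathered into a small configuration object. The sketch below is illustrative only and is not the authors' code. The class name `ExperimentSetup`, its field names, and the `batch_size` helper are hypothetical. Reading ϵ0 as an initial exploration rate and η as a learning rate is an assumption, and so is the linear schedule: the paper states only that the batch size grows from 40 to 400 with problem size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""

    epsilon0: float = 1.0        # reported as epsilon_0 = 1 (assumed: initial exploration rate)
    eta: float = 0.01            # reported as eta = 0.01 (assumed: learning rate)
    optimizer: str = "RMSProp"   # reported optimizer for the neural networks
    min_batch: int = 40          # smallest reported batch size
    max_batch: int = 400         # largest reported batch size

    def batch_size(self, fraction: float) -> int:
        """Hypothetical schedule: interpolate linearly between the reported
        extremes, where `fraction` in [0, 1] is a normalized problem size.
        Values outside [0, 1] are clamped to the nearest extreme."""
        fraction = min(1.0, max(0.0, fraction))
        return round(self.min_batch + fraction * (self.max_batch - self.min_batch))


setup = ExperimentSetup()
# Batch sizes for the smallest, a mid-sized, and the largest problem instance.
sizes = [setup.batch_size(f) for f in (0.0, 0.5, 1.0)]
print(sizes)
```

Keeping the reported values as defaults on a frozen dataclass makes any deviation from the published setup explicit at the call site.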