Steady-State Policy Synthesis for Verifiable Control
Authors: Alvaro Velasquez
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results are validated using CPLEX simulations on MDPs with over 10000 states. |
| Researcher Affiliation | Academia | Alvaro Velasquez, Information Directorate, Air Force Research Laboratory, alvaro.velasquez.1@us.af.mil |
| Pseudocode | No | The paper presents mathematical programs (e.g., Programs (5), (6), and (7)) that describe the optimization problem, but these are not structured as pseudocode or an algorithm block with step-by-step instructions and control flow. (A hedged sketch of a generic program of this kind appears after the table.) |
| Open Source Code | No | The paper does not provide any links to open-source code, nor does it explicitly state that code will be released or is available. |
| Open Datasets | No | The paper describes how LMDP instances were 'defined' for simulations by generating random states, actions, transitions, and labels for various state-space sizes. It does not use a publicly available, named dataset, nor does it provide concrete access information for a generated dataset in a downloadable format. |
| Dataset Splits | No | The paper describes how problem instances were generated for simulations, but it does not specify training, validation, or test splits for any dataset, as the experiments involve solving generated problem instances rather than training machine learning models on a dataset. |
| Hardware Specification | Yes | Simulations of program (7) were performed using CPLEX version 12.8 [CPL, 2017] on a machine with a 3.6 GHz Intel Core i7-6850K processor and 128 GB of RAM. |
| Software Dependencies | Yes | Simulations of program (7) were performed using CPLEX version 12.8 [CPL, 2017] on a machine with a 3.6 GHz Intel Core i7-6850K processor and 128 GB of RAM. |
| Experiment Setup | Yes | Each LMDP M = (S, A, T, R, L, Φ_L) instance was defined as follows for various state-space sizes |S|. There are four actions associated with each state, and taking an action causes a transition to one of two possible random states. Each state-action pair observes a random reward in {1, 2, 3, 4}. Two labels L1, L2 ⊆ S with |L1| = |L2| = log(|S|) were randomly defined for each instance and used for the steady-state constraints Φ_L = {(L1, [10/|S|, 1000/|S|]), (L2, {0})}. (A generation sketch appears directly after the table.) |
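
As a concrete reading of the Experiment Setup row, here is a minimal sketch of how such random LMDP instances could be generated. All names (`random_lmdp`, `succ_per_action`), the logarithm base, and the way probability mass is split over the two successors are assumptions for illustration; this is not the paper's code.

```python
import math
import random

def random_lmdp(num_states, num_actions=4, succ_per_action=2, seed=0):
    """Sketch of the instance generation described in the Experiment Setup
    row: four actions per state, two random successors per action, rewards
    in {1, 2, 3, 4}, and two random label sets of size log(|S|)."""
    rng = random.Random(seed)
    T, R = {}, {}
    for s in range(num_states):
        for a in range(num_actions):
            succs = rng.sample(range(num_states), succ_per_action)
            p = rng.uniform(0.1, 0.9)  # assumed split over the two successors
            T[(s, a)] = [(succs[0], p), (succs[1], 1.0 - p)]
            R[(s, a)] = rng.choice([1, 2, 3, 4])
    # Two random label sets of size log(|S|); base of the log is not
    # specified in the paper, natural log is assumed here.
    k = max(1, round(math.log(num_states)))
    L1 = set(rng.sample(range(num_states), k))
    L2 = set(rng.sample(range(num_states), k))
    # Steady-state constraints Phi_L as (label set, [lo, hi]) pairs;
    # (L2, {0}) is encoded as the degenerate interval [0, 0].
    phi = [(L1, (10 / num_states, 1000 / num_states)), (L2, (0.0, 0.0))]
    return T, R, phi
```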
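To make the "mathematical programs" remark in the Pseudocode row concrete, below is a hedged sketch of a generic steady-state occupancy-measure LP for such an instance: maximize expected steady-state reward over occupancy variables x(s, a) subject to stationarity, normalization, and label-interval constraints. It is not a reproduction of the paper's Programs (5)-(7), and it uses scipy rather than CPLEX; every function and variable name here is an assumption.

```python
import numpy as np
from scipy.optimize import linprog

def steady_state_lp(num_states, num_actions, T, R, phi):
    """Generic steady-state occupancy LP sketch (not the paper's exact
    programs). Variables x(s, a) >= 0 are steady-state occupancies."""
    n = num_states * num_actions
    idx = lambda s, a: s * num_actions + a
    # linprog minimizes, so negate rewards to maximize expected reward.
    c = np.array([-R[(s, a)] for s in range(num_states)
                             for a in range(num_actions)])
    # Stationarity: sum_a x(s,a) - sum_{s',a'} T(s',a',s) x(s',a') = 0,
    # plus one normalization row: total occupancy sums to 1.
    A_eq = np.zeros((num_states + 1, n))
    for s in range(num_states):
        for a in range(num_actions):
            A_eq[s, idx(s, a)] += 1.0
    for (sp, ap), succs in T.items():
        for s, p in succs:
            A_eq[s, idx(sp, ap)] -= p
    A_eq[num_states, :] = 1.0
    b_eq = np.zeros(num_states + 1)
    b_eq[num_states] = 1.0
    # Label constraints: lo <= sum_{s in L, a} x(s, a) <= hi.
    A_ub, b_ub = [], []
    for L, (lo, hi) in phi:
        row = np.zeros(n)
        for s in L:
            for a in range(num_actions):
                row[idx(s, a)] = 1.0
        A_ub.append(row)
        b_ub.append(hi)
        A_ub.append(-row)
        b_ub.append(-lo)
    return linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                   A_eq=A_eq, b_eq=b_eq, bounds=(0, None))

# Usage with the generator above (hypothetical):
#   T, R, phi = random_lmdp(64)
#   res = steady_state_lp(64, 4, T, R, phi)
#   if res.success:
#       print("max expected steady-state reward:", -res.fun)
```

Note that with tight label intervals a randomly generated instance may be infeasible, in which case `res.success` is False; the paper's simulations solve its own programs with CPLEX 12.8 rather than scipy.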