Beyond Strong-Cyclic: Doing Your Best in Stochastic Environments
Authors: Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin, Florian Zuleger
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We introduce best-effort policies for (not necessarily Markovian) stochastic domains. These generalize strong-cyclic policies by taking advantage of stochasticity even if the goal cannot be reached with probability 1. We compare such policies with optimal policies, i.e., policies that maximize the probability that the goal is achieved, and show that optimal policies are best-effort, but that the converse is false in general. All this shows that best-effort policies are robust to changes in the probabilities, as long as the support is unchanged. Theorem 1. There exists an SBE policy for (D, G). The proof of this result is an adaptation of the fact that best-effort policies exist in the context of synthesis under assumptions [Aminof et al., 2020a]. |
| Researcher Affiliation | Academia | Benjamin Aminof (1), Giuseppe De Giacomo (2), Sasha Rubin (3) and Florian Zuleger (1); (1) TU Vienna, Austria; (2) University of Rome La Sapienza, Italy; (3) University of Sydney, Australia. {aminof, zuleger}@forsyte.at, degiacomo@diag.uniroma1.it, sasha.rubin@sydney.edu.au |
| Pseudocode | No | The algorithm for Part 1 is simple. Given the support δ : Obs × Act → 2^Obs \ {∅} and an LTL/LTLf formula φ: Step 1. Fix a finite MDP D = (F, A, s0, Pr) by letting Pr(s, a) be, e.g., the uniform distribution over δ(s, a): Pr(s, a)(s') := 1/|δ(s, a)| for s' ∈ δ(s, a). Step 2. Return a finite-state optimal policy σ for the stochastic planning problem (D, φ). This can be computed using any standard technique, e.g., [Bianco and de Alfaro, 1995]. This is a textual description of steps, not a structured pseudocode block (see the hedged code sketch after this table). |
| Open Source Code | No | The paper is theoretical and does not mention releasing any source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not use or reference any datasets for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not involve dataset splits (training, validation, test) for empirical evaluation. |
| Hardware Specification | No | The paper is theoretical and does not specify any hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not specify software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations. |
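As noted in the Pseudocode row, the paper gives only a two-step textual procedure rather than structured pseudocode. Below is a minimal, hedged Python sketch of those two steps, assuming the LTL/LTLf goal is simplified to a plain reachability goal so that Step 2 can be done by value iteration; all function and variable names (`uniform_mdp`, `optimal_reach_policy`, the toy states `s0`/`g`) are invented for illustration and are not from the paper, which defers Step 2 to standard techniques such as [Bianco and de Alfaro, 1995].

```python
# Hedged sketch (not from the paper): Step 1 builds a concrete MDP from the
# support by assigning uniform probabilities; Step 2 computes a policy that
# maximizes the probability of reaching a goal set (a simplification of the
# general LTL/LTLf case handled in the paper).
from typing import Dict, Hashable, Set, Tuple

State = Hashable
Action = Hashable
# Support function: delta(s, a) = non-empty set of possible successor states.
Support = Dict[Tuple[State, Action], Set[State]]


def uniform_mdp(support: Support) -> Dict[Tuple[State, Action], Dict[State, float]]:
    """Step 1: fix Pr(s, a) as the uniform distribution over delta(s, a)."""
    return {(s, a): {t: 1.0 / len(succ) for t in succ}
            for (s, a), succ in support.items()}


def optimal_reach_policy(support: Support, goal: Set[State],
                         iters: int = 1000) -> Dict[State, Action]:
    """Step 2 (simplified): maximize the probability of reaching `goal` on the
    uniform MDP by value iteration, then act greedily on the computed values."""
    pr = uniform_mdp(support)
    states = ({s for (s, _) in support}
              | {t for succ in support.values() for t in succ})
    value = {s: (1.0 if s in goal else 0.0) for s in states}
    for _ in range(iters):
        for s in states:
            acts = [a for (s2, a) in support if s2 == s]
            if s in goal or not acts:
                continue
            value[s] = max(sum(p * value[t] for t, p in pr[(s, a)].items())
                           for a in acts)
    policy: Dict[State, Action] = {}
    for s in states:
        acts = [a for (s2, a) in support if s2 == s]
        if s in goal or not acts:
            continue
        policy[s] = max(acts, key=lambda a: sum(p * value[t]
                                                for t, p in pr[(s, a)].items()))
    return policy


# Toy usage: from s0, "try" either loops back to s0 or reaches the goal g,
# while "stay" never leaves s0; the computed policy picks "try".
toy_support: Support = {("s0", "try"): {"s0", "g"}, ("s0", "stay"): {"s0"}}
print(optimal_reach_policy(toy_support, goal={"g"}))  # {'s0': 'try'}
```

The uniform distribution in Step 1 is just one concrete choice; the passage quoted in the Research Type row notes that best-effort policies are robust to changes in the probabilities as long as the support is unchanged.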