Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Raising Expectations in GDA Agents Acting in Dynamic Environments
Authors: Dustin Dannenhauer, Hector Munoz-Avila
IJCAI 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical validation in two variants of domains used in the GDA literature. We evaluate our GDA agent versus alternative GDA agents that either check immediate effects or check for expected states. Our experiments demonstrate improved performance of our GDA agent. |
| Researcher Affiliation | Academia | Dustin Dannenhauer and Hector Munoz-Avila Department of Computer Science and Engineering Lehigh University Bethlehem, PA 18015 USA EMAIL |
| Pseudocode | Yes | The pseudocode for calculating informed expectations is described in Algorithm 1. |
| Open Source Code | No | No statement regarding open-source code availability or repository link found. |
| Open Datasets | No | We use two domains from GDA literature. The first domain, which we call Marsworld, is inspired from Mudworld from [Molineaux and Aha, 2014]. The second domain is a slight variant of the Arsonist domain from Paisner et al [Paisner et al., 2013]. |
| Dataset Splits | No | The paper describes experimental domains but does not provide specific train/validation/test dataset splits. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory) used for experiments are provided. |
| Software Dependencies | No | We use HTN task decomposition as in the SHOP planner [Nau et al., 1999] and implemented in the Python version, Pyhop. |
| Experiment Setup | Yes | The following parameters were used in the Marsworld setup: the grid was 10 by 10, the probability of mud was 10%, all distances from start to destination were at least 5 tiles, and magnetic radiation clouds had a 10% probability per turn per tile to appear. The following parameters were used in the Arsonist domain: the domain contained 20 blocks, the start state had every block on the table, each goal was randomly generated where there were 3 towers each with 3 blocks, and the probability of fire was 10%. |
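
The experiment setup parameters quoted in the table can be collected into a small configuration sketch. This is an illustration only, not the authors' code (no code release was found). The class and function names are hypothetical, the start-to-destination distance is assumed to be Manhattan distance, and mud is assumed to be sampled independently per tile; the paper excerpt does not confirm either assumption.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class MarsworldConfig:
    """Marsworld parameters as quoted in the table (names are hypothetical)."""
    grid_size: int = 10                 # 10 by 10 grid
    mud_probability: float = 0.10       # probability of mud
    min_start_goal_distance: int = 5    # at least 5 tiles from start to destination
    cloud_probability: float = 0.10     # per turn, per tile


@dataclass(frozen=True)
class ArsonistConfig:
    """Arsonist parameters as quoted in the table (names are hypothetical)."""
    num_blocks: int = 20                # every block starts on the table
    num_towers: int = 3                 # each goal has 3 towers
    tower_height: int = 3               # each tower has 3 blocks
    fire_probability: float = 0.10      # probability of fire


def sample_mud_grid(cfg, rng):
    """Assumption: each tile is muddy independently with cfg.mud_probability."""
    n = cfg.grid_size
    return [[rng.random() < cfg.mud_probability for _ in range(n)] for _ in range(n)]


def sample_start_and_destination(cfg, rng):
    """Assumption: 'distance' means Manhattan distance; resample until it is large enough."""
    n = cfg.grid_size
    while True:
        start = (rng.randrange(n), rng.randrange(n))
        dest = (rng.randrange(n), rng.randrange(n))
        distance = abs(start[0] - dest[0]) + abs(start[1] - dest[1])
        if distance >= cfg.min_start_goal_distance:
            return start, dest


def sample_tower_goal(cfg, rng):
    """Pick distinct blocks at random and group them into towers of equal height."""
    h = cfg.tower_height
    blocks = rng.sample(range(cfg.num_blocks), cfg.num_towers * h)
    return [blocks[i * h:(i + 1) * h] for i in range(cfg.num_towers)]
```

Passing a seeded `random.Random` instance to each function makes a sampled scenario repeatable, which helps when trying to approximate the reported setup without access to the original scenario generator.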