Omniscient Debugging for Cognitive Agent Programs
Authors: Vincent J. Koeman, Koen V. Hindriks, Catholijn M. Jonker
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design a tracing mechanism for efficiently storing and exploring agent program runs. We are the first to demonstrate that this mechanism does not affect program runs by empirically establishing that the same tests succeed or fail. In this section, we empirically investigate and compare the performance and reproduction of a failure in an agent (system) with and without our state change and source location tracing mechanism enabled. |
| Researcher Affiliation | Academia | Vincent J. Koeman, Koen V. Hindriks and Catholijn M. Jonker Delft University of Technology, Mekelweg 4, 2628CD, Delft, The Netherlands {v.j.koeman, k.v.hindriks, c.m.jonker}@tudelft.nl |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We used two sets of agent systems programmed in GOAL that control robots in the BW4T environment. They were created by pairs of first-year Computer Science bachelor students and handed-in with accompanying tests. UT3 environment [Hindriks et al., 2011]; (BW4T; Johnson et al. [2009]) environment |
| Dataset Splits | No | The paper describes running a "test set" multiple times for evaluation but does not specify explicit training, validation, and test dataset splits as typically defined for machine learning models. |
| Hardware Specification | Yes | All evaluations were performed on a Linux server with a quad-core Intel i7 processor and 6GB of RAM. |
| Software Dependencies | No | The paper mentions 'GOAL' and 'UT3 environment' as software platforms used, but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We applied our method to both sets of agents, with parameter B set to 100, T to 60 seconds, N to 10, and M to 1000. These parameters were chosen after an iterative process of running experiments to minimize the runtime whilst making sure that neither unreproducible nor new failures would be found in the repetition part of our method (see also the step to increase parameter B in Fig. 1). |
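
The experiment setup quoted above amounts to a repeated comparison of test outcomes with and without the tracing mechanism. The minimal Python sketch below illustrates how such a comparison loop could be scripted; it is not the authors' harness. The script name `run_goal_tests.sh`, the `ExperimentConfig` fields, and the assumed meanings of B (number of repetitions) and T (per-run timeout) are assumptions for illustration; the roles of N and M are not described in the excerpt and are only carried as opaque values.

```python
import subprocess
from dataclasses import dataclass


# Hypothetical parameters mirroring the reported setup; the exact semantics
# of N and M are not specified in the excerpt above.
@dataclass
class ExperimentConfig:
    repetitions_b: int = 100   # parameter B: repetitions of the test set
    timeout_t: int = 60        # parameter T: per-run timeout in seconds
    param_n: int = 10          # parameter N (role not described in the excerpt)
    param_m: int = 1000        # parameter M (role not described in the excerpt)


def run_test_set(tracing_enabled: bool, cfg: ExperimentConfig) -> set[str]:
    """Run the agent test set once and return the identifiers of failed tests.

    `run_goal_tests.sh` is a placeholder for whatever harness launches the
    GOAL agents; it is not part of the paper's artifacts.
    """
    cmd = ["./run_goal_tests.sh", "--tracing", str(tracing_enabled).lower()]
    result = subprocess.run(cmd, capture_output=True, text=True,
                            timeout=cfg.timeout_t)
    # Assumption: the harness prints one failed-test identifier per line.
    return set(result.stdout.split())


def same_failures_with_tracing(cfg: ExperimentConfig) -> bool:
    """Check that tracing does not change which tests fail over B repetitions."""
    for _ in range(cfg.repetitions_b):
        baseline = run_test_set(tracing_enabled=False, cfg=cfg)
        traced = run_test_set(tracing_enabled=True, cfg=cfg)
        if baseline != traced:
            # A new or unreproducible failure was observed; the described
            # method would increase B and repeat (cf. Fig. 1 in the paper).
            return False
    return True


if __name__ == "__main__":
    print("Same failures with and without tracing:",
          same_failures_with_tracing(ExperimentConfig()))
```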