Learning Behaviors in Agents Systems with Interactive Dynamic Influence Diagrams

Authors: Ross Conroy, Yifeng Zeng, Marc Cavazza, Yingke Chen

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance of our approach on two test cases. [...] 5 Experiment Results: We first verify the algorithm's performance in the UAV benchmark (|S|=25, |A|=5 and |Ω|=5), the largest problem domain studied in I-POMDP/I-DID-based multiagent planning research, and then demonstrate the application in StarCraft. We compare the policy tree learning techniques with either random fill-ins (Rand) or the behavioral compatibility test (BCT) in Alg. 2.
Researcher Affiliation | Academia | Ross Conroy, Teesside University, Middlesbrough, UK, ross.conroy@tees.ac.uk; Yifeng Zeng, Teesside University, Middlesbrough, UK, y.zeng@tees.ac.uk; Marc Cavazza, Teesside University, Middlesbrough, UK, m.o.cavazza@tees.ac.uk; Yingke Chen, University of Georgia, Athens, GA, USA, ykchen@uga.edu
Pseudocode | Yes | Algorithm 1: Build Policy Trees [...] Algorithm 2: Branch Fill-in
Open Source Code | No | The paper does not provide any specific links or statements regarding the open-sourcing of the code for the methodology described.
Open Datasets | Yes | We will use simplified situations from StarCraft 1 as examples for learning behaviour. The choice of StarCraft is motivated by the availability of replay data [Synnaeve and Bessiere, 2012] [...] We first verify the algorithm's performance in the UAV benchmark [Zeng and Doshi, 2012]
Dataset Splits | No | The paper mentions collecting data for learning but does not provide specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined splits with citations).
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using the 'BWAPI library' but does not specify a version number for it or any other software dependencies.
Experiment Setup | No | The paper describes the general experimental setup by mentioning planning horizons and number of simulations, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, epochs) or detailed system-level training configurations.