Scalable Planning and Learning for Multiagent POMDPs

Authors: Christopher Amato, Frans Oliehoek

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here, we empirically investigate the effectiveness of our factorization methods by comparing them to non-factored methods in the planning and learning settings. Experimental results show that we are able to provide high quality solutions to large multiagent planning and learning problems.
Researcher Affiliation | Academia | Christopher Amato, CSAIL, MIT, Cambridge, MA 02139, camato@csail.mit.edu; Frans A. Oliehoek, Informatics Institute, University of Amsterdam and Dept. of CS, University of Liverpool, frans.oliehoek@liverpool.ac.uk
Pseudocode | No | The paper describes the algorithms conceptually and references modifications to existing functions but does not provide any formal pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing code or links to a code repository.
Open Datasets | No | The paper describes custom problem settings ('firefighting problems', 'sensor network problems') used in the experiments but does not provide access information (links, citations) for publicly available datasets.
Dataset Splits | No | The paper mentions 'Each experiment was run for a given number of simulations, the number of samples used at each step to choose an action, and averaged over a number of episodes.' but does not specify any training/validation/test dataset splits.
Hardware Specification | Yes | Experiments were run on a single core of a 2.5 GHz machine with 8GB of memory.
Software Dependencies | No | The paper mentions comparing 'factored representations to the flat version using POMCP' and using 'the same code base', but it does not specify any software names with version numbers.
Experiment Setup | Yes | Each experiment was run for a given number of simulations, the number of samples used at each step to choose an action, and averaged over a number of episodes. We report undiscounted return with the standard error. For the BA-MPOMDPs, H = 10, 50.
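
The reporting convention quoted under Experiment Setup (undiscounted return, averaged over episodes, with the standard error) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the reward values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the paper does not say which estimator was used.

```python
import math
from typing import Sequence, Tuple


def undiscounted_return(rewards: Sequence[float]) -> float:
    """Sum of per-step rewards over one episode, with no discount factor."""
    return float(sum(rewards))


def mean_and_standard_error(returns: Sequence[float]) -> Tuple[float, float]:
    """Mean episode return and the standard error of that mean.

    Assumption: sample standard deviation (n - 1 denominator) divided by sqrt(n).
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two episodes to estimate a standard error")
    mean = sum(returns) / n
    sample_variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean, math.sqrt(sample_variance / n)


# Hypothetical per-step rewards for three episodes (illustrative values only).
episodes = [
    [-1.0, -2.0, -3.0],
    [0.0, -1.0, -1.0],
    [-2.0, -2.0, -2.0],
]
episode_returns = [undiscounted_return(ep) for ep in episodes]
mean_return, standard_error = mean_and_standard_error(episode_returns)
print(f"{mean_return:.3f} +/- {standard_error:.3f}")
```

With more episodes the standard error shrinks roughly as 1/sqrt(n), which is why the number of episodes averaged over matters when comparing factored and flat methods.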