Successor Features Based Multi-Agent RL for Event-Based Decentralized MDPs

Authors: Tarun Gupta, Akshat Kumar, Praveen Paruchuri

Pages: 6054-6061

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "For validation purposes, we test our approach on a large multi-agent coverage problem which models schedule coordination of agents in a real urban subway network and achieves better quality solutions than previous best approaches." "We test our approach on a large multi-agent coverage problem, show its effectiveness against previous approaches, and the ability to do transfer learning." "We experiment on a multi-agent coverage domain." "Figure 2: Solution quality comparisons between GKP, Dec-ESR and No Event Critic (NEC) approach."
Researcher Affiliation | Academia | (1) School of Information Systems, Singapore Management University, Singapore; (2) Machine Learning Lab, Kohli Center on Intelligent Systems, IIIT Hyderabad, India
Pseudocode | No | "Algorithm 1 in the longer version of the paper highlights the learning algorithm in greater detail." This indicates that pseudocode exists, but it is explicitly stated to be in a "longer version of the paper" and is not present in this document.
Open Source Code | No | The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide a link to a repository.
Open Datasets | Yes | "We experiment on a multi-agent coverage domain (Yehoshua and Agmon 2016; Galceran and Carreras 2013) under uncertainty and partial observability." "For testing the scalability of our Dec-ESR approach, we experimented with the multi-agent coverage problem introduced by Gupta, Kumar, and Paruchuri (2018)." These citations define the problem domain and its parameters, which serve as the experimental environment.
Dataset Splits | No | The paper discusses training iterations but does not specify explicit train/validation/test dataset splits (percentages or sample counts) of the kind typically found in supervised learning.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not specify software dependencies (e.g., library names with version numbers) used for the experiments.
Experiment Setup | No | The paper describes the setup of the multi-agent coverage problem with varying complexity (e.g., number of MRT lines, stations, shared stations) and outlines the neural network architecture and loss functions, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or optimizer details for the training process.