Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs

Authors: Duc Thien Nguyen, William Yeoh, Hoong Chuin Lau, Shlomo Zilberstein, Chongjie Zhang

AAAI 2014

Each row below lists a reproducibility variable, its result, and the LLM response:

Research Type: Experimental. "We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs."

Researcher Affiliation: Academia. Duc Thien Nguyen (School of Information Systems, Singapore Management University); William Yeoh (Department of Computer Science, New Mexico State University); Hoong Chuin Lau (School of Information Systems, Singapore Management University); Shlomo Zilberstein (School of Computer Science, University of Massachusetts, Amherst); Chongjie Zhang (Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology).

Pseudocode: No. The paper describes the algorithms using numbered steps and equations (e.g., "Step 1", "Step 2" for RVI Q-learning) but does not present them within a formally labeled "Pseudocode" or "Algorithm" block; a hedged sketch of a generic RVI Q-learning update appears after this table.

Open Source Code: No. The paper does not provide any statement or link indicating that the source code for the described methodologies is publicly available.

Open Datasets: No. The paper evaluates algorithms on a 'sensor network domain' but does not provide concrete access information (link, DOI, specific repository, or citation with author/year) for the dataset used, nor does it identify it as a well-known public dataset.

Dataset Splits: No. The paper describes experimental variations but does not provide specific details regarding train, validation, or test dataset splits, percentages, or methodology for data partitioning.

Hardware Specification: Yes. "We run our experiments on a 64 core Linux machine with 2GB of memory and evaluate the algorithms on our motivating sensor network domain."

Software Dependencies: No. The paper does not specify any software dependencies with version numbers (e.g., programming language versions, library versions, or specific solver versions).

Experiment Setup: Yes. "We vary the number of agents |A| and the number of constraints |F| by varying the topology of the sensor network. We used a 4-connected grid topology, where each sensor has a constraint with each sensor in its four cardinal directions. We fixed the number of rows in the grid to 3 and varied the number of columns from 1 to 4. We also varied the number of values per agent |D_i| of each agent a_i from 4 to 8 and the number of local states per constraint |S_i| from 2 to 10." A sketch of this grid construction follows the table.
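
Since the paper gives its learning rules only as numbered update steps, the following is a minimal sketch of a generic tabular RVI Q-learning update for an average-reward MDP, offered purely as a point of reference. It is not the paper's decentralized multi-agent variant; the environment interface (reset, step, actions) and the fixed reference state-action pair ref are assumptions made for illustration.

```python
import random
from collections import defaultdict

def rvi_q_learning(env, steps=10000, alpha=0.1, epsilon=0.1, ref=(0, 0)):
    """Generic tabular RVI Q-learning sketch for an average-reward MDP.

    Assumes `env` exposes reset() -> state, step(action) -> (next_state, reward),
    and a list `env.actions`; `ref` is a fixed reference state-action pair whose
    Q-value serves as the running estimate of the average reward (gain).
    """
    Q = defaultdict(float)

    def greedy(s):
        return max(env.actions, key=lambda a: Q[(s, a)])

    s = env.reset()
    for _ in range(steps):
        # Epsilon-greedy exploration over the agent's actions.
        a = random.choice(env.actions) if random.random() < epsilon else greedy(s)
        s_next, r = env.step(a)
        # Relative value iteration update: subtract the reference Q-value,
        # which plays the role of the average-reward estimate.
        target = r - Q[ref] + max(Q[(s_next, b)] for b in env.actions)
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s_next
    return Q
```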
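
The grid topology in the experiment setup is straightforward to reconstruct. Below is a minimal sketch, assuming only that one agent sits at each grid cell and one binary constraint joins each pair of cardinally adjacent sensors; the function name and data representation are illustrative, not from the paper.

```python
def build_sensor_grid(rows=3, cols=4):
    """Build a 4-connected sensor grid: one agent per cell, one constraint
    between each pair of cardinally adjacent sensors."""
    agents = [(r, c) for r in range(rows) for c in range(cols)]
    constraints = []
    for r, c in agents:
        if r + 1 < rows:
            constraints.append(((r, c), (r + 1, c)))  # vertical neighbor
        if c + 1 < cols:
            constraints.append(((r, c), (r, c + 1)))  # horizontal neighbor
    return agents, constraints

# Fix the rows at 3 and vary the columns from 1 to 4, as in the paper's setup.
for cols in range(1, 5):
    agents, constraints = build_sensor_grid(rows=3, cols=cols)
    print(f"{cols} columns: |A| = {len(agents)}, |F| = {len(constraints)}")
```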