Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs
Authors: Duc Thien Nguyen, William Yeoh, Hoong Chuin Lau, Shlomo Zilberstein, Chongjie Zhang
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate them against an existing multi-arm bandit DCOP algorithm on dynamic DCOPs. |
| Researcher Affiliation | Academia | Duc Thien Nguyen, School of Information Systems, Singapore Management University; William Yeoh, Department of Computer Science, New Mexico State University; Hoong Chuin Lau, School of Information Systems, Singapore Management University; Shlomo Zilberstein, School of Computer Science, University of Massachusetts, Amherst; Chongjie Zhang, Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology |
| Pseudocode | No | The paper describes the algorithms using numbered steps and equations (e.g., 'Step 1', 'Step 2' for RVI Q-learning) but does not present them within a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodologies is publicly available. |
| Open Datasets | No | The paper evaluates algorithms on a 'sensor network domain' but does not provide concrete access information (link, DOI, specific repository, or citation with author/year) for the dataset used, nor does it identify it as a well-known public dataset. |
| Dataset Splits | No | The paper describes experimental variations but does not provide specific details regarding train, validation, or test dataset splits, percentages, or methodology for data partitioning. |
| Hardware Specification | Yes | We run our experiments on a 64 core Linux machine with 2GB of memory and evaluate the algorithms on our motivating sensor network domain. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers (e.g., programming language versions, library versions, or specific solver versions). |
| Experiment Setup | Yes | We vary the number of agents |A| and the number of constraints |F| by varying the topology of the sensor network. We used a 4-connected grid topology, where each sensor has a constraint with each sensor in its four cardinal directions. We fixed the number of rows in the grid to 3 and varied the number of columns from 1 to 4. We also varied the number of values per agent |Di| of each agent ai from 4 to 8 and the number of local states per constraint |Si| from 2 to 10. |
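
The Experiment Setup row above describes the benchmark concretely enough to sketch how the grid instances could be generated. The following is a minimal Python sketch under that reading; the function name `make_grid_dcop` and the returned dictionary fields are illustrative assumptions, not code from the paper.

```python
def make_grid_dcop(rows, cols, domain_size, local_states):
    """Build one sensor-network instance: agents are grid cells, and each
    agent shares a constraint with its neighbors in the four cardinal
    directions (each undirected edge is listed once)."""
    agents = [(r, c) for r in range(rows) for c in range(cols)]
    constraints = []
    for r, c in agents:
        if r + 1 < rows:                  # neighbor below
            constraints.append(((r, c), (r + 1, c)))
        if c + 1 < cols:                  # neighbor to the right
            constraints.append(((r, c), (r, c + 1)))
    domains = {a: list(range(domain_size)) for a in agents}      # |Di| values per agent
    states = {f: list(range(local_states)) for f in constraints} # |Si| local states per constraint
    return {"agents": agents, "constraints": constraints,
            "domains": domains, "local_states": states}

# Ranges reported in the paper: 3 rows, 1-4 columns, |Di| from 4 to 8,
# |Si| from 2 to 10. One sample instance at the largest grid size:
instance = make_grid_dcop(rows=3, cols=4, domain_size=4, local_states=2)
print(len(instance["agents"]), "agents,", len(instance["constraints"]), "constraints")
```

With 3 rows and 4 columns, the 4-connected topology gives 12 agents and 17 binary constraints, which is how |A| and |F| grow as columns are added in the reported sweep.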