Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Settling Decentralized Multi-Agent Coordinated Exploration by Novelty Sharing
Authors: Haobin Jiang, Ziluo Ding, Zongqing Lu
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that MACE achieves superior performance in three multi-agent environments with sparse rewards. |
| Researcher Affiliation | Academia | Haobin Jiang¹, Ziluo Ding¹,², Zongqing Lu¹; ¹School of Computer Science, Peking University; ²Beijing Academy of Artificial Intelligence; EMAIL |
| Pseudocode | No | The paper does not contain a clearly labeled pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide an explicit statement or a link regarding the availability of its source code. |
| Open Datasets | No | The paper states 'We design three tasks in Grid World', implying a custom environment not explicitly made publicly available. While Overcooked and SMAC are cited and are known public environments, the inclusion of a custom Grid World environment for which no access information is provided means not all data used is publicly available. |
| Dataset Splits | No | The paper mentions 'Each curve shows the mean reward of several runs with different random seeds (5 runs in Pass, 8 runs in Secret Room and Multi Room) and shaded regions indicate standard error,' but does not specify train/validation/test dataset splits. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU model, CPU model, memory). |
| Software Dependencies | No | The paper mentions 'we implement PPO leveraging GRU' and uses RND, but it does not provide specific version numbers for software libraries, frameworks, or languages used. |
| Experiment Setup | No | The paper mentions that 'λ is a hyperparameter' but does not provide its specific value, nor other detailed hyperparameters (e.g., learning rate, batch size) or system-level training settings. |