Graph Convolutional Reinforcement Learning
Authors: Jiechuan Jiang, Chen Dun, Tiejun Huang, Zongqing Lu
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we show that our method substantially outperforms existing methods in a variety of cooperative scenarios. For the experiments, we adopt a grid-world platform MAgent (Zheng et al., 2017). |
| Researcher Affiliation | Collaboration | 1Peking University, 2Rice University. This work was supported in part by NSF China under grant 61872009, Huawei Noah's Ark Lab, and Peng Cheng Lab. |
| Pseudocode | No | No pseudocode or algorithm blocks found. |
| Open Source Code | Yes | The code of DGN is available at https://github.com/PKU-AI-Edge/DGN/. |
| Open Datasets | Yes | For the experiments, we adopt a grid-world platform MAgent (Zheng et al., 2017). The paper describes three experimental scenarios: battle, jungle, and routing, which are built environments within the MAgent platform. |
| Dataset Splits | No | No explicit mention of a 'validation' split or dataset was found. The paper describes training and testing phases but does not explicitly specify a validation set for hyperparameter tuning or early stopping. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for experiments are provided in the paper. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided in the paper. It mentions 'TensorFlow' but without a version. |
| Experiment Setup | Yes | Table 4 (Hyperparameters) in Appendix A summarizes the detailed hyperparameters used by DGN and the baselines, including discount factor (γ), batch size, buffer capacity, β, ϵ and its decay, optimizer (Adam), learning rate, number of neighbors, number of convolutional layers, number of attention heads, τ, λ, κ, number of encoder MLP layers and units, Q-network type, MLP activation, and initializer. |