Learning to Design Games: Strategic Environments in Reinforcement Learning
Authors: Haifeng Zhang, Jun Wang, Zhiming Zhou, Weinan Zhang, Ying Wen, Yong Yu, Wenxin Li
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on a Maze game design task show the effectiveness of the proposed algorithms in generating diverse and challenging Mazes against various agent settings. (Abstract) Also Section 4, "Experiments with Maze Design" (section title). |
| Researcher Affiliation | Academia | 1 Peking University 2 University College London 3 Shanghai Jiao Tong University |
| Pseudocode | No | The paper describes algorithms in prose but does not include any explicit pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | Our experiment is repeatable and the code is at goo.gl/o9MrDN. |
| Open Datasets | No | In our experiment, we consider a use case of designing Maze game to test our solutions over the transition gradient method and the generative framework respectively. The paper generates its own Maze environments rather than training or evaluating on a pre-existing public dataset. |
| Dataset Splits | No | The paper does not specify training/validation/test splits; its experiments generate environments and train agents within them rather than drawing on a fixed, pre-split dataset. |
| Hardware Specification | No | Our experiment is conducted on PCs with common CPUs. |
| Software Dependencies | No | We implement our experiment environment using Keras-RL [Plappert, 2016] backed by Keras and Tensorflow. |
| Experiment Setup | No | The paper describes the general experimental environment (Maze game, agent types, objective) and states that deep neural networks are used for modeling, but it does not provide specific hyperparameter values (e.g., learning rate, batch size, network architecture details like number of layers/units, optimization settings) for the models used in the experiments. |