reproducibilityindex.ai

Generative Exploration and Exploitation

Authors: Jiechuan Jiang, Zongqing Lu | Pages: 4337-4344

AAAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, we demonstrate that GENE significantly outperforms existing methods in three tasks with only binary rewards, including Maze, Maze Ant, and Cooperative Navigation. Ablation studies verify the emergence of progressive exploration and automatic reversing.
Researcher Affiliation	Academia	Jiechuan Jiang Peking University jiechuan.jiang@pku.edu.cn Zongqing Lu Peking University zongqing.lu@pku.edu.cn
Pseudocode	Yes	Algorithm 1 details the training of GENE.
Open Source Code	No	The paper provides a link for task details and hyperparameters, but it does not provide an explicit statement or link to the source code for the described methodology.
Open Datasets	No	The paper mentions common RL environments (Maze, Maze Ant, Cooperative Navigation) which generate data through interaction, but it does not provide specific access information (link, DOI, repository, or formal citation with authors/year) for a publicly available or open dataset used in the experiments.
Dataset Splits	No	The paper does not provide specific details regarding train, validation, or test dataset splits needed for reproducibility. While it mentions training a VAE, it doesn't specify data splits for the main RL experiments.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies	No	The paper mentions base RL algorithms (PPO, TRPO, MADDPG) and VAE but does not provide specific version numbers for these or other software dependencies.
Experiment Setup	Yes	Every episode, the agent starts from the generated states S with a probability p, otherwise from the initial state. The probability p could be seen as how much to change the start state distribution. ... Every T episodes, we train the VAE from the scratch using the states stored in B0 and B1. ... All the experimental results are presented using mean and standard deviation of five runs.
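
The start-state rule quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `choose_start_state` and `retrain_episodes`, the string placeholders for states, and the choice to count episodes from 1 are all assumptions made for illustration. The VAE itself and the buffers B0 and B1 are left out; only the "start from a generated state with probability p" rule and the "every T episodes" schedule are shown.

```python
import random
from typing import List, Sequence, TypeVar

S = TypeVar("S")


def choose_start_state(
    initial_state: S,
    generated_states: Sequence[S],
    p: float,
    rng: random.Random,
) -> S:
    """Pick the start state for one episode (hypothetical helper).

    With probability p, return one of the generated states; otherwise
    return the environment's initial state. If no generated states are
    available yet, fall back to the initial state (an assumption).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    if generated_states and rng.random() < p:
        return rng.choice(generated_states)
    return initial_state


def retrain_episodes(num_episodes: int, T: int) -> List[int]:
    """List the episodes after which the VAE would be retrained.

    Assumes episodes are numbered from 1 and retraining happens on every
    T-th episode; the source only says "every T episodes".
    """
    if T <= 0:
        raise ValueError("T must be a positive integer")
    return [ep for ep in range(1, num_episodes + 1) if ep % T == 0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    generated = ["g1", "g2", "g3"]  # placeholder generated states
    starts = [choose_start_state("s0", generated, 0.5, rng) for _ in range(10)]
    print(starts)
    print(retrain_episodes(num_episodes=10, T=5))
```

Setting p to 0 recovers the original start-state distribution, while p equal to 1 always starts from generated states, which matches the report's description of p as how much the start-state distribution is changed.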