f-GAIL: Learning f-Divergence for Generative Adversarial Imitation Learning
Authors: Xin Zhang, Yanhua Li, Ziming Zhang, Zhi-Li Zhang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared with IL baselines with various predefined divergence measures, f-GAIL learns better policies with higher data efficiency in six physics-based control tasks. |
| Researcher Affiliation | Academia | Worcester Polytechnic Institute, USA; University of Minnesota, USA. {xzhang17,yli15,zzhang15}@wpi.edu, zhzhang@cs.umn.edu |
| Pseudocode | Yes | Algorithm 1 f-GAIL |
| Open Source Code | Yes | The code for reproducing the experiments is available at https://github.com/fGAIL3456/fGAIL. |
| Open Datasets | Yes | Six physics-based control tasks, including CartPole [8] from the classic RL literature, and five complex tasks simulated with MuJoCo [32]: HalfCheetah, Hopper, Reacher, Walker, and Humanoid. |
| Dataset Splits | Yes | A set of expert state-action pairs is split into 70% training data and 30% validation data. The policy is trained with supervised learning. (See the split sketch after this table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or cloud instances) are provided. |
| Software Dependencies | No | The paper mentions software components like OpenAI Gym, MuJoCo, Adam, and TRPO, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For fair comparisons, the policy network structures πθ of all the baselines and f-GAIL are the same in all experiments, with two hidden layers of 100 units each and tanh nonlinearities in between. The implementations of reward signal networks and discriminators vary according to baseline architectures, and we delegate these implementation details to Appendix B. All networks were always initialized randomly at the start of each trial. (See the architecture sketch after this table.) |
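
The Dataset Splits row above reports a 70%/30% split of expert state-action pairs into training and validation data. Below is a minimal sketch of such a split; the function name, array arguments, and random seed are illustrative assumptions, not taken from the released code.

```python
import numpy as np

def split_expert_data(states, actions, train_frac=0.7, seed=0):
    """Illustrative 70/30 split of expert state-action pairs into
    training and validation sets (names and seed are assumptions)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(states))
    n_train = int(train_frac * len(states))
    train_idx, val_idx = idx[:n_train], idx[n_train:]
    return (states[train_idx], actions[train_idx]), (states[val_idx], actions[val_idx])
```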
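The Experiment Setup row describes the shared policy network: two hidden layers of 100 units with tanh nonlinearities in between. The following PyTorch sketch shows that architecture under stated assumptions: `obs_dim` and `act_dim` are hypothetical placeholders, and the deterministic output head is a simplification; the paper's policies are trained with TRPO, so the actual output layer would parameterize an action distribution.

```python
import torch
import torch.nn as nn

class PolicyNetwork(nn.Module):
    """Sketch of the shared policy architecture:
    obs -> Linear(100) -> tanh -> Linear(100) -> tanh -> action.
    obs_dim and act_dim are task-dependent placeholders (e.g., Hopper, Walker)."""
    def __init__(self, obs_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 100), nn.Tanh(),
            nn.Linear(100, 100), nn.Tanh(),
            nn.Linear(100, act_dim),  # mean action; a stochastic policy would add a log-std head
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)
```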