reproducibilityindex.ai

A Coupled Flow Approach to Imitation Learning

Authors: Gideon Joseph Freund, Elad Sarafian, Sarit Kraus

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate CFIL on the standard Mujoco benchmarks (Todorov et al., 2012), first comparing it to state-of-the-art imitation methods, including ValueDICE (Kostrikov et al., 2019) and their optimized implementation of DAC (Kostrikov et al., 2018), along with a customary behavioral cloning (BC) baseline.
Researcher Affiliation	Academia	Department of Computer Science, Bar-Ilan University, Israel. Correspondence to: Gideon Freund <gideonfreund@gmail.com>.
Pseudocode	Yes	Our resulting algorithm, Coupled Flow Imitation Learning (CFIL), is summarized in Algorithm 1.
Open Source Code	Yes	Code for reproducibility of CFIL, including a detailed description for reproducing our environment, is available at https://github.com/gfreund123/cfil.
Open Datasets	Yes	We use ValueDICE's original expert demonstrations, with the exception of the Humanoid environment, for which we train our own expert, since they did not originally evaluate on it. We use ValueDICE's open-source implementation to comfortably run all three baselines. NDI (Kim et al., 2021b) would be the ideal candidate for comparison, given the similarities; however, no code was available.
Dataset Splits	No	The paper specifies training details and evaluation metrics (e.g., "evaluating over 10 episodes after each") but does not explicitly mention distinct training/validation/test splits with percentages or counts for a dataset, typical in supervised learning. For RL, evaluation episodes on the environment serve a similar purpose to testing, but a dedicated validation split is not specified.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud instance types) used to run the experiments.
Software Dependencies	No	The paper mentions software like "Spinning Up's (Achiam, 2018) SAC (Haarnoja et al., 2018)" and the "Adam optimizer (Kingma & Ba, 2014)" but does not provide specific version numbers for these libraries or frameworks. It also refers to an "open-source implementation (Bliznashki, 2019)" for MAF, but this is a citation, not a version number for the software dependency itself.
Experiment Setup	Yes	Our density update rate is 10 batches of 100, every 1000 timesteps. We use the Adam optimizer (Kingma & Ba, 2014) with a learning rate of 0.001. For squashing we use σ = 6 tanh(x/15), while the smoothing and regularization coefficients are 0.5 and 1 respectively. For all algorithms, we run 80 epochs, each consisting of 4000 timesteps, evaluating over 10 episodes after each. We do this across 5 random seeds and plot means and standard deviations.
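
The setup values quoted in the Experiment Setup row can be collected into a small configuration sketch to make the implied training budget explicit. This is only an illustration, not the authors' code: the dictionary keys and the helper names `squash` and `schedule_totals` are hypothetical, and the totals assume density updates run at every 1000-timestep mark for the whole run with no warm-up, which the quoted text does not state.

```python
import math

# Values quoted in the Experiment Setup row; the key names are hypothetical.
CONFIG = {
    "density_update_batches": 10,   # batches per density update
    "density_batch_size": 100,      # samples per batch
    "density_update_every": 1000,   # timesteps between density updates
    "learning_rate": 1e-3,          # Adam learning rate
    "smoothing_coef": 0.5,
    "regularization_coef": 1.0,
    "epochs": 80,
    "steps_per_epoch": 4000,
    "eval_episodes": 10,            # evaluation episodes after each epoch
    "num_seeds": 5,
}


def squash(x: float) -> float:
    """Quoted squashing function: sigma(x) = 6 * tanh(x / 15).

    Roughly linear near zero (slope 6/15 = 0.4) and saturating at +/-6.
    """
    return 6.0 * math.tanh(x / 15.0)


def schedule_totals(cfg: dict) -> dict:
    """Derive per-seed totals, assuming updates fire at every interval."""
    total_steps = cfg["epochs"] * cfg["steps_per_epoch"]
    update_rounds = total_steps // cfg["density_update_every"]
    density_batches = update_rounds * cfg["density_update_batches"]
    return {
        "total_steps": total_steps,
        "update_rounds": update_rounds,
        "density_batches": density_batches,
        "density_samples": density_batches * cfg["density_batch_size"],
        "eval_episodes": cfg["epochs"] * cfg["eval_episodes"],
    }


if __name__ == "__main__":
    for name, value in schedule_totals(CONFIG).items():
        print(f"{name}: {value}")
    print(f"squash(15.0) = {squash(15.0):.3f}")
```

Under these assumptions, each seed covers 320,000 environment timesteps, 320 density update rounds (3,200 batches of 100), and 800 evaluation episodes.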