Fast Imitation via Behavior Foundation Models
Authors: Matteo Pirotta, Andrea Tirinzoni, Ahmed Touati, Alessandro Lazaric, Yann Ollivier
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test FB-IL algorithms across environments from the DeepMind Control Suite (Tassa et al., 2018a) with multiple imitation tasks, using different IL principles and settings. We show that not only do FB-IL algorithms perform on par with or better than the corresponding state-of-the-art offline imitation learning baselines (Fig. 1), they also solve imitation tasks within a few seconds, which is three orders of magnitude faster than offline IL methods that need to run full RL routines to compute an imitation policy (Fig. 2). See the environment-loading sketch below the table. |
| Researcher Affiliation | Industry | Matteo Pirotta*, Andrea Tirinzoni* & Ahmed Touati*, Fundamental AI Research at Meta, {pirotta,tirinzoni,atouati}@meta.com; Alessandro Lazaric & Yann Ollivier, Fundamental AI Research at Meta, {lazaric,yol}@meta.com |
| Pseudocode | No | The paper describes mathematical formulations and algorithmic steps in prose, but it does not include any explicit pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not contain an explicit statement or a link providing access to the source code for the methodology described in the paper. |
| Open Datasets | Yes | We used standard unsupervised datasets for the four domains, generated by Random Network Distillation (RND). They can be downloaded following the instructions in the GitHub repository of Yarats et al. (2022) (https://github.com/denisyarats/exorl). ... All the environments considered in this paper are based on the DeepMind Control Suite (Tassa et al., 2018b). See the dataset-inspection sketch below the table. |
| Dataset Splits | No | The paper describes how expert trajectories are generated and used for imitation, and how models are pre-trained and evaluated, but it does not specify explicit train/validation/test dataset splits (e.g., percentages or counts for distinct sets). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software components and methods such as TD3, SAC, RND, and the DeepMind Control Suite, but it does not provide specific version numbers for any of the key software libraries or dependencies used (e.g., Python, PyTorch, TensorFlow). See the version-recording sketch below the table. |
| Experiment Setup | Yes | D.4 Hyperparameters: Table 1 lists the hyperparameters used for FB pretraining; Table 2 those for the IL baselines; Table 3 those for DIAYN and GOAL-TD3; Table 4 those for GOAL-GPT and MASKDP. See the config-skeleton sketch below the table. |
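
The Research Type row shows the experiments run on the DeepMind Control Suite. As a reference point for re-running them, here is a minimal sketch of loading one suite task through the public `dm_control` API; the `walker`/`walk` pair and the random placeholder policy are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: load a DeepMind Control Suite task with dm_control.
# The walker/walk pair is illustrative; the paper's exact task list is
# given in its experiment section.
import numpy as np
from dm_control import suite

env = suite.load(domain_name="walker", task_name="walk")
action_spec = env.action_spec()

time_step = env.reset()
while not time_step.last():
    # Random actions stand in for an FB-IL imitation policy.
    action = np.random.uniform(
        action_spec.minimum, action_spec.maximum, size=action_spec.shape
    )
    time_step = env.step(action)
```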
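For the Open Datasets row, the RND datasets come from the ExORL repository of Yarats et al. (2022). The sketch below only inspects episode files after they have been downloaded per that repository's README; the directory layout and the assumption that episodes are stored as `.npz` archives come from the ExORL release, not from this paper.

```python
# Sketch: inspect ExORL-style episode files after downloading them per
# https://github.com/denisyarats/exorl. The path and the .npz layout
# are assumptions based on the ExORL release, not this paper.
from pathlib import Path

import numpy as np

dataset_dir = Path("datasets/walker/rnd/buffer")  # hypothetical path

for episode_file in sorted(dataset_dir.glob("*.npz")):
    episode = np.load(episode_file)
    # Print each stored array's name and shape as a quick sanity check.
    print(episode_file.name, {key: episode[key].shape for key in episode.files})
    break  # the first episode is enough for inspection
```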
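Since the Software Dependencies row notes that no versions are reported, anyone reproducing the experiments has to pin their own. A small sketch for recording installed versions follows; the package list is a guess at what an FB-IL reimplementation would depend on.

```python
# Record the versions of likely key packages in the current environment,
# since the paper does not pin any. The package list is an assumption.
import importlib.metadata as metadata

for package in ("numpy", "torch", "dm_control"):
    try:
        print(f"{package}=={metadata.version(package)}")
    except metadata.PackageNotFoundError:
        print(f"{package}: not installed")
```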
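Finally, for the Experiment Setup row: the actual hyperparameter values live in Tables 1-4 of the paper's Appendix D.4. A transcription skeleton is sketched below; the field names are assumptions about typical FB pretraining knobs, and every value must be filled in from the paper's tables, not from this sketch.

```python
# Skeleton for transcribing FB pretraining hyperparameters. Field names
# are hypothetical; all values are deliberately unset and must be copied
# from Table 1 of the paper's Appendix D.4.
from dataclasses import dataclass
from typing import Optional


@dataclass
class FBPretrainConfig:
    latent_dim: Optional[int] = None       # dimension of the task embedding z
    learning_rate: Optional[float] = None  # optimizer step size
    batch_size: Optional[int] = None       # minibatch size per update
    discount: Optional[float] = None       # RL discount factor
    train_steps: Optional[int] = None      # total gradient updates


config = FBPretrainConfig()  # populate from Table 1 before use
```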