Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery

Authors: Amin Soleimani Abyaneh, Mahrokh Boroujeni, Hsiu-Chin Lin, Giancarlo Ferrari-Trecate

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Empirically, we demonstrate substantial OOS performance improvements for simulated robotic manipulation and navigation tasks. [...] We conduct empirical studies on the LASA dataset (Khansari-Zadeh & Billard, 2011), as well as the more complex Robomimic dataset (Mandlekar et al., 2021). [...] Table 1: Evaluating in-sample and out-of-sample rollouts error on the LASA and the Robomimic datasets.
Researcher Affiliation Academia Amin Abyaneh1, Mahrokh G. Boroujeni2, Hsiu-Chin Lin1, Giancarlo Ferrari-Trecate2; 1 McGill University, 2 École Polytechnique Fédérale de Lausanne (EPFL)
Pseudocode Yes The pseudocode for our method is presented in Alg. 1
Open Source Code Yes See sites.google.com/view/contractive-dynamical-policies for our codebase and highlight of the results.
Open Datasets Yes We conduct empirical studies on the LASA dataset (Khansari-Zadeh & Billard, 2011), as well as the more complex Robomimic dataset (Mandlekar et al., 2021). These datasets are detailed in App. G.
Dataset Splits Yes After training each policy, we generate two sets of rollouts: one with the initial state drawn from the dataset D (in-sample rollouts), and another with the initial state not in D (OOS rollouts). [...] There are between 200 and 300 demonstrations per task, but we typically restrict the learning to a dominant subset (between 80-90 percent) of these demonstrations to train SCDS and leave the rest to the evaluation stage.
Hardware Specification Yes For our experiments, we relied on a computational server running on the Linux operating system, specifically Ubuntu 24.04.1 LTS. The server was equipped with a high-performance NVIDIA RTX 4090 GPU with CUDA drivers version 12.6, which are more efficient when it comes to optimizing a loss on several rollouts generated in parallel. Moreover, an array of CPUs, Intel Core i9-9900K, featuring 8 cores and 16 threads each, and 64 GB of DDR4 RAM (2x32 GB), were used in conjunction with the single GPU pipeline.
Software Dependencies Yes For our experiments, we relied on a computational server running on the Linux operating system, specifically Ubuntu 24.04.1 LTS. The server was equipped with a high-performance NVIDIA RTX 4090 GPU with CUDA drivers version 12.6, which are more efficient when it comes to optimizing a loss on several rollouts generated in parallel. [...] Note that the entirety of SCDS's codebase, including the following modules, is efficiently implemented in PyTorch (Paszke et al., 2019).
Experiment Setup Yes The specific hyperparameters used for the LASA and Robomimic datasets are detailed in Tab. 7. [...] Table 7: Hyperparameters of SCDS and their optimal values or ranges, shown for every key model parameter:
Param | Description | Optimal value(s)
γ | Lower bound on the contraction rate of the policy | [1.0, 18.6], learnable
K | Number of coupling layers to increase the output nonlinearity | 4 to 10
Nz | Dimension of the latent state space | 32 to 64
H | Forward simulation horizon for the generated trajectory | 20 to 50
K | Number of invertible layers in the output map (also used for hidden blocks) | 4, 8
β | Complexity-accuracy trade-off parameter for soft-DTW (γ in some references) | 0.1
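The Dataset Splits row above describes holding out a portion of the 200-300 demonstrations per task: a dominant subset (80-90 percent) trains SCDS and the remainder is reserved for evaluation. A minimal sketch of that protocol is below; the function name `split_demonstrations` and the exact 85% fraction are illustrative assumptions, not taken from the paper's codebase.

```python
import random

def split_demonstrations(demos, train_fraction=0.85, seed=0):
    """Partition demonstrations into training and held-out evaluation sets.

    Illustrative only: mirrors the reported protocol of training on a
    dominant subset (80-90%) of demonstrations and evaluating on the rest.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    indices = list(range(len(demos)))
    rng.shuffle(indices)
    n_train = int(len(demos) * train_fraction)
    train = [demos[i] for i in indices[:n_train]]
    held_out = [demos[i] for i in indices[n_train:]]
    return train, held_out

# Example: 250 demonstrations, within the reported 200-300 range per task
demos = [f"demo_{i}" for i in range(250)]
train, held_out = split_demonstrations(demos, train_fraction=0.85)
```

With 250 demonstrations and an 85% fraction, this yields 212 training and 38 evaluation demonstrations; any concrete fraction in the 80-90% range would follow the same pattern.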