Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery
Authors: Amin Soleimani Abyaneh, Mahrokh Boroujeni, Hsiu-Chin Lin, Giancarlo Ferrari-Trecate
ICLR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we demonstrate substantial OOS performance improvements for simulated robotic manipulation and navigation tasks. [...] We conduct empirical studies on the LASA dataset (Khansari-Zadeh & Billard, 2011), as well as the more complex Robomimic dataset (Mandlekar et al., 2021). [...] Table 1: Evaluating in-sample and out-of-sample rollouts error on the LASA and the Robomimic datasets. |
| Researcher Affiliation | Academia | Amin Abyaneh¹, Mahrokh G. Boroujeni², Hsiu-Chin Lin¹, Giancarlo Ferrari-Trecate²; ¹McGill University, ²École Polytechnique Fédérale de Lausanne (EPFL) |
| Pseudocode | Yes | The pseudocode for our method is presented in Alg. 1 |
| Open Source Code | Yes | See sites.google.com/view/contractive-dynamical-policies for our codebase and highlight of the results. |
| Open Datasets | Yes | We conduct empirical studies on the LASA dataset (Khansari-Zadeh & Billard, 2011), as well as the more complex Robomimic dataset (Mandlekar et al., 2021). These datasets are detailed in App. G. |
| Dataset Splits | Yes | After training each policy, we generate two sets of rollouts: one with the initial state drawn from the dataset D (in-sample rollouts), and another with the initial state not in D (OOS rollouts). [...] There are between 200 and 300 demonstrations per task, but we typically restrict the learning to a dominant subset (between 80-90 percent) of these demonstrations to train SCDS and leave the rest to the evaluation stage. |
| Hardware Specification | Yes | For our experiments, we relied on a computational server running on the Linux operating system, specifically Ubuntu 24.04.1 LTS. The server was equipped with a high-performance NVIDIA RTX 4090 GPU with CUDA drivers version 12.6, which are more efficient when it comes to optimizing a loss on several rollouts generated in parallel. Moreover, an array of CPUs, Intel Core i9-9900K, featuring 8 cores and 16 threads each, and 64 GB of DDR4 RAM (2x32 GB), were used in conjunction with the single GPU pipeline. |
| Software Dependencies | Yes | For our experiments, we relied on a computational server running on the Linux operating system, specifically Ubuntu 24.04.1 LTS. The server was equipped with a high-performance NVIDIA RTX 4090 GPU with CUDA drivers version 12.6, which are more efficient when it comes to optimizing a loss on several rollouts generated in parallel. [...] Note that the entirety of SCDS's codebase, including the following modules, are efficiently implemented in PyTorch (Paszke et al., 2019). |
| Experiment Setup | Yes | The specific hyperparameters used for the LASA and Robomimic datasets are detailed in Tab. 7. [...] Table 7: Hyperparameters of SCDS and their optimal values or ranges are shown for every key model parameter. γ: lower bound on the contraction rate of the policy, [1.0, 18.6] (learnable); K: number of coupling layers to increase the output nonlinearity, 4 to 10; Nz: dimension of the latent state space, 32 to 64; H: forward simulation horizon for the generated trajectory, 20 to 50; K: number of invertible layers in the output map (also used for hidden blocks), 4 or 8; β: complexity-accuracy trade-off parameter for soft-DTW (γ in some references), 0.1 |
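The Dataset Splits row describes training on a dominant subset (80-90 percent) of the 200-300 demonstrations per task and holding out the rest for evaluation. A minimal sketch of such a split, assuming a simple random partition; the function name, fraction, and seed are illustrative and not taken from the paper:

```python
import random

def split_demonstrations(demos, train_frac=0.85, seed=0):
    """Randomly partition demonstrations into a training subset
    (a dominant fraction, here 85%) and a held-out evaluation subset."""
    rng = random.Random(seed)
    indices = list(range(len(demos)))
    rng.shuffle(indices)
    cut = int(len(demos) * train_frac)
    train = [demos[i] for i in indices[:cut]]
    held_out = [demos[i] for i in indices[cut:]]
    return train, held_out

# Example: a task with 250 demonstrations, as in the quoted 200-300 range.
demos = [f"demo_{i}" for i in range(250)]
train, held_out = split_demonstrations(demos, train_frac=0.85)
print(len(train), len(held_out))  # 212 38
```

The held-out demonstrations then supply initial states for the out-of-sample (OOS) rollouts described in the same row, while in-sample rollouts start from states drawn from the training set D.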