CCIL: Continuity-Based Data Augmentation for Corrective Imitation Learning

Authors: Liyiming Ke, Yunchu Zhang, Abhay Deshpande, Siddhartha Srinivasa, Abhishek Gupta

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | To validate the efficacy of our generated labels, we conduct experiments across diverse robotics domains in simulation, encompassing classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation. |
| Researcher Affiliation | Academia | Paul G. Allen School of Computer Science and Engineering, University of Washington {kayke,yunchuz,abhayd,siddh,abhgupta}@cs.washington.edu |
| Pseudocode | Yes | Algorithm 1 CCIL: Continuity-based data augmentation for Corrective labels for Imitation Learning |
| Open Source Code | No | We will also open-source the code and the configuration we use for each experiment, once the proposal is published. |
| Open Datasets | Yes | For all other environments, we use the expert data from the D4RL dataset Fu et al. (2020). |
| Dataset Splits | No | The paper mentions a "validation set" when selecting the best dynamics model during training, stating "We denote the empirical prediction error on the validation set for each trained dynamics model as .". However, it does not give split percentages or sample counts for the training, validation, and test sets of the main imitation learning tasks, which are needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper mentions several simulation environments, including F1tenth, gym-pybullet-drone, the MuJoCo locomotion suite, and the Meta World manipulation suite. While these imply that computational resources were used, the paper gives no details about the hardware (e.g., CPU models, GPU models, memory, or cloud instance types) used to run the simulations or train the models. |
| Software Dependencies | No | The paper mentions using "gym-pybullet-drone (Panerati et al., 2021)" as an open-source quadcopter simulator and refers to generic components such as "neural network" and "MLPs". However, it does not give version numbers for any software component, such as the simulator, Python, PyTorch/TensorFlow, or other libraries used in the implementation, which are necessary for reproducibility. |
| Experiment Setup | Yes | After generating corrective labels, we use two-layer MLPs (64,64) plus ReLU activation to train a Behavior Cloning agent with both original and augmented data. ... To enforce local Lipschitz continuity, we then adopt the sampling-based penalty and train a series of dynamics models by sweeping parameters, L = [2, 3, 5, 10], soft dynamics λ = [0.3, 0.5] and σ = [0.0001, 0.0003, 0.0005]. |
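
The parameter sweep quoted in the Experiment Setup row can be enumerated as in the sketch below. This is only an illustration, not the authors' code: it assumes the sweep is a full Cartesian product of the three lists, which the quoted text does not state. The names `sweep_configs`, `select_best`, and `validation_error` are hypothetical. Choosing the model with the lowest validation error follows the Dataset Splits row, which says the validation set is used to select the best dynamics model.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
LIPSCHITZ_L = [2, 3, 5, 10]
SOFT_LAMBDA = [0.3, 0.5]
SIGMA = [0.0001, 0.0003, 0.0005]


def sweep_configs():
    """Return one dict per (L, lambda, sigma) combination.

    Assumption: the sweep is a full grid over the three lists.
    """
    return [
        {"L": L, "lam": lam, "sigma": sigma}
        for L, lam, sigma in product(LIPSCHITZ_L, SOFT_LAMBDA, SIGMA)
    ]


def select_best(configs, validation_error):
    """Return the config with the lowest validation error.

    `validation_error` is a hypothetical callable: it would train (or load)
    a dynamics model for the given config and return its prediction error
    on a held-out validation set.
    """
    return min(configs, key=validation_error)


configs = sweep_configs()
print(len(configs))  # 4 * 2 * 3 combinations under the full-grid assumption
```

Under the full-grid assumption this gives 24 candidate dynamics models. If the authors swept the parameters one at a time, the count would be smaller.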