CCIL: Continuity-Based Data Augmentation for Corrective Imitation Learning
Authors: Liyiming Ke, Yunchu Zhang, Abhay Deshpande, Siddhartha Srinivasa, Abhishek Gupta
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the efficacy of our generated labels, we conduct experiments across diverse robotics domains in simulation, encompassing classic control problems, drone flying, navigation with high-dimensional sensor observations, legged locomotion, and tabletop manipulation. |
| Researcher Affiliation | Academia | Paul G. Allen School of Computer Science and Engineering, University of Washington {kayke,yunchuz,abhayd,siddh,abhgupta}@cs.washington.edu |
| Pseudocode | Yes | Algorithm 1 CCIL: Continuity-based data augmentation for Corrective labels for Imitation Learning |
| Open Source Code | No | We will also open-source the code and the configuration we use for each experiment, once the proposal is published. |
| Open Datasets | Yes | For all other environments, we use the expert data from the D4RL dataset (Fu et al., 2020). |
| Dataset Splits | No | The paper mentions using a "validation set" in the context of selecting the best dynamics model during its training phase, stating "We denote the empirical prediction error on the validation set for each trained dynamics model as .". However, it does not provide specific split percentages or sample counts for training, validation, and test datasets for the main imitation learning tasks, which are necessary to reproduce the data partitioning. |
| Hardware Specification | No | The paper mentions various simulation environments like F1tenth, gym-pybullet-drone, MuJoCo locomotion suite, and Meta World manipulation suites. While these imply computational resources were used, there are no specific details provided about the hardware (e.g., CPU models, GPU models, memory, or cloud instance types) used to run these simulations or train the models. |
| Software Dependencies | No | The paper mentions using "gym-pybullet-drone (Panerati et al., 2021)" as an open-source quadcopter simulator and refers to generic components such as "neural network" and "MLPs". However, it does not provide version numbers for any software component, such as the simulator, Python, PyTorch/TensorFlow, or other libraries used in the implementation, which are needed to reproduce the work. |
| Experiment Setup | Yes | After generating corrective labels, we use two-layer MLPs (64,64) plus ReLU activation to train a Behavior Cloning agent with both original and augmented data. ... To enforce local Lipschitz continuity, we then adopt the sampling-based penalty and train a series of dynamics models by sweeping parameters, L = [2, 3, 5, 10], soft dynamics λ = [0.3, 0.5] and σ = [0.0001, 0.0003, 0.0005]. |
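
The hyperparameter sweep quoted in the Experiment Setup row can be made concrete with a short sketch. It assumes the sweep is a full Cartesian product over the three lists, which the quoted text does not state. The names `LIPSCHITZ_L`, `SOFT_LAMBDA`, `SIGMA`, and `sweep_configs` are hypothetical and do not come from the paper's code.

```python
from itertools import product

# Values quoted in the Experiment Setup row.
LIPSCHITZ_L = [2, 3, 5, 10]          # Lipschitz bound L
SOFT_LAMBDA = [0.3, 0.5]             # soft dynamics weight (lambda)
SIGMA = [0.0001, 0.0003, 0.0005]     # sigma


def sweep_configs():
    """Return every (L, lambda, sigma) combination as a dict.

    Assumption: the sweep is a full Cartesian product; the source
    lists the values but does not say how they are combined.
    """
    return [
        {"L": L, "lam": lam, "sigma": sigma}
        for L, lam, sigma in product(LIPSCHITZ_L, SOFT_LAMBDA, SIGMA)
    ]


configs = sweep_configs()
print(len(configs))  # 4 * 2 * 3 = 24 under the full-grid assumption
```

Under that assumption the sweep yields 24 candidate dynamics models. According to the Dataset Splits row, the paper selects among trained dynamics models using the prediction error on a validation set.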