Memory-Consistent Neural Networks for Imitation Learning
Authors: Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, James Weimer, Insup Lee
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using MCNNs on 10 imitation learning tasks, with MLP, Transformer, and Diffusion backbones, spanning dexterous robotic manipulation and driving, proprioceptive inputs and visual inputs, and varying sizes and types of demonstration data, we find large and consistent gains in performance, validating that MCNNs are better-suited than vanilla deep neural networks for imitation learning applications. |
| Researcher Affiliation | Academia | 1 University of Pennsylvania, 2 Vanderbilt University {ksridhar, duttaso, dineshj, weimerj, lee}@seas.upenn.edu |
| Pseudocode | Yes | Algorithm 1 Learning Memories; Algorithm 2 Behavior Cloning with Memory-Consistent Neural Networks: Training |
| Open Source Code | Yes | To ensure reproducibility, we have released all code for MCNN variants and the baselines at our website https://sites.google.com/view/mcnn-imitation. |
| Open Datasets | Yes | Demonstration datasets are drawn from D4RL [9] for Adroit and CARLA and from the multimodal relay policy learning dataset [11] for Franka Kitchen. |
| Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning, in particular for a validation set. It mentions evaluation on '20 trajectories', which appears to be the test set, and hyperparameter tuning is briefly described as 'online' in Figure 7, but no specific split details are given. |
| Hardware Specification | Yes | We ran the experiments on either two Nvidia GeForce RTX 3090 GPUs (each with 24 GB of memory) or two Nvidia Quadro RTX 6000 GPUs (each with 24 GB of memory). The CPUs used were Intel Xeon Gold processors @ 3 GHz. |
| Software Dependencies | No | The paper mentions using an 'official implementation' and various libraries, but it does not specify the software versions (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiments. |
| Experiment Setup | Yes | In BC and MCNN+MLP, we use an MLP with two hidden layers (three total layers) of size [256, 256] for Adroit tasks and [1024, 1024] for CARLA. We use an Adam optimizer with a starting learning rate of 3e-4 and train for 1 million steps. We simply minimize the mean squared error for training the policies. We use a batch size of 256 throughout. (A minimal sketch of this setup follows the table.) |
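
Below is a minimal sketch, in PyTorch, of the behavior-cloning setup quoted in the Experiment Setup row: an MLP with two hidden layers of size [256, 256] (for Adroit; [1024, 1024] for CARLA), Adam with learning rate 3e-4, mean squared error loss, batch size 256, and 1 million training steps. The dimensions `obs_dim` and `act_dim` and the random stand-in dataset are placeholders, not values from the paper; the authors' released code at the project website should be consulted for the actual implementation, including the MCNN memory components.

```python
# Hedged sketch of the BC/MCNN+MLP training configuration described in the paper.
# Placeholder assumptions: obs_dim, act_dim, and the random demonstration tensors.
import torch
import torch.nn as nn

obs_dim, act_dim = 39, 28        # placeholder dimensions, not from the paper
hidden = [256, 256]              # [1024, 1024] for CARLA per the paper

policy = nn.Sequential(
    nn.Linear(obs_dim, hidden[0]), nn.ReLU(),
    nn.Linear(hidden[0], hidden[1]), nn.ReLU(),
    nn.Linear(hidden[1], act_dim),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
loss_fn = nn.MSELoss()

# Stand-in demonstration data; real runs draw from D4RL (Adroit, CARLA) or the
# relay policy learning dataset (Franka Kitchen).
obs = torch.randn(10_000, obs_dim)
acts = torch.randn(10_000, act_dim)
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(obs, acts), batch_size=256, shuffle=True
)

num_steps, step = 1_000_000, 0   # paper trains for 1 million gradient steps
while step < num_steps:
    for batch_obs, batch_acts in loader:
        loss = loss_fn(policy(batch_obs), batch_acts)   # plain MSE behavior cloning
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step >= num_steps:
            break
```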