Memory-Consistent Neural Networks for Imitation Learning

Authors: Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman, James Weimer, Insup Lee

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using MCNNs on 10 imitation learning tasks, with MLP, Transformer, and Diffusion backbones, spanning dexterous robotic manipulation and driving, proprioceptive inputs and visual inputs, and varying sizes and types of demonstration data, we find large and consistent gains in performance, validating that MCNNs are better-suited than vanilla deep neural networks for imitation learning applications.
Researcher Affiliation | Academia | 1 University of Pennsylvania, 2 Vanderbilt University; {ksridhar, duttaso, dineshj, weimerj, lee}@seas.upenn.edu
Pseudocode | Yes | Algorithm 1: Learning Memories; Algorithm 2: Behavior Cloning with Memory-Consistent Neural Networks: Training
Open Source Code | Yes | To ensure reproducibility, we have released all code for MCNN variants and the baselines at our website https://sites.google.com/view/mcnn-imitation.
Open Datasets | Yes | Demonstration datasets are drawn from D4RL [9] for Adroit and CARLA and from the multimodal relay policy learning dataset [11] for Franka Kitchen.
Dataset Splits | No | The paper does not provide the dataset split information (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) needed to reproduce the data partitioning. It mentions evaluation on '20 trajectories', which appears to be the test set, and hyperparameter tuning is briefly described as 'online' in Figure 7, but no validation split details are given.
Hardware Specification | Yes | We ran the experiments on either two Nvidia GeForce RTX 3090 GPUs (each with 24 GB of memory) or two Nvidia Quadro RTX 6000 GPUs (each with 24 GB of memory). The CPUs used were Intel Xeon Gold processors @ 3 GHz.
Software Dependencies | No | The paper mentions using an 'official implementation' and various libraries, but it does not specify the software versions (e.g., Python 3.8, PyTorch 1.9) needed to replicate the experiments.
Experiment Setup | Yes | In BC and MCNN+MLP, we use an MLP with two hidden layers (three total layers) of size [256, 256] for Adroit tasks and [1024, 1024] for CARLA. We use an Adam optimizer with a starting learning rate of 3e-4 and train for 1 million steps. We simply minimize the mean squared error for training the policies. We use a batch size of 256 throughout.
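The quoted experiment setup corresponds to a standard behavior-cloning training loop. Below is a minimal PyTorch sketch of that configuration (two hidden layers of 256 units, Adam at 3e-4, MSE loss, batch size 256, 1 million steps); the observation/action dimensions and the demo_obs/demo_actions tensors are illustrative placeholders, not values taken from the paper.

import torch
import torch.nn as nn

# Illustrative dimensions; the real values depend on the task (e.g., Adroit vs. CARLA).
OBS_DIM, ACT_DIM = 39, 28

# Two hidden layers of size 256 (three layers total), as described for the Adroit tasks.
policy = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, ACT_DIM),
)

optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)  # starting learning rate 3e-4
loss_fn = nn.MSELoss()  # plain mean-squared-error behavior cloning

# Placeholder demonstration data standing in for the D4RL / relay-kitchen datasets.
demo_obs = torch.randn(10_000, OBS_DIM)
demo_actions = torch.randn(10_000, ACT_DIM)

BATCH_SIZE, TRAIN_STEPS = 256, 1_000_000  # batch size 256, 1 million gradient steps
for step in range(TRAIN_STEPS):
    idx = torch.randint(0, demo_obs.shape[0], (BATCH_SIZE,))
    pred_actions = policy(demo_obs[idx])
    loss = loss_fn(pred_actions, demo_actions[idx])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Note that this sketches only the vanilla BC baseline; the MCNN+MLP variant additionally constrains the policy output to be consistent with learned memories (Algorithms 1 and 2 in the paper), which is not reproduced here.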