Learning from Demonstration with Weakly Supervised Disentanglement

Authors: Yordan Hristov, Subramanian Ramamoorthy

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our approach is evaluated in the context of two table-top robot manipulation tasks performed by a PR2 robot: that of dabbing liquids with a sponge (forcefully pressing a sponge and moving it along a surface) and pouring between different containers." See also sections "5 EXPERIMENTS" and "6 RESULTS & DISCUSSION".
Researcher Affiliation | Academia | "Yordan Hristov, School of Informatics, University of Edinburgh, yordan.hristov@ed.ac.uk; Subramanian Ramamoorthy, School of Informatics, University of Edinburgh, s.ramamoorthy@ed.ac.uk"
Pseudocode | Yes | "Algorithms 1 and 2 provide pseudo-code for the trajectory generation procedures described in Section 3."
Open Source Code | No | "We have made videos of the tasks and data available (see supplementary materials) at: https://sites.google.com/view/weak-label-lfd." The paper mentions data availability but does not explicitly state that the source code for the methodology is released.
Open Datasets | Yes | "We release a dataset of subjective concepts grounded in multi-modal demonstrations." and "We have made videos of the tasks and data available (see supplementary materials) at: https://sites.google.com/view/weak-label-lfd."
Dataset Splits | Yes | "The size of the total dataset after augmentation is 1000 demonstrations, which are split according to a 90-10 training-validation split."
Hardware Specification | No | The paper mentions using a "PR2 robot" and "Kinect2" for data capture, but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for model training or inference.
Software Dependencies | No | "The models are implemented in PyTorch (Paszke et al., 2017) and optimised using the Adam optimiser (Kingma & Ba, 2014)." The paper names PyTorch but does not specify its version number.
Experiment Setup | Yes | "Across all experiments, training is performed for a fixed number of 100 epochs using a batch size of 8. The dimensionality of the latent space |c| = 8 across all experiments. The Adam optimizer (Kingma & Ba, 2014) is used throughout the learning process with the following values for its parameters: learning rate = 0.001, β1 = 0.9, β2 = 0.999, eps = 1e-08, weight decay rate = 0, amsgrad = False. For all experiments, the values (unless set to 0) for the three coefficients from Equation 9 are: α = 1, β = 0.1, γ = 10." A minimal PyTorch sketch of this configuration is given after the table.
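
The quoted experiment setup and dataset split map directly onto standard PyTorch calls. The sketch below is an illustration, not the authors' released code: the random toy dataset, the stand-in linear model, and the three generic loss placeholders are assumptions made only so the example runs, while the 90-10 split of 1000 demonstrations, batch size 8, 100 epochs, |c| = 8, the Adam parameter values, and the α, β, γ weights are taken from the table above.

import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Hypothetical stand-in for the 1000 augmented demonstrations; the real data are
# multi-modal demonstrations, random tensors are used here only so the sketch runs.
demos = TensorDataset(torch.randn(1000, 16))
train_set, val_set = random_split(demos, [900, 100])   # 90-10 training-validation split
train_loader = DataLoader(train_set, batch_size=8, shuffle=True)

# Stand-in for the paper's model; only the latent dimensionality |c| = 8 is taken from the paper.
model = torch.nn.Linear(16, 8)

# Adam optimizer with the parameter values quoted in the table.
optimizer = torch.optim.Adam(
    model.parameters(), lr=0.001, betas=(0.9, 0.999),
    eps=1e-08, weight_decay=0, amsgrad=False,
)

# Coefficients for the three loss terms of Equation 9.
alpha, beta, gamma = 1.0, 0.1, 10.0

for epoch in range(100):                 # fixed number of 100 epochs
    for (batch,) in train_loader:
        out = model(batch)
        # term_1, term_2, term_3 are generic placeholders; the actual three terms
        # weighted by alpha, beta, gamma are defined by Equation 9 in the paper.
        term_1 = out.pow(2).mean()
        term_2 = out.abs().mean()
        term_3 = out.mean().abs()
        loss = alpha * term_1 + beta * term_2 + gamma * term_3
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Note that all of the quoted Adam parameter values coincide with PyTorch's defaults, so the keyword arguments above are spelled out only for explicitness; torch.optim.Adam(model.parameters()) would behave identically.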