Towards Robust Bisimulation Metric Learning

Authors: Mete Kemertas, Tristan Aumentado-Armstrong

NeurIPS 2021

Reproducibility Assessment (Variable: Result, followed by the LLM's justification)
Research Type: Experimental. Supporting excerpt: "In this section, we seek answers to the following questions concerning our main hypotheses: 1. Do the embedding collapse and explosion issues predicted theoretically occur in practice? 2. Do our contributions address these problems? 3. Does our proposed approach preserve the noise-invariance property of bisimulation? 4. How do our proposed improvements interact with each other? 5. How does our method perform compared to prior work, particularly with sparse rewards? To that end, we experiment on several altered classic control tasks from OpenAI Gym [7] by (i) sparsifying the reward signal, and (ii) augmenting the environment state with noisy dimensions, to simulate distractions. We also perform larger scale experiments on two challenging vision-based 3D robotics benchmarks from the DeepMind Control Suite [45]."
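The environment alterations described in the excerpt (sparsifying the reward signal and appending noisy state dimensions to simulate distractions) can be sketched in plain Python. This is an illustrative sketch only: the function names, the reward threshold, and the Gaussian noise scale are assumptions, not details taken from the paper.

```python
import random

def sparsify_reward(reward, threshold=0.5):
    # Hypothetical sparsification scheme: zero out any reward below a
    # threshold, so the agent only receives a signal on large rewards.
    # The threshold value is an assumption, not from the paper.
    return reward if reward >= threshold else 0.0

def add_noise_dims(state, n_noise=2, scale=1.0, rng=random):
    # Append n_noise pure-noise dimensions to the state vector,
    # simulating task-irrelevant distractor features.
    return list(state) + [rng.gauss(0.0, scale) for _ in range(n_noise)]

# Example: a 2-D state gains 3 distractor dimensions, and a small
# reward is suppressed while a large one passes through unchanged.
augmented = add_noise_dims([1.0, 2.0], n_noise=3)
```

A bisimulation-based representation should map states that differ only in the appended noise dimensions to nearby embeddings, since those dimensions never affect rewards or (task-relevant) dynamics.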
Researcher Affiliation: Academia. Mete Kemertas, Department of Computer Science, University of Toronto (kemertas@cs.toronto.edu); Tristan Aumentado-Armstrong, Department of Computer Science, University of Toronto (taumen@cs.toronto.edu).
Pseudocode: No. The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code: No. The paper neither states that source code for the method is released nor links to a code repository.
Open Datasets: Yes. Supporting excerpt: "To that end, we experiment on several altered classic control tasks from OpenAI Gym [7] by (i) sparsifying the reward signal, and (ii) augmenting the environment state with noisy dimensions, to simulate distractions. We also perform larger scale experiments on two challenging vision-based 3D robotics benchmarks from the DeepMind Control Suite [45]."
Dataset Splits: No. The paper does not specify training, validation, or test splits (e.g., percentages or sample counts); its figures refer only to "Training Steps", with no explicit split information.
Hardware Specification: No. The paper does not mention any hardware used for the experiments (e.g., GPU/CPU models or memory).
Software Dependencies: No. The paper mentions OpenAI Gym and the DeepMind Control Suite but gives no version numbers for these or any other software dependencies.
Experiment Setup: No. The paper states, "One shortcoming of our approach is the lack of a principled way to set hyper-parameters for IR and ID, which was done empirically," and mentions a learning rate in Section 3.1, but the main text provides no concrete hyperparameter values, training configurations, or other system-level settings needed for reproducibility.