Towards Robust Bisimulation Metric Learning
Authors: Mete Kemertas, Tristan Aumentado-Armstrong
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we seek answers to the following questions concerning our main hypotheses: 1. Do the embedding collapse and explosion issues predicted theoretically occur in practice? 2. Do our contributions address these problems? 3. Does our proposed approach preserve the noise-invariance property of bisimulation? 4. How do our proposed improvements interact with each other? 5. How does our method perform compared to prior work, particularly with sparse rewards? To that end, we experiment on several altered classic control tasks from OpenAI Gym [7] by (i) sparsifying the reward signal, and (ii) augmenting the environment state with noisy dimensions, to simulate distractions. We also perform larger scale experiments on two challenging vision-based 3D robotics benchmarks from the DeepMind Control Suite [45]. (A hedged code sketch of these two environment alterations follows the table.) |
| Researcher Affiliation | Academia | Mete Kemertas, Department of Computer Science, University of Toronto (kemertas@cs.toronto.edu); Tristan Aumentado-Armstrong, Department of Computer Science, University of Toronto (taumen@cs.toronto.edu) |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for their methodology. |
| Open Datasets | Yes | To that end, we experiment on several altered classic control tasks from OpenAI Gym [7] by (i) sparsifying the reward signal, and (ii) augmenting the environment state with noisy dimensions, to simulate distractions. We also perform larger scale experiments on two challenging vision-based 3D robotics benchmarks from the DeepMind Control Suite [45]. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits (e.g., percentages or sample counts). It refers to 'Training Steps' in figures but provides no explicit split information. |
| Hardware Specification | No | The paper does not mention any specific hardware specifications (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'OpenAI Gym' and the 'DeepMind Control Suite' but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper acknowledges that 'One shortcoming of our approach is the lack of a principled way to set hyper-parameters for IR and ID, which was done empirically,' and mentions a 'learning rate' in Section 3.1, but it does not provide concrete hyperparameter values, training configurations, or other system-level settings sufficient for reproduction in the main text. |
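
The two environment alterations quoted above, reward sparsification and noisy distractor dimensions, are straightforward to emulate with a standard Gym wrapper. The sketch below is a minimal illustration only, not the authors' code (none is released): the name `SparseNoisyWrapper`, the thresholded sparsification rule, the number of noise dimensions, and the noise scale are all hypothetical stand-ins, since the main text does not specify them, and the classic pre-0.26 `gym` API (four-tuple `step`, observation-only `reset`) is assumed.

```python
import gym
import numpy as np


class SparseNoisyWrapper(gym.Wrapper):
    """Hypothetical wrapper (not the paper's code) mimicking its two
    alterations to classic control tasks: (i) sparsified rewards and
    (ii) extra noisy observation dimensions acting as distractors."""

    def __init__(self, env, reward_threshold=0.9, n_noise_dims=8, noise_std=1.0):
        super().__init__(env)
        self.reward_threshold = reward_threshold  # assumed sparsification rule
        self.n_noise_dims = n_noise_dims          # assumed distractor count
        self.noise_std = noise_std                # assumed noise scale
        # Widen the observation space to account for the appended noise dims.
        low = np.concatenate([env.observation_space.low,
                              np.full(n_noise_dims, -np.inf)])
        high = np.concatenate([env.observation_space.high,
                               np.full(n_noise_dims, np.inf)])
        self.observation_space = gym.spaces.Box(low=low, high=high,
                                                dtype=np.float32)

    def _augment(self, obs):
        # Append i.i.d. Gaussian noise dimensions to simulate distractions.
        noise = self.noise_std * np.random.randn(self.n_noise_dims)
        return np.concatenate([obs, noise]).astype(np.float32)

    def reset(self, **kwargs):
        return self._augment(self.env.reset(**kwargs))

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        # Sparsify: emit a binary reward only when the dense reward clears
        # a threshold (a stand-in for the paper's unspecified rule).
        sparse_reward = float(reward > self.reward_threshold)
        return self._augment(obs), sparse_reward, done, info


# Usage: a sparse-reward, distractor-augmented classic control task.
env = SparseNoisyWrapper(gym.make("MountainCarContinuous-v0"))
```

Under these assumptions, the wrapper reproduces the shape of the quoted setup; matching the paper's actual experiments would additionally require the per-task sparsification rules and noise parameters, which the main text leaves unspecified.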