Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently
Authors: Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have conducted a range of experiments on some simple environments with the purpose of illustrating the numerical properties of our algorithms and some aspects of the distance metrics we studied. |
| Researcher Affiliation | Academia | Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas; Universitat Pompeu Fabra, Barcelona, Spain; {sergio.calo,anders.jonsson,gergely.neu,ludovic.schwartz,javier.segovia}@upf.edu |
| Pseudocode | Yes | We call the resulting method Sinkhorn Value Iteration (SVI), and provide its pseudocode as Algorithm 1. |
| Open Source Code | Yes | The code is available at https://github.com/SergioCalo/SVI |
| Open Datasets | No | For this experiment, we use the classic 4-rooms environment first studied by Sutton et al. [1999]. While the environment is classic and cited, the paper does not provide concrete access information (link, DOI, specific citation with authors/year for a dataset) for a publicly available or open dataset used in the experiments. |
| Dataset Splits | No | The paper conducts experiments on environments and randomly generated instances but does not specify training, validation, or test data splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments, such as specific GPU/CPU models or memory. |
| Software Dependencies | No | The comparison below is based on the original Python implementation of Brugère et al. [2024] and our own Python adaptation of the MATLAB code of O'Connor et al. [2022]. The paper names the programming languages and the third-party implementations it builds on, but does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | For this experiment, we use the classic 4-rooms environment first studied by Sutton et al. [1999], and run both SVI (Algorithm 1) and SPI (Algorithm 2) for a range of different choices of m, and a fixed γ = 0.95. The results of this study for K = 2 · 10^4 iterations are shown in Figure 3. |