Bisimulation Metrics are Optimal Transport Distances, and Can be Computed Efficiently

Authors: Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have conducted a range of experiments on some simple environments with the purpose of illustrating the numerical properties of our algorithms and some aspects of the distance metrics we studied.
Researcher Affiliation | Academia | Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas; Universitat Pompeu Fabra, Barcelona, Spain; {sergio.calo,anders.jonsson,gergely.neu,ludovic.schwartz,javier.segovia}@upf.edu
Pseudocode | Yes | We call the resulting method Sinkhorn Value Iteration (SVI), and provide its pseudocode as Algorithm 1.
Open Source Code | Yes | The code is available at https://github.com/SergioCalo/SVI
Open Datasets | No | For this experiment, we use the classic 4-rooms environment first studied by Sutton et al. [1999]. While the environment is classic and cited, the paper does not provide concrete access information (link, DOI, specific citation with authors/year for a dataset) for a publicly available or open dataset used in the experiments.
Dataset Splits | No | The paper conducts experiments on environments and randomly generated instances but does not specify training, validation, or test data splits.
Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments, such as specific GPU/CPU models or memory.
Software Dependencies | No | The comparison below is based on the original Python implementation of Brugère et al. [2024] and our own Python adaptation of the MATLAB code of O'Connor et al. [2022]. The paper names the programming languages and the third-party implementations it builds on, but does not provide specific version numbers for software dependencies.
Experiment Setup | Yes | For this experiment, we use the classic 4-rooms environment first studied by Sutton et al. [1999], and run both SVI (Algorithm 1) and SPI (Algorithm 2) for a range of different choices of m, and a fixed γ = 0.95. The results of this study for K = 2 · 10^4 iterations are shown in Figure 3.
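
For readers unfamiliar with the "Sinkhorn" in Sinkhorn Value Iteration, the following is a minimal sketch of the classic Sinkhorn iteration for entropy-regularized optimal transport between two discrete distributions. It is not Algorithm 1 of the paper and is not taken from the authors' repository. The function names, the `eta` regularization parameter, the iteration count, and the toy inputs are all assumptions made for illustration.

```python
import math


def sinkhorn(mu, nu, cost, eta=1.0, n_iters=500):
    """Classic Sinkhorn iteration (illustrative sketch, not the paper's SVI).

    mu, nu : lists of probabilities, each summing to 1.
    cost   : cost[i][j] is the cost of moving mass from point i to point j.
    eta    : entropic regularization strength (hypothetical default).
    Returns the transport plan as a list of lists.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel: kernel[i][j] = exp(-cost[i][j] / eta)
    kernel = [[math.exp(-cost[i][j] / eta) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Rescale rows so the plan's row sums match mu.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so the plan's column sums match nu.
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


def transport_cost(plan, cost):
    """Total cost of a transport plan: sum over i, j of plan[i][j] * cost[i][j]."""
    return sum(p * c for p_row, c_row in zip(plan, cost) for p, c in zip(p_row, c_row))


# Toy usage: two points, with unit cost for moving mass between them.
mu = [0.7, 0.3]
nu = [0.4, 0.6]
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(mu, nu, cost)
total = transport_cost(plan, cost)
```

The alternating row and column rescalings are the part that gives the method its name. Smaller values of `eta` bring the plan closer to the unregularized optimal transport solution, but typically require more iterations to converge.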