Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Distances for Markov chains from sample streams

Authors: Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas

NeurIPS 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We provide theoretical guarantees on the sample complexity of the algorithm and validate its effectiveness through a series of empirical evaluations. |
| Researcher Affiliation | Academia | Sergio Calo, Anders Jonsson, Gergely Neu, Ludovic Schwartz, Javier Segovia-Aguas. Universitat Pompeu Fabra, Barcelona, Spain. EMAIL |
| Pseudocode | Yes | Algorithm 1 SOMCOT. Input: $c, \eta, \beta, \gamma, K$. Initialize: $\mu_1 = U(XYXY)$, $\lambda_X(\cdot \mid x) = U(Y)$ for all $x$, $\lambda_Y(\cdot \mid y) = U(X)$ for all $y$, $\alpha = 0$, $V = 0$. For $k = 1, 2, \dots, K$: sample $(X_k, X'_k) \sim \nu_X$ and $(Y_k, Y'_k) \sim \nu_Y$, compute gradient estimators via Eqs. (10)-(15), update primal parameters via Eqs. (16)-(18), update dual parameters via Eqs. (19)-(21). Output: $\bar{\mu}_K = \frac{1}{K} \sum_{k=1}^{K} \mu_k$. Algorithm 2 Stochastic Optimization for Markov Chain Optimal Transport (SOMCOT) |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] |
| Open Datasets | Yes | To this end, we consider the classic control environment Pendulum-v1 from Gymnasium [Towers et al., 2024]. |
| Dataset Splits | No | The paper uses sample streams from Markov chains and the Pendulum-v1 environment. While it mentions 'sample sizes 1000, 10000 and 100000' for evaluating model performance, it does not describe traditional training/testing/validation splits for a fixed dataset. The experiments focus on estimating distances from sample streams, not on partitioning a pre-existing dataset into explicit splits for supervised learning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'DDPG algorithm [Lillicrap et al., 2015]' and the 'Gymnasium [Towers et al., 2024]' environment. However, it does not specify version numbers for any software libraries, programming languages, or specific implementations of these algorithms. |
| Experiment Setup | Yes | We performed a suite of numerical experiments to study the empirical behavior of our newly proposed algorithm, as well as to illustrate some potential applications that are enabled by our method. Due to space restrictions, we only show a small portion of the results here, and refer the reader to Appendix F for additional results and implementation details (most notably a detailed discussion on hyperparameter-tuning). Table 1: Table summarizing our hyperparameter choices for each experiment. Recall that the learning rates follow the decaying scheme $\eta_k = \frac{\eta_0}{1 + ak}$, and the minibatch size is denoted by $b$. |
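
The decaying learning-rate scheme quoted in the Experiment Setup entry, read here as η_k = η_0 / (1 + a·k), can be illustrated with a short Python sketch. This is not the authors' code: the function names and the example values of η_0 and a are assumptions chosen only for demonstration, and the actual hyperparameters are those listed in the paper's Table 1.

```python
from typing import List


def decayed_learning_rate(eta0: float, a: float, k: int) -> float:
    """Return eta_k = eta0 / (1 + a * k) for iteration index k >= 0.

    eta0 is the initial learning rate and a >= 0 is the decay coefficient.
    Both names mirror the symbols in the quoted formula; the function
    itself is a hypothetical helper.
    """
    if eta0 <= 0:
        raise ValueError("eta0 must be positive")
    if a < 0:
        raise ValueError("a must be non-negative")
    if k < 0:
        raise ValueError("k must be non-negative")
    return eta0 / (1.0 + a * k)


def schedule(eta0: float, a: float, num_steps: int) -> List[float]:
    """Learning rates for iterations k = 1, ..., num_steps."""
    return [decayed_learning_rate(eta0, a, k) for k in range(1, num_steps + 1)]


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    for k, eta in enumerate(schedule(eta0=0.1, a=0.01, num_steps=5), start=1):
        print(f"k={k}: eta_k={eta:.6f}")
```

With a = 0 the schedule is constant at η_0; larger values of a make the step size shrink faster, at a rate of roughly 1/k for large k.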