Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
BM$^2$: Coupled Schrödinger Bridge Matching
Authors: Stefano Peluchetti
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments are presented in Section 5, followed by a discussion of related works in Section 6. Section 7 concludes the paper. For clarity, a more general formulation of BM$^2$ is deferred to Appendix A, all proofs to Appendix B, an additional numerical experiment to Appendix C, and code listings to Appendix D. ... 5 Numerical Experiments ... To evaluate the performance of BM$^2$ on EOT problems, we utilize the benchmark developed by Gushchin et al. (2023). ... Results for evaluation metrics (20) and (21) are summarized in Table 1 and Table 2, respectively. |
| Researcher Affiliation | Industry | Stefano Peluchetti, Sakana AI |
| Pseudocode | Yes | Finally, the implementation is straightforward (i, iv), as illustrated in Algorithms 1 and 2 and in the annotated PyTorch code of Listing 1. ... Algorithm 1: BM$^2$ training loss computation ... Algorithm 2: BM$^2$ training loop ... Listing 1: Basic implementation of BM$^2$ loss computation (Algorithm 1) in PyTorch. |
| Open Source Code | Yes | Finally, the implementation is straightforward (i, iv), as illustrated in Algorithms 1 and 2 and in the annotated PyTorch code of Listing 1. ... D Python Code |
| Open Datasets | Yes | To evaluate the performance of BM$^2$ on EOT problems, we utilize the benchmark developed by Gushchin et al. (2023). For the reference process (R), this benchmark provides pairs of target distributions Ψ_0, Ψ_1 with analytical EOT solution S_{0,1} and analytical SB-optimal drift function µ_s. |
| Dataset Splits | No | The paper does not explicitly state training, validation, or test dataset splits. It mentions using a 'benchmark developed by Gushchin et al. (2023)' and states, 'We use 1,000 Monte Carlo samples to estimate (20, 21). Each method undergoes 50,000 SGD training steps with a batch size of 1,000'. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch code of Listing 1' and the 'AdamW optimizer' but does not specify version numbers for PyTorch, Python, or other software libraries or frameworks used in the implementation or experiments. |
| Experiment Setup | Yes | Each method undergoes 50,000 SGD training steps with a batch size of 1,000, settings similar to those used by Gushchin et al. (2023), enabling qualitative comparison of our results with theirs. We use the AdamW optimizer with a learning rate of 10^-4 and hyperparameters β = (0.9, 0.999), ϵ = 10^-8, wd = 0.01, where wd denotes weight decay. Time is sampled as t ∼ U(ϵ, 1 − ϵ) for ϵ = 0.0025. For BM$^2$, we employ a single feedforward neural network with 3 layers of width 768 and ReLU activation, resulting in approximately 1 million parameters. |
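The quoted model size (a width-768 ReLU feedforward network with "approximately 1 million parameters") can be sanity-checked with a short parameter-count sketch. The input/output dimensions below are assumptions, not taken from the excerpt: the Gushchin et al. (2023) benchmark spans several dimensionalities, and "3 layers" could mean three hidden layers or three linear layers, so the count is indicative only.

```python
def linear_params(d_in, d_out):
    # Fully connected layer: weight matrix (d_in x d_out) plus bias vector.
    return d_in * d_out + d_out

def mlp_params(d_in, width, d_out, n_hidden=3):
    # Feedforward network with n_hidden equal-width hidden layers.
    dims = [d_in] + [width] * n_hidden + [d_out]
    return sum(linear_params(a, b) for a, b in zip(dims, dims[1:]))

# Assumed setup: a 64-dimensional benchmark problem, with the time t
# appended to the state as one extra input coordinate (an assumption,
# not stated in the excerpt).
d = 64
print(mlp_params(d_in=d + 1, width=768, d_out=d))  # 1281088, i.e. ~1.28M
```

Reading "3 layers" as three linear layers instead (`n_hidden=2`) gives roughly 0.69M parameters for the same dimensions; both readings are in the ballpark of the paper's "approximately 1 million".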