Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Dimension reduction via score ratio matching

Authors: Ricardo Baptista, Michael Brennan, Youssef Marzouk

TMLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now present several numerical experiments to show the utility of our methods. Table 1 in the Supplementary Material summarizes all network and training hyperparameter choices for each numerical example. For the problems presented in Sections 6.1 and 6.2, the score ratio is tractable and thus we can construct the true diagnostic matrices $H^X_{\mathrm{CDR}}$ and $H^Y_{\mathrm{CMI}}$. We thus evaluate the accuracy of our method by comparing the posterior approximation errors for the optimal basis transformations in (4) and (6) with the following error bounds achieved using the learned bases $\tilde{U}$ and $\tilde{V}$: $\mathcal{E}^{\mathrm{CDR}}_r(\tilde{U}) := \frac{1}{2}\operatorname{Tr}\big((I - \tilde{U}_r \tilde{U}_r^\top) H^X_{\mathrm{CDR}}\big)$, $\mathcal{E}^{\mathrm{CMI}}_s(\tilde{V}) := \operatorname{Tr}\big((I - \tilde{V}_s \tilde{V}_s^\top) H^Y_{\mathrm{CMI}}\big)$. We emphasize that this analysis is meant to validate our method and is only possible when the true diagnostic matrices are computable. For the problems of Sections 6.3 and 6.4, the true diagnostic matrix is not computable. In these cases we validate our method by showing we achieve better inference fidelity for the reduced problems as compared to the non-reduced problems.
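The quoted error bounds are straightforward to evaluate once a diagnostic matrix and a learned basis are in hand. Below is a minimal illustrative sketch (not the authors' code; function names and inputs are hypothetical) of the two traces, assuming the basis columns are orthonormal:

```python
import numpy as np

def cdr_error(U_r, H_x):
    """E^CDR_r(U) = 0.5 * Tr((I - U_r U_r^T) H^X_CDR)."""
    # Projector onto the subspace discarded by the reduced basis U_r.
    P = np.eye(H_x.shape[0]) - U_r @ U_r.T
    return 0.5 * np.trace(P @ H_x)

def cmi_error(V_s, H_y):
    """E^CMI_s(V) = Tr((I - V_s V_s^T) H^Y_CMI)."""
    P = np.eye(H_y.shape[0]) - V_s @ V_s.T
    return np.trace(P @ H_y)
```

When the basis collects leading eigenvectors of the diagnostic matrix, each bound reduces to the (scaled) sum of the discarded eigenvalues.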
Researcher Affiliation | Academia | Ricardo Baptista (EMAIL), California Institute of Technology, Pasadena, CA 91125, USA; Michael C. Brennan (EMAIL), Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Youssef Marzouk (EMAIL), Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Pseudocode | Yes | Algorithm 1: Single-network score ratio dimension reduction
1: Input: target data $\{x^{(j)}, y^{(j)}\}_{j=1}^N \sim \pi_{X,Y}$ and user tolerances $\varepsilon_X, \varepsilon_Y > 0$
2: Solve $\min_{\theta, W_x, W_y} \mathcal{J}(\theta, W_x, W_y)$ to obtain the score-ratio approximation $w_\theta(x, y)$.
3: Estimate the diagnostic matrices $\hat{H}^X_{\mathrm{CDR}} = \frac{1}{N}\sum_{j=1}^N w_\theta(x^{(j)}, y^{(j)})\, w_\theta(x^{(j)}, y^{(j)})^\top$ and $\hat{H}^Y_{\mathrm{CMI}} = \frac{1}{N}\sum_{j=1}^N \nabla_y w_\theta(x^{(j)}, y^{(j)})^\top \nabla_y w_\theta(x^{(j)}, y^{(j)})$.
4: Compute the eigenpairs $(\lambda^X_i, \tilde{u}_i) \in \mathbb{R}_{\geq 0} \times \mathbb{R}^n$ of $\hat{H}^X_{\mathrm{CDR}}$ and $(\lambda^Y_i, \tilde{v}_i) \in \mathbb{R}_{\geq 0} \times \mathbb{R}^m$ of $\hat{H}^Y_{\mathrm{CMI}}$.
5: Pick $r$ and $s$ so that $\frac{1}{2}\sum_{k>r} \lambda^X_k < \varepsilon_X$ and $\sum_{k>s} \lambda^Y_k < \varepsilon_Y$, and set $\tilde{U}_r = [\tilde{u}_1 \cdots \tilde{u}_r]$, $\tilde{V}_s = [\tilde{v}_1 \cdots \tilde{v}_s]$.
6: Output: $\tilde{U}_r$, $\tilde{V}_s$
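Steps 3–5 of the quoted algorithm amount to Monte Carlo second moments, an eigendecomposition, and tolerance-based truncation. The sketch below illustrates those steps in numpy, assuming the score-ratio network has already been trained and its values and y-Jacobians have been evaluated on the samples (the function and array names are hypothetical, not from the authors' code):

```python
import numpy as np

def reduce_bases(w_vals, dyw_vals, eps_x, eps_y):
    """Illustrative sketch of Algorithm 1, steps 3-5.

    w_vals   : (N, n) array of evaluations w_theta(x^(j), y^(j))
    dyw_vals : (N, n, m) array of Jacobians grad_y w_theta(x^(j), y^(j))
    """
    N = w_vals.shape[0]
    # Step 3: Monte Carlo estimates of the diagnostic matrices.
    H_x = w_vals.T @ w_vals / N                              # (n, n)
    H_y = np.einsum('jni,jnk->ik', dyw_vals, dyw_vals) / N   # (m, m)

    # Step 4: eigenpairs, sorted by decreasing eigenvalue.
    lam_x, U = np.linalg.eigh(H_x)
    lam_y, V = np.linalg.eigh(H_y)
    lam_x, U = lam_x[::-1], U[:, ::-1]
    lam_y, V = lam_y[::-1], V[:, ::-1]

    # Step 5: smallest ranks whose eigenvalue tails meet the tolerances.
    def rank(lam, eps, scale):
        tails = np.cumsum(lam[::-1])[::-1]   # tails[i] = sum of lam[i:]
        ok = scale * tails < eps
        return int(np.argmax(ok)) if ok.any() else len(lam)

    r = rank(lam_x, eps_x, 0.5)
    s = rank(lam_y, eps_y, 1.0)
    return U[:, :r], V[:, :s]
```

On data whose score ratio varies only within low-dimensional subspaces, the selected ranks recover those intrinsic dimensions for tight tolerances.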
Open Source Code | No | The paper does not contain an explicit statement about the release of source code for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets | No | The paper describes generating data for its numerical examples (e.g., 'embedded banana distribution where the data-generating process is given by...', 'Our training dataset contains N = 20000 energy price and observation realizations from January 2020 through February 2023.') but does not provide concrete access information (link, DOI, repository, or formal citation) for any publicly available datasets used in the experiments.
Dataset Splits | No | The paper mentions total sample sizes used for training and refers to 'held-out samples' in its numerical examples (e.g., 'Both the score ratio and standard score networks were trained with N = 1000 samples.', 'fix the sample size to N = 90000', 'Our training dataset contains N = 20000 energy price and observation realizations from January 2020 through February 2023.'), but it does not provide specific training/validation/test splits (percentages, exact counts per split, or a detailed splitting methodology).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions software such as 'PyTorch' and 'nflows' (Durkan et al., 2020) and the Adam optimizer, but it does not provide specific version numbers for these or any other key software components, libraries, or solvers used in the experiments.
Experiment Setup | No | The paper states, 'Table 1 in the Supplementary Material summarizes all network and training hyperparameter choices for each numerical example.' This indicates that specific experimental setup details are provided in the supplementary material, not in the main text of the paper.