Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Demystifying Spectral Feature Learning for Instrumental Variable Regression

Authors: Dimitri Meunier, Antoine Moulin, Jakub Wornbard, Vladimir Kostic, Arthur Gretton

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	6 Experiments We evaluate the main theoretical insight of the paper: Sieve 2SLS with spectral features performs best when h0 aligns with the top eigenspaces of T and its singular values decay slowly. Performance degrades with faster decay and weaker alignment. We design a synthetic NPIV setting where we control the conditional expectation operator T and its spectral decay as well as the alignment of h0 with its eigenspaces. To simulate such a setting, we rely on the following procedure for generating samples from the NPIV model:
Researcher Affiliation	Academia	Dimitri Meunier Gatsby Unit, UCL Antoine Moulin Universitat Pompeu Fabra Jakub Wornbard Gatsby Unit, UCL Vladimir R. Kostic Istituto Italiano di Tecnologia & University of Novi Sad Arthur Gretton Gatsby Unit, UCL
Pseudocode	No	The paper describes methods and procedures in text and through mathematical equations, but it does not present any structured pseudocode or algorithm blocks with specific labels like "Algorithm" or "Pseudocode".
Open Source Code	Yes	5. Open access to data and code Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide our code in a zip file.
Open Datasets	Yes	We apply this method to the dSprites dataset, demonstrating its utility. ... [30] Loic Matthey, Irina Higgins, Demis Hassabis, and Alexander Lerchner. dsprites: Disentanglement testing sprites dataset. https://github.com/deepmind/dsprites-dataset/, 2017.
Dataset Splits	Yes	Data Splitting and Empirical Expectations. Given n, m ≥ 0, we consider two independent datasets: an unlabeled dataset, D_m = {(z_i, x_i)}_{i=1}^m, used to learn features for X and Z, and a labeled dataset, D_n = {(z_i, x_i, y_i)}_{i=1}^n, used to estimate the structural function. ... The spectral features were trained on 100,000 samples of (Z, X), while the 2SLS estimator built from the learned features used a separate dataset of 10,000 samples of (Z, X, Y).
Hardware Specification	No	8. Experiments compute resources Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [Yes] Justification: Our experiments are lightweight and can be reproduced on a standard laptop without specialized hardware.
Software Dependencies	No	The paper mentions using "neural networks" and "stochastic gradient descent," but does not specify any particular software libraries or frameworks (e.g., PyTorch, TensorFlow) along with their version numbers. While it references an activation function from [44], this does not constitute a software dependency with a version.
Experiment Setup	Yes	E.1 Models Employed The features were learned using two-hidden-layer neural networks. All models shared the same architecture, with layer widths [1, 50, 50, 50]: the input is one-dimensional, and the final layer outputs 50 learned features. To encourage the models to learn more oscillatory functions, the first layer used the activation x → x + sin²(x), as introduced by [44], followed by GELU-activated hidden layers and a final linear layer. We note that the first singular eigenfunctions of the conditional expectation operator T are always the constant functions (see Section 2). Therefore, we hard-code the constant feature 1_Z ⊗ 1_X into the model and restrict the following learned features to be mean-zero. In addition, we included a regularization term to penalize both feature collinearity and large feature norms.
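
The architecture quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the paper names no framework, so the example uses plain Python with only the standard library. The function names, the Gaussian initialization, the choice of one GELU layer between the first layer and the final linear layer, the empirical centering used to make features mean-zero, and the decision to prepend the constant feature as an extra column are all assumptions made for illustration. The regularization term is omitted because its form is not given in the excerpt.

```python
import math
import random


def sin2_activation(x):
    """First-layer activation x -> x + sin^2(x), as quoted in the setup."""
    return x + math.sin(x) ** 2


def gelu(x):
    """Exact GELU, x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def init_layer(n_in, n_out, rng):
    """Hypothetical init: Gaussian weights scaled by 1/sqrt(n_in), zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def init_network(widths=(1, 50, 50, 50), seed=0):
    """One affine map per consecutive pair of widths: 1->50, 50->50, 50->50."""
    rng = random.Random(seed)
    return [init_layer(a, b, rng) for a, b in zip(widths[:-1], widths[1:])]


def affine(layer, v):
    """Compute W v + b for a single input vector v."""
    weights, biases = layer
    return [sum(w * x for w, x in zip(row, v)) + b for row, b in zip(weights, biases)]


def raw_features(params, x):
    """Map a scalar input to 50 uncentered features."""
    h = [sin2_activation(a) for a in affine(params[0], [x])]
    h = [gelu(a) for a in affine(params[1], h)]
    return affine(params[2], h)  # final linear layer, no activation


def features_with_constant(params, xs):
    """Center features over the sample (assumed mean-zero scheme) and
    prepend the hard-coded constant feature 1 as the first column."""
    raw = [raw_features(params, x) for x in xs]
    n = len(raw)
    means = [sum(col) / n for col in zip(*raw)]
    return [[1.0] + [f - m for f, m in zip(row, means)] for row in raw]


if __name__ == "__main__":
    params = init_network()
    sample = [0.1 * i for i in range(20)]
    rows = features_with_constant(params, sample)
    print(len(rows), len(rows[0]))
```

Under these assumptions each input yields 51 columns: the constant feature followed by 50 centered features. Whether the paper counts the constant among its 50 features is not stated in the excerpt.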