Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Spectral Estimation with Free Decompression

Authors: Siavash Ameli, Chris van der Heide, Liam Hodgkinson, Michael W. Mahoney

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate the effectiveness of this approach through a series of examples, comparing its performance against known limiting distributions from random matrix theory in synthetic settings, as well as applying it to submatrices of real-world datasets, matching them with their full empirical eigenspectra. We now present numerical examples that demonstrate the utility of our free decompression method. To begin, we consider a number of synthetic examples for random matrices whose spectral densities and their corresponding Stieltjes transforms have known analytic expressions. We then consider two evaluations on real-world large-scale datasets: Facebook Page-Page network (Leskovec & Krevl, 2014; Rozemberczki et al., 2021); and the empirical Neural Tangent Kernel (NTK) corresponding to ResNet50 (He et al., 2016) trained on CIFAR10 (Krizhevsky, 2009).
Researcher Affiliation	Academia	Siavash Ameli ICSI and Department of Statistics University of California, Berkeley EMAIL Chris van der Heide Dept. of Electrical and Electronic Eng. University of Melbourne EMAIL Liam Hodgkinson School of Mathematics and Statistics University of Melbourne EMAIL Michael W. Mahoney ICSI, LBNL, and Department of Statistics University of California, Berkeley EMAIL
Pseudocode	Yes	Algorithm 1: Pseudocode for Free Decompression
Open Source Code	Yes	The source code of the Python package freealg is available at https://github.com/ameli/freealg, with documentation at https://ameli.github.io/freealg.
Open Datasets	Yes	SNAP Facebook Dataset: We use the publicly available Facebook Page-Page network from SNAP (Leskovec & Krevl, 2014; Rozemberczki et al., 2021)...
Dataset Splits	Yes	We will work with submatrices indexed by randomly sampled columns of the impalpable matrix of interest... Subsamples are drawn from this reduced matrix. Specifically, we took Â to be the full 55,560 × 55,560 matrix (corresponding to 5556 images), and reduced this to a 50,000 × 50,000 matrix A after removing the null component. Submatrices of dimension 1024, 2048, 4096, 8192, 16,382, and 32,768 were then sampled... Values are the mean over 20 randomly sampled initial submatrices with standard deviations in parentheses.
Hardware Specification	Yes	All experiments were conducted on a consumer-grade device with an AMD Ryzen 7 5800X processor, NVIDIA RTX 3080, and 64GB RAM.
Software Dependencies	No	We developed a Python package, freealg, which implements our algorithms and enables reproduction of all numerical results in this paper.
Experiment Setup	Yes	Hyperparameters used for the numerical experiments appearing in this document can be found in Table H.1 (notation summarized in the caption). Marchenko-Pastur Figure 1 Beta 3 10 3 ( 1 2) 50 0 Jackson (1, 1) 10 4
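
The subsampling protocol quoted under Dataset Splits (randomly sampled principal submatrices, with results reported as a mean and standard deviation over 20 draws) can be illustrated with a minimal NumPy sketch. This is not the freealg API and not the authors' code: the function names, the toy Wishart-type matrix, its sizes, and the choice of the largest eigenvalue as the summarized statistic are all assumptions made for illustration only.

```python
import numpy as np

def sample_principal_submatrix(A, n, rng):
    """Return the n x n principal submatrix of A on n randomly chosen indices.

    Hypothetical helper: the same random index set selects rows and columns.
    """
    idx = rng.choice(A.shape[0], size=n, replace=False)
    return A[np.ix_(idx, idx)]

def summarize_over_draws(A, n, num_draws=20, seed=0):
    """Mean and sample standard deviation of the largest eigenvalue over draws.

    The largest eigenvalue is only a stand-in statistic for this sketch.
    """
    rng = np.random.default_rng(seed)
    tops = []
    for _ in range(num_draws):
        sub = sample_principal_submatrix(A, n, rng)
        # eigvalsh expects a symmetric input and returns eigenvalues in ascending order
        tops.append(np.linalg.eigvalsh(sub)[-1])
    tops = np.array(tops)
    return tops.mean(), tops.std(ddof=1)

# Toy stand-in for the large matrix of interest (sizes are illustrative assumptions)
rng = np.random.default_rng(42)
X = rng.standard_normal((512, 1024))
A = X @ X.T / 1024  # symmetric positive semidefinite, Wishart-type

mean_top, std_top = summarize_over_draws(A, n=128)
print(f"largest eigenvalue over 20 draws: {mean_top:.3f} ({std_top:.3f})")
```

Using one index set for both rows and columns keeps each subsample symmetric, so a symmetric eigensolver applies; the fixed seed makes the 20 draws repeatable.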