Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Error Estimation for Sketched SVD via the Bootstrap

Authors: Miles Lopes, N. Benjamin Erichson, Michael Mahoney

ICML 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we present a collection of synthetic and natural examples that demonstrate the practical performance of Algorithm 1.
Researcher Affiliation	Academia	Miles E. Lopes 1 N. Benjamin Erichson 2 Michael W. Mahoney 2 1Department of Statistics, UC Davis 2ICSI and Department of Statistics, UC Berkeley. Correspondence to: Miles E. Lopes <EMAIL>.
Pseudocode	Yes	Algorithm 1 (Bootstrap estimation of sketching error).
Open Source Code	No	The paper does not provide an explicit statement or link for open-source code for the described methodology.
Open Datasets	Yes	"Further, we would like to acknowledge the NOAA for providing the SST data (https://www.esrl.noaa.gov/psd/)." and "(Reynolds et al., 2007)."
Dataset Splits	No	The paper does not provide specific details on train/validation/test dataset splits. For synthetic examples, data is generated directly, and for application examples, the entire dataset is used for evaluation of the sketched SVD without explicit splits mentioned.
Hardware Specification	No	The paper mentions that Algorithm 1 was distributed across '30 machines' and processed matrices 'on the order of 100GB' but does not specify exact hardware components like CPU/GPU models, memory, or specific machine configurations.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers.
Experiment Setup	Yes	For each choice of the sketch size t in a grid ranging from 500 up to 6000, we generated 500 independent sketching matrices $S \in \mathbb{R}^{t \times n}$, which yielded 500 realizations of $\tilde{A} \in \mathbb{R}^{t \times d}$. Here, we used squared-length sampling (Frieze et al., 2004) to construct the sketch $\tilde{A}$ in each trial... Algorithm 1 was applied... using a choice of B = 30 in every instance. ... α will always be set to 0.05.
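
The experiment setup quoted above combines squared-length row sampling with a bootstrap over B = 30 resamples at α = 0.05. The following is a minimal illustrative sketch of that general pattern, not the paper's Algorithm 1 and not the authors' code (none is linked). All function names are hypothetical, and the error measure used here (Frobenius norm of the difference between Gram matrices) is an assumption chosen so the example runs with the Python standard library alone; the paper's own error functional for the sketched SVD may differ.

```python
import math
import random


def squared_length_sketch(A, t, rng):
    """Sample t rows of A with probability proportional to squared row norm.

    Each sampled row is rescaled by 1 / sqrt(t * p_i) so that the Gram
    matrix of the sketch is an unbiased estimate of the Gram matrix of A.
    """
    sq_norms = [sum(x * x for x in row) for row in A]
    total = sum(sq_norms)
    probs = [s / total for s in sq_norms]
    idx = rng.choices(range(len(A)), weights=probs, k=t)
    return [[x / math.sqrt(t * probs[i]) for x in A[i]] for i in idx]


def gram(M):
    """Return M^T M as a nested list (d x d)."""
    d = len(M[0])
    return [[sum(row[j] * row[k] for row in M) for k in range(d)]
            for j in range(d)]


def frob_diff(G, H):
    """Frobenius norm of G - H."""
    return math.sqrt(sum((g - h) ** 2
                         for g_row, h_row in zip(G, H)
                         for g, h in zip(g_row, h_row)))


def bootstrap_error_quantile(sketch, B=30, alpha=0.05, rng=None):
    """Empirical (1 - alpha) quantile of a bootstrap error.

    ASSUMPTION: the error is the Frobenius distance between the Gram
    matrix of a row-resampled sketch and that of the original sketch.
    This is a stand-in, not the error measure defined in the paper.
    """
    rng = rng or random.Random(0)
    t = len(sketch)
    G = gram(sketch)
    errors = []
    for _ in range(B):
        # Resample the t sketch rows with replacement.
        resampled = rng.choices(sketch, k=t)
        errors.append(frob_diff(gram(resampled), G))
    errors.sort()
    k = min(B - 1, math.ceil((1 - alpha) * B) - 1)
    return errors[k]
```

A call such as `bootstrap_error_quantile(squared_length_sketch(A, t, rng), B=30, alpha=0.05, rng=rng)` returns a single nonnegative number estimating how large the sketching error is likely to be. Because every rescaled row has the same squared norm under squared-length sampling, the sketch preserves the squared Frobenius norm of `A` exactly, which gives a quick sanity check on the sampler.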