Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Error Estimation for Sketched SVD via the Bootstrap
Authors: Miles Lopes, N. Benjamin Erichson, Michael Mahoney
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present a collection of synthetic and natural examples that demonstrate the practical performance of Algorithm 1. |
| Researcher Affiliation | Academia | Miles E. Lopes (1), N. Benjamin Erichson (2), Michael W. Mahoney (2). (1) Department of Statistics, UC Davis; (2) ICSI and Department of Statistics, UC Berkeley. Correspondence to: Miles E. Lopes <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 (Bootstrap estimation of sketching error). |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | "Further, we would like to acknowledge the NOAA for providing the SST data (https://www.esrl.noaa.gov/psd/)." and "(Reynolds et al., 2007)." |
| Dataset Splits | No | The paper does not provide specific details on train/validation/test dataset splits. For synthetic examples, data is generated directly, and for application examples, the entire dataset is used for evaluation of the sketched SVD without explicit splits mentioned. |
| Hardware Specification | No | The paper mentions that Algorithm 1 was distributed across '30 machines' and processed matrices 'on the order of 100GB' but does not specify exact hardware components like CPU/GPU models, memory, or specific machine configurations. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | For each choice of the sketch size $t$ in a grid ranging from 500 up to 6000, we generated 500 independent sketching matrices $S \in \mathbb{R}^{t \times n}$, which yielded 500 realizations of $\tilde{A} \in \mathbb{R}^{t \times d}$. Here, we used squared-length sampling (Frieze et al., 2004) to construct the sketch $\tilde{A}$ in each trial... Algorithm 1 was applied... using a choice of $B = 30$ in every instance. ... $\alpha$ will always be set to 0.05. |
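
Since no code is released for the paper, the following is a minimal Python sketch of the kind of setup the Experiment Setup row describes: squared-length row sampling to build a sketch, followed by a row-resampling bootstrap with B = 30 and α = 0.05. It is an illustration under stated assumptions, not the authors' implementation. The function names, the synthetic test matrix, the parameter `k`, and the choice of error measure (largest absolute deviation among the top `k` singular values) are all assumptions introduced here; the exact error functional used by Algorithm 1 should be taken from the paper.

```python
import numpy as np


def squared_length_sketch(A, t, rng):
    """Sample t rows of A i.i.d. with probability proportional to squared row norm.

    Rows are rescaled by 1 / sqrt(t * p_i) so that the sketch's Gram matrix is an
    unbiased estimate of A.T @ A.
    """
    row_norms_sq = np.einsum("ij,ij->i", A, A)
    p = row_norms_sq / row_norms_sq.sum()
    idx = rng.choice(A.shape[0], size=t, replace=True, p=p)
    return A[idx] / np.sqrt(t * p[idx])[:, None]


def bootstrap_error_quantile(A_sketch, k, B=30, alpha=0.05, rng=None):
    """Estimate the (1 - alpha) quantile of the sketching error by the bootstrap.

    ASSUMPTION: error is measured as the largest absolute deviation among the
    top-k singular values; this may differ from the paper's Algorithm 1.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = A_sketch.shape[0]
    sigma = np.linalg.svd(A_sketch, compute_uv=False)[:k]
    errors = np.empty(B)
    for b in range(B):
        # Resample the t rows of the sketch uniformly with replacement.
        rows = rng.integers(0, t, size=t)
        sigma_star = np.linalg.svd(A_sketch[rows], compute_uv=False)[:k]
        errors[b] = np.max(np.abs(sigma_star - sigma))
    return float(np.quantile(errors, 1 - alpha))


# Hypothetical synthetic input: a tall matrix with decaying column scales.
rng = np.random.default_rng(0)
A = rng.standard_normal((2000, 20)) * np.linspace(1.0, 0.1, 20)

A_sketch = squared_length_sketch(A, t=500, rng=rng)
q_hat = bootstrap_error_quantile(A_sketch, k=5, B=30, alpha=0.05, rng=rng)
print(f"estimated 95% error quantile for top-5 singular values: {q_hat:.4f}")
```

The bootstrap loop only touches the small sketch, never the full matrix, which is what makes this style of error estimate cheap relative to recomputing the exact SVD. The sizes here are far smaller than the grid of sketch sizes (500 to 6000) quoted in the table and are chosen only so the example runs quickly.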