Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Leveraged volume sampling for linear regression

Authors: Michał Dereziński, Manfred K. Warmuth, Daniel J. Hsu

NeurIPS 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Figure 1: Plots of the total loss for the sampling methods (averaged over 100 runs) versus sample size (shading is standard error) for the libsvm dataset cpusmall [9]. Experiments. Figure 1 presents experimental evidence on a benchmark dataset (cpusmall from the libsvm collection [9]) that the potential bad behavior of volume sampling proven in our lower bound does occur in practice. Appendix E shows more datasets and a detailed discussion of the experiments.
Researcher Affiliation	Academia	Michał Dereziński and Manfred K. Warmuth, Department of Computer Science, University of California, Santa Cruz, EMAIL, EMAIL; Daniel Hsu, Computer Science Department, Columbia University, New York, EMAIL
Pseudocode	Yes	Reverse iterative sampling. VolumeSample(X, k): S ← [n]; while |S| > k: for all i ∈ S, q_i ← det(X_{S\i}^T X_{S\i}) / det(X_S^T X_S); sample i ∝ q_i out of S; S ← S \ {i}; end; return S. Determinantal rejection sampling. 1: Input: X ∈ R^{n×d}, q = (l_1/d, ..., l_n/d), k ≥ d. 2: s ← max{k, 4d^2}. 3: repeat. 4: Sample π_1, ..., π_s i.i.d. ∼ (q_1, ..., q_n). 5: Sample Accept ∼ Bernoulli(det((1/s) X^T Q_π X) / det(X^T X)). 6: until Accept = true. 7: S ← VolumeSample((Q_{[1..n]}^{1/2} X)_π, k). 8: return π_S.
Open Source Code	No	The paper references LIBSVM as a tool used for experiments, providing its availability link in the bibliography [9], but it does not state that the authors' own implementation code for their methodology is open-source or provide a link for it.
Open Datasets	Yes	Figure 1 presents experimental evidence on a benchmark dataset (cpusmall from the libsvm collection [9]). [9] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Dataset Splits	No	The paper mentions using the 'cpusmall' dataset but does not provide specific details on how it was split into training, validation, or testing sets, such as percentages or sample counts.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments.
Software Dependencies	No	The paper mentions 'libsvm' as a collection and software used but does not specify the version numbers of any software dependencies used in their experimental setup, such as specific programming languages, libraries, or frameworks.
Experiment Setup	No	The paper presents experimental results in Figure 1, but it does not provide concrete details about the experimental setup, such as hyperparameters (e.g., learning rates, batch sizes), optimization settings, or other training configurations.
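
The two procedures quoted in the Pseudocode row can be sketched in Python with NumPy as below. This is one illustrative reading of the pseudocode, not the authors' implementation (the Open Source Code row notes that none is linked). The function names, the `rng` argument, and the final renormalization of q are assumptions made for the example. Instead of computing two determinants per candidate row, the sketch uses the matrix determinant lemma, det(X_{S\i}^T X_{S\i}) / det(X_S^T X_S) = 1 - x_i^T (X_S^T X_S)^{-1} x_i, and it assumes X has full column rank.

```python
import numpy as np


def volume_sample(X, k, rng=None):
    """Reverse iterative sampling: start from all rows and drop one at a time."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    if not d <= k <= n:
        raise ValueError("need d <= k <= n")
    S = list(range(n))
    while len(S) > k:
        XS = X[S]
        # q_i = 1 - x_i^T (X_S^T X_S)^{-1} x_i  (matrix determinant lemma)
        G_inv = np.linalg.inv(XS.T @ XS)
        q = 1.0 - np.einsum("ij,jk,ik->i", XS, G_inv, XS)
        q = np.clip(q, 0.0, None)  # guard against tiny negative round-off
        i = int(rng.choice(len(S), p=q / q.sum()))
        del S[i]
    return S


def leveraged_volume_sample(X, k, rng=None):
    """Determinantal rejection sampling followed by volume sampling."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    G = X.T @ X
    # Leverage scores l_i = x_i^T (X^T X)^{-1} x_i; they sum to d.
    lev = np.einsum("ij,jk,ik->i", X, np.linalg.inv(G), X)
    q = lev / lev.sum()  # equals l_i / d up to round-off (assumption: renormalize)
    s = max(k, 4 * d * d)
    _, logdet_G = np.linalg.slogdet(G)
    while True:
        pi = rng.choice(n, size=s, p=q)
        # Rows of (Q^{1/2} X)_pi: each sampled row x_i scaled by 1 / sqrt(q_i).
        Xr = X[pi] / np.sqrt(q[pi])[:, None]
        sign, logdet = np.linalg.slogdet(Xr.T @ Xr / s)
        accept_prob = np.exp(logdet - logdet_G) if sign > 0 else 0.0
        if rng.random() < accept_prob:
            break
    S = volume_sample(Xr, k, rng)
    return pi[S]  # original row indices (pi_S); repeats are possible
```

Calling `leveraged_volume_sample(X, k)` returns k row indices of X. Because the rows are drawn with replacement in the rejection step, the same index can appear more than once.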