Leveraged volume sampling for linear regression

Authors: Michal Derezinski, Manfred K. Warmuth, Daniel J. Hsu

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Figure 1: Plots of the total loss for the sampling methods (averaged over 100 runs) versus sample size (shading is standard error) for the libsvm dataset cpusmall [9]. Experiments. Figure 1 presents experimental evidence on a benchmark dataset (cpusmall from the libsvm collection [9]) that the potential bad behavior of volume sampling proven in our lower bound does occur in practice. Appendix E shows more datasets and a detailed discussion of the experiments.
Researcher Affiliation | Academia | Michał Dereziński and Manfred K. Warmuth, Department of Computer Science, University of California, Santa Cruz (mderezin@berkeley.edu, manfred@ucsc.edu); Daniel Hsu, Computer Science Department, Columbia University, New York (djhsu@cs.columbia.edu)
Pseudocode | Yes | Reverse iterative sampling:
    VolumeSample(X, k):
      S ← [n]
      while |S| > k
        ∀ i ∈ S: q_i ← det(X_{S\i}^T X_{S\i}) / det(X_S^T X_S)
        Sample i ∝ q_i out of S
        S ← S \ {i}
      end
      return S
  Determinantal rejection sampling:
    Input: X ∈ R^{n×d}, q = (l_1/d, ..., l_n/d), k ≥ d
    s ← max{k, 4d^2}
    repeat
      Sample π_1, ..., π_s i.i.d. ∼ (q_1, ..., q_n)
      Sample Accept ∼ Bernoulli( det((1/s) X^T Q_π X) / det(X^T X) )
    until Accept = true
    S ← VolumeSample((Q_{[1..n]}^{1/2} X)_π, k)
    return π_S
Open Source Code | No | The paper references LIBSVM as a tool used for experiments, providing its availability link in the bibliography [9], but it does not state that the authors' own implementation code for their methodology is open-source or provide a link for it.
Open Datasets | Yes | Figure 1 presents experimental evidence on a benchmark dataset (cpusmall from the libsvm collection [9]). [9] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2:27:1–27:27, 2011. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.
Dataset Splits | No | The paper mentions using the 'cpusmall' dataset but does not provide specific details on how it was split into training, validation, or testing sets, such as percentages or sample counts.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to conduct the experiments.
Software Dependencies | No | The paper mentions 'libsvm' as a collection and software used but does not specify the version numbers of any software dependencies used in their experimental setup, such as specific programming languages, libraries, or frameworks.
Experiment Setup | No | The paper presents experimental results in Figure 1, but it does not provide concrete details about the experimental setup, such as hyperparameters (e.g., learning rates, batch sizes), optimization settings, or other training configurations.
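
The two procedures quoted in the Pseudocode row can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation (none is released, per the Open Source Code row). The function names `volume_sample` and `leveraged_volume_sample` and the `rng` argument are hypothetical. The removal probabilities are computed with the matrix determinant lemma instead of explicit determinant ratios, and q is normalized by the sum of the leverage scores (which equals d up to rounding) so that it sums to 1 numerically.

```python
import numpy as np


def volume_sample(X, k, rng):
    """Reverse iterative sampling: start from all rows, drop one at a time."""
    n, d = X.shape
    if not d <= k <= n:
        raise ValueError("need d <= k <= n")
    S = list(range(n))
    while len(S) > k:
        XS = X[S]
        Z = np.linalg.inv(XS.T @ XS)
        # Matrix determinant lemma (X_S assumed full rank):
        # det(X_{S\i}^T X_{S\i}) / det(X_S^T X_S) = 1 - x_i^T Z x_i
        q = 1.0 - np.einsum("ij,jk,ik->i", XS, Z, XS)
        q = np.clip(q, 0.0, None)  # guard against tiny negative round-off
        q /= q.sum()
        drop = int(rng.choice(len(S), p=q))
        S.pop(drop)
    return S


def leveraged_volume_sample(X, k, rng):
    """Determinantal rejection sampling followed by volume sampling."""
    n, d = X.shape
    G = X.T @ X
    # Leverage scores l_i = x_i^T (X^T X)^{-1} x_i; they sum to d.
    lev = np.clip(np.einsum("ij,jk,ik->i", X, np.linalg.inv(G), X), 0.0, None)
    q = lev / lev.sum()  # assumption: normalize by the sum rather than by d
    s = max(k, 4 * d * d)
    _, logdet_G = np.linalg.slogdet(G)
    while True:
        pi = rng.choice(n, size=s, p=q)  # i.i.d. draws from q
        Xw = X[pi] / np.sqrt(q[pi])[:, None]  # rows of (Q^{1/2} X)_pi
        sign, logdet = np.linalg.slogdet(Xw.T @ Xw / s)
        accept_prob = np.exp(logdet - logdet_G) if sign > 0 else 0.0
        if rng.random() < accept_prob:
            break
    S = volume_sample(Xw, k, rng)  # positions within the sequence pi
    return pi[S]  # row indices of X; repeats are possible
```

A call such as `leveraged_volume_sample(X, k, np.random.default_rng(0))` returns k row indices of `X`. Because the candidates are drawn i.i.d., the same row can appear more than once, whereas `volume_sample` alone returns k distinct indices.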