Asymptotically Free Sketched Ridge Ensembles: Risks, Cross-Validation, and Tuning

Authors: Pratik Patil, Daniel LeJeune

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our theoretical results using both synthetic and real large-scale datasets with practical sketches including Count Sketch and subsampled randomized discrete cosine transforms. |
| Researcher Affiliation | Academia | Pratik Patil, Department of Statistics, University of California, Berkeley, California, CA 94720, USA, pratikpatil@berkeley.edu; Daniel LeJeune, Department of Statistics, Stanford University, California, CA 94305, USA, daniel@dlej.net |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | For RCV1 (Lewis et al., 2004), we downloaded the data from scikit-learn (Buitinck et al., 2013). ... For RNA-Seq (Weinstein et al., 2013), we downloaded the data from the UCI Machine Learning repository (Dua & Graff, 2017) at: https://archive.ics.uci.edu/ml/datasets/gene+expression+cancer+RNASeq. |
| Dataset Splits | Yes | RCV1... We then randomly subsampled 20000 training points and 5000 test points... RNA-Seq... leaving 446 observations, which were split into a training set of 356 and test set of 90. |
| Hardware Specification | Yes | All experiments were run in less than 1 hour on a Macbook Air (M1, 2020) and coded in Python using standard scientific computing packages. |
| Software Dependencies | No | The paper mentions 'Python', 'standard scientific computing packages', and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are crucial for reproducibility. |
| Experiment Setup | Yes | For the left plot, we fix q = 441, which is an allowed sketch size for Count Sketch. For the right plot, we fix λ = 0.2 and sweep through the choices of q which are allowed by Count Sketch, which are q ∈ {63, 126, 189, 252, 315, 378, 441, 504, 567}. ... Our sketching ensembles have K = 5. |
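
Since the paper provides no pseudocode or released code, the following is only a minimal sketch of what a feature-sketched ridge ensemble with the quoted settings (q = 441, λ = 0.2, K = 5) might look like. It is not the authors' implementation. The function names, the hash-based Count Sketch construction and its scaling, the 1/n normalization of the ridge objective, and the synthetic data dimensions are all assumptions made for illustration.

```python
import numpy as np


def count_sketch_matrix(p, q, rng):
    """Return a dense p x q Count Sketch matrix (assumed construction):
    each feature is sent to one random bucket with a random sign."""
    S = np.zeros((p, q))
    buckets = rng.integers(0, q, size=p)      # bucket index for each feature
    signs = rng.choice([-1.0, 1.0], size=p)   # independent random sign per feature
    S[np.arange(p), buckets] = signs
    return S


def sketched_ridge(X, y, S, lam):
    """Fit ridge regression on the sketched features X @ S and map the
    coefficients back to the original p-dimensional space."""
    n = X.shape[0]
    XS = X @ S                                 # n x q sketched design matrix
    q = XS.shape[1]
    gram = XS.T @ XS / n + lam * np.eye(q)     # assumed 1/n normalization
    coef_q = np.linalg.solve(gram, XS.T @ y / n)
    return S @ coef_q


def sketched_ridge_ensemble(X, y, q, lam, K, seed=0):
    """Average K sketched ridge estimators built from independent sketches."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    betas = [
        sketched_ridge(X, y, count_sketch_matrix(p, q, rng), lam)
        for _ in range(K)
    ]
    return np.mean(betas, axis=0)


# Synthetic data; n and p are illustrative values, not taken from the paper.
rng = np.random.default_rng(1)
n, p = 200, 600
X = rng.standard_normal((n, p))
y = X @ (rng.standard_normal(p) / np.sqrt(p)) + 0.1 * rng.standard_normal(n)

# Settings quoted in the Experiment Setup row: q = 441, lambda = 0.2, K = 5.
beta_hat = sketched_ridge_ensemble(X, y, q=441, lam=0.2, K=5)
train_mse = np.mean((y - X @ beta_hat) ** 2)
print(f"training MSE of the K=5 ensemble: {train_mse:.4f}")
```

Sweeping `q` over the listed sizes with `lam` held at 0.2 would mirror the structure of the right-plot experiment, though the paper's exact data, sketch normalization, and risk estimates are not reproduced here.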