Asymptotically Free Sketched Ridge Ensembles: Risks, Cross-Validation, and Tuning
Authors: Pratik Patil, Daniel LeJeune
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our theoretical results using both synthetic and real large-scale datasets with practical sketches including Count Sketch and subsampled randomized discrete cosine transforms. |
| Researcher Affiliation | Academia | Pratik Patil, Department of Statistics, University of California, Berkeley, California, CA 94720, USA, pratikpatil@berkeley.edu; Daniel LeJeune, Department of Statistics, Stanford University, California, CA 94305, USA, daniel@dlej.net |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | For RCV1 (Lewis et al., 2004), we downloaded the data from scikit-learn (Buitinck et al., 2013). ... For RNA-Seq (Weinstein et al., 2013), we downloaded the data from the UCI Machine Learning repository (Dua & Graff, 2017) at: https://archive.ics.uci.edu/ml/datasets/gene+expression+cancer+RNASeq. |
| Dataset Splits | Yes | RCV1... We then randomly subsampled 20000 training points and 5000 test points... RNA-Seq... leaving 446 observations, which were split into a training set of 356 and test set of 90. |
| Hardware Specification | Yes | All experiments were run in less than 1 hour on a Macbook Air (M1, 2020) and coded in Python using standard scientific computing packages. |
| Software Dependencies | No | The paper mentions 'Python' and 'standard scientific computing packages' and 'scikit-learn' but does not provide specific version numbers for these software dependencies, which are crucial for reproducibility. |
| Experiment Setup | Yes | For the left plot, we fix q = 441, which is an allowed sketch size for Count Sketch. For the right plot, we fix λ = 0.2 and sweep through the choices of q which are allowed by Count Sketch, which are q ∈ {63, 126, 189, 252, 315, 378, 441, 504, 567}. ... Our sketching ensembles have K = 5. |
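
Since the paper provides neither pseudocode nor released code, the following is a minimal sketch of what a sketched ridge ensemble with the quoted settings (q = 441, λ = 0.2, K = 5) might look like. It is not the authors' implementation. It assumes feature sketching with a textbook Count Sketch matrix (one random ±1 entry per row, no extra normalization), a ridge objective scaled by 1/n, and synthetic Gaussian data in place of RCV1 or RNA-Seq. The function names `count_sketch`, `sketched_ridge`, and `sketched_ridge_ensemble` are hypothetical.

```python
import numpy as np


def count_sketch(p, q, rng):
    """Dense p x q Count Sketch matrix: each row has a single random +/-1 entry.

    Assumption: textbook Count Sketch; the paper's exact construction and
    normalization may differ.
    """
    S = np.zeros((p, q))
    buckets = rng.integers(0, q, size=p)      # hash each feature to one bucket
    signs = rng.choice([-1.0, 1.0], size=p)   # random sign per feature
    S[np.arange(p), buckets] = signs
    return S


def sketched_ridge(X, y, S, lam):
    """Ridge regression on the sketched features X @ S, mapped back to R^p."""
    n = X.shape[0]
    q = S.shape[1]
    XS = X @ S
    # Solve (XS^T XS / n + lam * I) b = XS^T y / n in the q-dimensional space.
    b = np.linalg.solve(XS.T @ XS / n + lam * np.eye(q), XS.T @ y / n)
    return S @ b


def sketched_ridge_ensemble(X, y, q, lam, K, seed=0):
    """Average the coefficients of K ridge fits with independent sketches."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    betas = [sketched_ridge(X, y, count_sketch(p, q, rng), lam) for _ in range(K)]
    return np.mean(betas, axis=0)


# Synthetic stand-in data (hypothetical sizes, not the paper's datasets).
data_rng = np.random.default_rng(0)
n, p = 200, 600
X = data_rng.standard_normal((n, p))
beta_true = data_rng.standard_normal(p) / np.sqrt(p)
y = X @ beta_true + 0.1 * data_rng.standard_normal(n)

beta_hat = sketched_ridge_ensemble(X, y, q=441, lam=0.2, K=5)
```

Because each sketch reduces the linear system from p to q dimensions, sweeping q over the listed sizes amounts to calling `sketched_ridge_ensemble` once per value while holding λ and K fixed.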