Learning Curves for Noisy Heterogeneous Feature-Subsampled Ridge Ensembles
Authors: Ben Ruben, Cengiz Pehlevan
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Figure 1a, we confirm the result of the general calculation by comparing with numerical experiments using a synthetic dataset with M = 2000 highly structured features (see caption for details). k = 3 readouts see random, fixed subsets of features. Theory curves are calculated by solving the fixed-point equations (10) numerically for the chosen Σ_s, Σ_0, and {A_r}_{r=1}^k, then evaluating eq. (9). and We also test the effect of heterogeneous ensembling in a realistic classification task. Specifically, we train ensembles of linear classifiers to predict the labels of ImageNet [44] images corresponding to 10 different dog breeds (the Imagewoof task [45]) from their top-hidden-layer representations in a pre-trained ResNeXt deep network [46] (see Appendix E for details). |
| Researcher Affiliation | Academia | Benjamin S. Ruben (1), Cengiz Pehlevan (2,3,4); 1 Biophysics Graduate Program, 2 Center for Brain Science, 3 John A. Paulson School of Engineering and Applied Sciences, 4 Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA 02138 |
| Pseudocode | No | The paper describes methods and processes in narrative text, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks, nor does it present structured steps formatted like code or an algorithm. |
| Open Source Code | Yes | All code used in this paper has been made available online (see https://github.com/benruben87/Learning-Curves-for-Heterogeneous-Feature-Subsampled-Ridge-Ensembles.git). |
| Open Datasets | Yes | Specifically, we train ensembles of linear classifiers to predict the labels of ImageNet [44] images corresponding to 10 different dog breeds (the Imagewoof task [45]) and We construct two datasets using this method, using images from the Imagenette and Imagewoof datasets [45]. |
| Dataset Splits | No | The paper mentions training set sizes and test set sizes (e.g., 'For the Imagewoof task, we construct a training set of size n_tr = 9025 and a test set of size n_test = 3929'), but it does not provide specific details on how these splits were obtained (e.g., split percentages, random seed, or cross-validation folds), nor does it explicitly state the use of a validation set for hyperparameter tuning in its experimental setup. |
| Hardware Specification | No | The paper mentions 'The compute time required to do all of the calculations in this paper is approximately 3 GPU days', but it does not provide specific hardware details such as exact GPU models, CPU models, or memory specifications used for running its experiments. |
| Software Dependencies | Yes | Numerical experiments were performed using the PyTorch library [52]. and We wrote a Mathematica package which can handle multiplication, addition, and inversion of matrices... This package is included as supplemental material to this publication. with a citation '[54] Wolfram Research, Inc. Mathematica, Version 13.3. Champaign, IL, 2023.' |
| Experiment Setup | Yes | In Figure 1a, we confirm the result of the general calculation by comparing with numerical experiments using a synthetic dataset with M = 2000... λ_r = λ (see legend). and At small regularization (λ = 0.001), we find that heterogeneity of the distribution of subsampling fractions (σ > 0) lowers the double-descent peak of an ensemble of linear predictors, while at larger regularization (λ = 0.1), there is little difference between homogeneous and heterogeneous learning curves. (See the illustrative sketch below the table.) |
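
The setup quoted above can be made concrete with a minimal sketch (not the authors' released code) of a feature-subsampled ridge ensemble: each readout is fit by ridge regression on a fixed random subset of features and the readouts' predictions are averaged. It assumes isotropic Gaussian features and a linear teacher rather than the paper's structured covariances Σ_s and Σ_0, and all names (`ridge_fit`, `ensemble_test_mse`) and parameter values are illustrative only; the paper's experiments were run in PyTorch on structured synthetic data and ResNeXt representations.

```python
# Minimal illustration of a feature-subsampled ridge ensemble (NOT the authors' code).
# Assumes isotropic Gaussian features and a linear teacher; the paper uses
# structured covariances and PyTorch. All names and values here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def ridge_fit(X, y, lam):
    """Closed-form ridge weights: w = (X^T X + lam * I)^{-1} X^T y."""
    P = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(P), X.T @ y)

def ensemble_test_mse(M=200, n_train=150, n_test=2000, fracs=(1/3, 1/3, 1/3),
                      lam=1e-3, noise=0.1):
    """Average ridge readouts, each trained on a random feature subset of
    size round(frac * M), and return the test MSE of the ensemble."""
    w_star = rng.standard_normal(M) / np.sqrt(M)          # linear teacher
    X_tr = rng.standard_normal((n_train, M))
    X_te = rng.standard_normal((n_test, M))
    y_tr = X_tr @ w_star + noise * rng.standard_normal(n_train)
    y_te = X_te @ w_star

    preds = np.zeros(n_test)
    for frac in fracs:
        # Each readout sees a fixed random subset of the features.
        idx = rng.choice(M, size=max(1, round(frac * M)), replace=False)
        w_r = ridge_fit(X_tr[:, idx], y_tr, lam)
        preds += X_te[:, idx] @ w_r
    preds /= len(fracs)
    return np.mean((preds - y_te) ** 2)

# Homogeneous vs. heterogeneous subsampling fractions with the same total budget.
print("homogeneous  :", ensemble_test_mse(fracs=(1/3, 1/3, 1/3)))
print("heterogeneous:", ensemble_test_mse(fracs=(0.1, 0.3, 0.6)))
```

Sweeping `n_train` around the feature count at small λ would reproduce the kind of double-descent comparison quoted in the Experiment Setup row; the theory curves in the paper are instead obtained by solving the fixed-point equations (10) numerically and evaluating eq. (9).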