On the Spectrum of Random Features Maps of High Dimensional Data
Authors: Zhenyu Liao, Romain Couillet
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We complete this article by showing that our theoretical results, derived from Gaussian mixture models, show an unexpectedly close match in practice when applied to some real-world datasets. We consider two different types of classification tasks: one on handwritten digits of the popular MNIST (LeCun et al., 1998) database (digits 6 and 8), and the other on epileptic EEG time series data (Andrzejak et al., 2001) (sets B and E). |
| Researcher Affiliation | Academia | Laboratoire des Signaux et Systèmes (L2S), CentraleSupélec, Université Paris-Saclay, France; G-STATS Data Science Chair, GIPSA-lab, Université Grenoble Alpes, France. Correspondence to: Zhenyu Liao <zhenyu.liao@l2s.centralesupelec.fr>, Romain Couillet <romain.couillet@centralesupelec.fr>. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Python 3 codes to reproduce the results in this section are available at https://github.com/Zhenyu-LIAO/RMT4RFM. |
| Open Datasets | Yes | We consider two different types of classification tasks: one on handwritten digits of the popular MNIST (LeCun et al., 1998) database (digits 6 and 8), and the other on epileptic EEG time series data (Andrzejak et al., 2001) (sets B and E). |
| Dataset Splits | No | The paper mentions 'randomly selected vectorized images' and 'randomly picked EEG segments' for constructing the Gram matrix and performing spectral clustering, but it does not provide specific details on train/validation/test dataset splits for model training. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Python 3 codes' but does not pin an exact Python version or give version numbers for any key libraries or dependencies used in the experiments. |
| Experiment Setup | No | The paper reports the data dimensions (p, T) and the number of random features (n) used in the analysis, and states that expectations are 'estimated by averaging over 500 realizations of W' and that accuracies are 'averaged over 50 runs', but it does not provide training hyperparameters (e.g., learning rates, batch sizes, or optimizer settings) or a detailed configuration of the k-means step; a minimal sketch of the implied pipeline is given after this table. |
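
For context, the following is a minimal sketch of the kind of experiment the rows above refer to: build a random-feature Gram matrix, run k-means on its leading eigenvectors, and average the clustering accuracy over independent draws of W. This is not the authors' released code (see https://github.com/Zhenyu-LIAO/RMT4RFM for that); the dimensions, the ReLU nonlinearity, the Gaussian-mixture surrogate data, and the use of two leading eigenvectors are illustrative assumptions.

```python
# Minimal sketch (assumptions noted above): random-feature Gram matrix +
# spectral clustering via k-means, accuracy averaged over draws of W.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

p, T_per_class, n = 784, 100, 512   # data dim, samples per class, # random features (assumed values)

# Two-class Gaussian-mixture surrogate for the real data
# (the paper uses MNIST digits 6 vs 8 and EEG sets B vs E instead).
X = np.hstack([rng.normal(+1.0, 1.0, (p, T_per_class)),
               rng.normal(-1.0, 1.0, (p, T_per_class))]) / np.sqrt(p)
y = np.repeat([0, 1], T_per_class)

def clustering_accuracy(labels, y):
    """Accuracy up to the label-permutation ambiguity of 2-class k-means."""
    acc = np.mean(labels == y)
    return max(acc, 1.0 - acc)

accs = []
for _ in range(50):                        # the paper averages accuracies over 50 runs
    W = rng.normal(size=(n, p))            # random weights with i.i.d. standard Gaussian entries
    Sigma = np.maximum(W @ X, 0.0)         # ReLU random features (one possible choice of sigma)
    G = Sigma.T @ Sigma / n                # Gram matrix of the random feature map
    # Spectral clustering: k-means on the top eigenvectors of G
    eigvals, eigvecs = np.linalg.eigh(G)   # eigenvalues in ascending order
    top = eigvecs[:, -2:]                  # two leading eigenvectors (assumed choice)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(top)
    accs.append(clustering_accuracy(labels, y))

print(f"mean clustering accuracy over {len(accs)} draws of W: {np.mean(accs):.3f}")
```

Replacing the surrogate data with vectorized MNIST images or EEG segments, and varying the nonlinearity sigma, would mirror the structure of the experiments referenced in the table.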