Restoring balance: principled under/oversampling of data for optimal classification
Authors: Emanuele Loffredo, Mauro Pastore, Simona Cocco, Remi Monasson
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through numerical experiments, we show the relevance of our theoretical predictions on real datasets, on deeper architectures and with sampling strategies based on unsupervised probabilistic models. |
| Researcher Affiliation | Academia | Laboratoire de physique de l'École normale supérieure, CNRS-UMR8023, PSL University, Sorbonne University, Université Paris-Cité, 24 rue Lhomond, 75005 Paris, France. |
| Pseudocode | No | The paper describes methods in prose but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code to reproduce the theoretical curves can be found on GitHub. |
| Open Datasets | Yes | We validated our predictions on (i) the Parity MNIST (pMNIST) dataset...; (ii) Fashion MNIST (FMNIST) with classes containing "Pullover" and "Shirt" images; and (iii) CelebA with classes containing faces with "Straight hair" and "Wavy hair". |
| Dataset Splits | No | The paper mentions 'test set is balanced and has size 1000' for several datasets, but does not provide specific training/validation split percentages or sample counts for reproduction, nor does it cite predefined splits for these datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments, only mentioning 'deeper architectures' generally. |
| Software Dependencies | No | The paper mentions 'Scipy library (Jones et al., 2001)' and 'Scikit-learn package (Pedregosa et al., 2011)' but does not specify their version numbers. |
| Experiment Setup | Yes | We use RMSprop optimizer with a learning rate of $10^{-5}$ and decay of $10^{-5}$ and train for 100 epochs with a batch-size of 128. |
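
The Experiment Setup and Dataset Splits rows can be made concrete with a small sketch. The block below is not the authors' code: `TrainConfig` and `balanced_test_indices` are hypothetical names, the toy label counts are made up, and the learning rate and decay are read as 10^-5 on the assumption that the exponent sign was lost in extraction. It records the quoted hyperparameters and shows one plausible way to draw a balanced test set of size 1000 from an imbalanced label vector, using only the Python standard library.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the Experiment Setup row (hypothetical container)."""
    optimizer: str = "RMSprop"
    learning_rate: float = 1e-5  # read as 10^-5; the exponent sign is an assumption
    decay: float = 1e-5          # same assumption as above
    epochs: int = 100
    batch_size: int = 128


def balanced_test_indices(labels, test_size=1000, seed=0):
    """Return sorted indices of a test set with equal counts per class.

    Hypothetical helper: the paper only states that the test set is
    balanced and has size 1000, not how it was drawn.
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    per_class = test_size // len(classes)
    chosen = []
    for c in classes:
        pool = [i for i, y in enumerate(labels) if y == c]
        if len(pool) < per_class:
            raise ValueError(f"class {c!r} has only {len(pool)} examples")
        # Sample without replacement so no test example is duplicated.
        chosen.extend(rng.sample(pool, per_class))
    return sorted(chosen)


# Toy imbalanced labels: 9000 negatives and 3000 positives (made-up numbers).
labels = [0] * 9000 + [1] * 3000
test_idx = balanced_test_indices(labels)
config = TrainConfig()
```

The remaining indices would form the imbalanced training pool, to which under- or oversampling could then be applied. Fixing the seed keeps the split reproducible, which is the gap the Dataset Splits row points out.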