On the interplay between data structure and loss function in classification problems

Authors: Stéphane d'Ascoli, Marylou Gabrié, Levent Sagun, Giulio Biroli

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Our insights are confirmed by numerical experiments on MNIST and CIFAR10. We validate our insights via controlled experiments on the MNIST and CIFAR10 datasets described in Section 5. |
| Researcher Affiliation | Collaboration | Stéphane d'Ascoli (Facebook AI Research, Paris; Department of Physics, École Normale Supérieure, Paris; stephane.dascoli@ens.fr); Marylou Gabrié (New York University, New York; Flatiron Institute, New York; mgabrie@nyu.edu); Levent Sagun (Facebook AI Research, Paris); Giulio Biroli (Department of Physics, École Normale Supérieure, Paris) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code to reproduce our experiments is available at https://github.com/sdascoli/data-structure. |
| Open Datasets | Yes | Our insights are confirmed by numerical experiments on MNIST and CIFAR10. We validate our insights via controlled experiments on the MNIST and CIFAR10 datasets described in Section 5. We consider two realistic binary classification tasks: parity of digits in the MNIST dataset and airplanes vs cars in the CIFAR10 dataset. |
| Dataset Splits | No | The paper states 'N/D = 1' and refers to the number of training examples (N), but does not specify explicit percentages or sample counts for training, validation, or test splits. It implies using standard datasets but does not detail how *their* data was partitioned. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific version numbers for any software components or libraries used in the experiments. |
| Experiment Setup | Yes | We set σ = Tanh, λ = 10^{-3} and N/D = 1. (Figure 2 caption) We set σ = Tanh and λ = 10^{-4}. (Figure 3 caption) We set σ = Tanh, λ = 10^{-3} and N/D = 1. (Figure 6 caption) |
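
The two binary tasks named under Open Datasets (parity of MNIST digits, airplanes vs cars in CIFAR10) can be derived from the standard class labels. The following is a minimal sketch, not the authors' code: the function names, the ±1 sign conventions, and the reading of D in N/D = 1 as the flattened input dimension are all assumptions made for illustration. The CIFAR10 indices 0 (airplane) and 1 (automobile) follow the dataset's standard label ordering.

```python
import random

# Standard CIFAR10 label ordering: 0 = airplane, 1 = automobile.
CIFAR10_AIRPLANE = 0
CIFAR10_AUTOMOBILE = 1


def mnist_parity_labels(digits):
    """Map digit labels 0..9 to +1 (odd) or -1 (even).

    The sign convention is an assumption, not taken from the paper.
    """
    return [1 if d % 2 == 1 else -1 for d in digits]


def cifar10_airplane_vs_car(labels):
    """Keep only airplane and automobile examples.

    Returns (kept_indices, binary_labels), with airplane -> -1 and
    automobile -> +1 (hypothetical sign convention).
    """
    kept, binary = [], []
    for i, y in enumerate(labels):
        if y == CIFAR10_AIRPLANE:
            kept.append(i)
            binary.append(-1)
        elif y == CIFAR10_AUTOMOBILE:
            kept.append(i)
            binary.append(1)
    return kept, binary


def subsample_indices(n_available, input_dim, ratio=1.0, seed=0):
    """Pick N = ratio * input_dim example indices without replacement.

    Assumes D in 'N/D = 1' is the input dimension; the paper summary
    above does not define D, so treat this as illustrative only.
    """
    n = int(round(ratio * input_dim))
    if n > n_available:
        raise ValueError("not enough examples for the requested ratio")
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_available), n))
```

Under these assumptions, `subsample_indices(60000, 784)` would select 784 of the 60,000 MNIST training images, one example per input pixel. How the authors actually partitioned the data is not specified, as the Dataset Splits row notes.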