Memorization-Dilation: Modeling Neural Collapse Under Noise
Authors: Duc Anh Nguyen, Ron Levie, Julian Lienen, Eyke Hüllermeier, Gitta Kutyniok
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evidence shows that the memorization of noisy data points leads to a degradation (dilation) of neural collapse; the paper confirms this experimentally: "To this end, we trained simple multi-layer neural networks for two classes (N = 2), which we subsampled from the image classification datasets MNIST (LeCun et al., 1998), Fashion MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky & Hinton, 2009) and SVHN (Netzer et al., 2011)." |
| Researcher Affiliation | Academia | Duc Anh Nguyen (LMU Munich), Ron Levie (Technion - Israel Institute of Technology), Julian Lienen (Paderborn University), Gitta Kutyniok (LMU Munich, University of Tromsø), Eyke Hüllermeier (LMU Munich) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any specific repository link or explicit statement about the release of source code for the methodology described. |
| Open Datasets | Yes | To this end, we trained simple multi-layer neural networks for two classes (N = 2), which we subsampled from the image classification datasets MNIST (LeCun et al., 1998), Fashion MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky & Hinton, 2009) and SVHN (Netzer et al., 2011). |
| Dataset Splits | No | The paper mentions training data and test instances, but does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions using certain loss functions and activation functions, but does not provide specific ancillary software details (e.g., library or solver names with version numbers like PyTorch 1.9 or Python 3.8). |
| Experiment Setup | Yes | The network consists of 9 hidden layers with 2048 neurons each... The feature dimension M is set to the number of classes N. We trained these networks using the CE and LS loss with a smoothing factor α = 0.1... The networks were trained until convergence in 200 epochs... using SGD with an initial learning rate of 0.1, multiplied by 0.1 every 40 epochs, and a small weight decay of 0.001. Moreover, we considered ReLU as the activation function throughout the network, as well as batch normalization in each hidden layer. A linear softmax classifier is composed on top of the encoder. We conducted each experiment ten times with different seeds. |
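
The paper does not release code, but the two-class (N = 2) subsampling quoted in the table can be illustrated with a short sketch. The following is a minimal, hypothetical example using torchvision's MNIST; the chosen class pair (0 vs. 1) and the helper name `two_class_subset` are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the two-class (N = 2) subsampling described in the paper.
# Class choice and helper names are assumptions; the paper does not specify them.
import torch
from torch.utils.data import Subset, DataLoader
from torchvision import datasets, transforms

def two_class_subset(dataset, classes=(0, 1)):
    """Keep only the samples whose label is one of the two given classes."""
    targets = torch.as_tensor(dataset.targets)
    mask = (targets == classes[0]) | (targets == classes[1])
    indices = torch.nonzero(mask, as_tuple=False).flatten().tolist()
    return Subset(dataset, indices)

train_set = datasets.MNIST(root="data", train=True, download=True,
                           transform=transforms.ToTensor())
# With classes (0, 1) the original labels already match the two class indices,
# so no label remapping is needed for a cross-entropy loss.
binary_train = two_class_subset(train_set, classes=(0, 1))
train_loader = DataLoader(binary_train, batch_size=128, shuffle=True)
```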
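Similarly, the reported experiment setup (9 hidden layers of 2048 units with batch normalization and ReLU, feature dimension M = N, a linear softmax classifier on top of the encoder, CE/LS loss with smoothing factor α = 0.1, SGD at learning rate 0.1 decayed by 0.1 every 40 epochs, weight decay 0.001, 200 epochs) can be sketched as follows. This is a hedged reconstruction in PyTorch, not the authors' implementation; the inner training loop is left as a comment.

```python
# Hypothetical PyTorch sketch of the setup described in the "Experiment Setup" row.
import torch
import torch.nn as nn

N_CLASSES = 2             # two-class subsets (N = 2)
FEATURE_DIM = N_CLASSES   # feature dimension M is set to the number of classes N
WIDTH = 2048
N_HIDDEN = 9

def make_encoder(input_dim: int) -> nn.Sequential:
    """MLP encoder: 9 hidden layers of 2048 units, each with BatchNorm and ReLU."""
    layers, in_dim = [], input_dim
    for _ in range(N_HIDDEN):
        layers += [nn.Linear(in_dim, WIDTH), nn.BatchNorm1d(WIDTH), nn.ReLU()]
        in_dim = WIDTH
    layers.append(nn.Linear(in_dim, FEATURE_DIM))  # project to the M-dimensional feature space
    return nn.Sequential(*layers)

encoder = make_encoder(input_dim=28 * 28)        # e.g. flattened MNIST images (assumption)
classifier = nn.Linear(FEATURE_DIM, N_CLASSES)   # linear softmax classifier on top of the encoder
model = nn.Sequential(encoder, classifier)

# CE loss; label_smoothing=0.1 gives the LS variant with smoothing factor alpha = 0.1
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=40, gamma=0.1)

for epoch in range(200):                         # trained until convergence in 200 epochs
    # for x, y in train_loader:                  # train_loader as in the subsampling sketch above
    #     optimizer.zero_grad()
    #     loss = criterion(model(x.flatten(1)), y)
    #     loss.backward()
    #     optimizer.step()
    scheduler.step()
```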