Neural collapse vs. low-rank bias: Is deep neural collapse really optimal?

Authors: Peter Súkeník, Christoph H. Lampert, Marco Mondelli

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We support our theoretical findings with experiments on both DUFM and real data, which show the emergence of the low-rank structure in the solution found by gradient descent." ... (Section 6, Numerical results)
Researcher Affiliation | Academia | Peter Súkeník, Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria, peter.sukenik@ista.ac.at; Christoph Lampert, Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria, chl@ista.ac.at; Marco Mondelli, Institute of Science and Technology Austria, 3400 Klosterneuburg, Austria, marco.mondelli@ista.ac.at
Pseudocode | No | The paper describes mathematical constructions and procedures in prose, such as Definition 4 detailing the "strongly regular graph (SRG) solution", but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states only that "the source code will be provided on request"; no repository is released.
Open Datasets | Yes | "We support our theoretical results with empirical findings in three regimes: ... training on standard datasets (MNIST [31], CIFAR-10 [28]) with DUFM-like regularization..."
Dataset Splits | No | The paper uses standard datasets (MNIST, CIFAR-10) that come with predefined splits, but it does not explicitly state the training, validation, and test splits or percentages used for the experiments. The NeurIPS checklist also indicates that not all experimental details are fully provided.
Hardware Specification | No | The paper does not specify the compute resources used; the NeurIPS Paper Checklist states only that "The experiments do not require any specific hardware setup."
Software Dependencies | No | The paper mentions general frameworks and models such as "ResNet20" and "MLP head", but it does not provide specific version numbers for any software dependencies or libraries (e.g., PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | "In the top row of Figure 2, we consider a 4-DUFM, with K = 10 and n = 50, presenting the training progression... λ = 0.004 for all regularization parameters, learning rate of 0.5 and width 30. ... We use weight decay 0.005 except λ_H1 = 0.000005 (to compensate for n = 5000, which significantly influences the total regularization strength), learning rate 0.05 and width 64 for all the MLP layers."
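
The Experiment Setup row quotes the DUFM hyperparameters for the top row of Figure 2 (4-DUFM, K = 10, n = 50, width 30, λ = 0.004 for all regularization parameters, learning rate 0.5). The sketch below shows how such a run could plausibly be set up in PyTorch; it is not the authors' code. The layer bookkeeping for a "4-DUFM", the ReLU nonlinearity, the cross-entropy loss, the initialization, and the number of gradient-descent steps are all assumptions. The final loop prints the singular values of each trained weight matrix, which is one way to inspect the low-rank structure the paper reports.

import torch
import torch.nn.functional as F

# Hyperparameters quoted in the Experiment Setup row (top row of Figure 2).
K, n, width, n_layers = 10, 50, 30, 4      # classes, samples per class, layer width, DUFM depth
lam, lr, steps = 0.004, 0.5, 20_000        # regularization and learning rate; step count is a guess

N = K * n
labels = torch.arange(K).repeat_interleave(n)   # class index of each of the N = K*n samples

# In a (deep) unconstrained features model, the first-layer features H1 are free
# parameters trained jointly with the weight matrices; there is no backbone network.
H1 = torch.nn.Parameter(torch.randn(width, N))
hidden = [torch.nn.Parameter(torch.randn(width, width) / width ** 0.5) for _ in range(n_layers - 1)]
W_out = torch.nn.Parameter(torch.randn(K, width) / width ** 0.5)
params = [H1, *hidden, W_out]

opt = torch.optim.SGD(params, lr=lr)            # plain full-batch gradient descent

for _ in range(steps):
    opt.zero_grad()
    H = H1
    for W in hidden:
        H = F.relu(W @ H)                       # assumption: ReLU between the linear layers
    logits = W_out @ H                          # K x N class scores
    # Cross-entropy fit term plus the same lambda on the squared Frobenius norm of every
    # parameter block ("lambda = 0.004 for all regularization parameters").
    loss = F.cross_entropy(logits.T, labels) + lam * sum((p ** 2).sum() for p in params)
    loss.backward()
    opt.step()

# Inspect the singular values of each trained weight matrix; a rapid decay
# indicates the low-rank structure discussed in the paper.
for i, W in enumerate([*hidden, W_out], start=1):
    print(f"W{i} singular values:", torch.linalg.svdvals(W.detach()))

For the real-data runs quoted in the same row (weight decay 0.005, λ_H1 = 0.000005, learning rate 0.05, width-64 MLP layers), the analogous setup would replace the free parameter H1 with features produced by a backbone such as the ResNet20 mentioned under Software Dependencies, keeping the DUFM-like regularization on the MLP head.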