Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame

Authors: Evan Markou, Thalaiyasingam Ajanthan, Stephen Gould

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on synthetic and real-world architectures for classification tasks demonstrate that our approach accelerates convergence and enhances training stability. ... In our experiments, we perform feature normalisation onto a hypersphere... Our method underwent rigorous evaluation across various UFM sizes and real model architectures trained on actual datasets, including CIFAR10 [37], CIFAR100 [37], STL10 [14], and ImageNet1000 [15], implemented on ResNet [29] and VGG [56] architectures. (A feature-normalisation sketch follows the table.)
Researcher Affiliation | Collaboration | Evan Markou (Australian National University, evan.markou@anu.edu.au); Thalaiyasingam Ajanthan (Australian National University & Amazon, thalaiyasingam.ajanthan@anu.edu.au); Stephen Gould (Australian National University, stephen.gould@anu.edu.au)
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | Yes | Code available at https://github.com/evanmarkou/Guiding-Neural-Collapse.git.
Open Datasets | Yes | Our method underwent rigorous evaluation across various UFM sizes and real model architectures trained on actual datasets, including CIFAR10 [37], CIFAR100 [37], STL10 [14], and ImageNet1000 [15], implemented on ResNet [29] and VGG [56] architectures.
Dataset Splits | No | Our experiments on real datasets run for 200 epochs with batch size 256; for the UFM analysis, we run 2000 iterations. ... Numerical results for the top-1 train and test accuracy are reported in Tables 1 and 2, respectively.
Hardware Specification | Yes | All experiments were conducted using Nvidia RTX3090 and A100 GPUs.
Software Dependencies | No | We solve the Riemannian optimisation problem defined in Equation 7 using a Riemannian Trust-Region method [1] from Pymanopt [60]. Following the authors' recommendation, we set the gain/momentum parameter to 10 to expedite convergence, aligning it with other widely used optimisers like Adam [36] and SGD. (A Pymanopt usage sketch follows the table.)
Experiment Setup | Yes | Our experiments on real datasets run for 200 epochs with batch size 256; for the UFM analysis, we run 2000 iterations. ... We maintain a proximal coefficient δ set to 10^-3 consistently across all experiments. ... Specifically, we set α = 2/(T + 1), where T represents the number of iterations. Additionally, we include a thresholding value of 10^-4... Finally, in our experiments, we set the temperature parameter τ to five. (These values are collected in a sketch after the table.)
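
As context for the Research Type row, "feature normalisation onto a hypersphere" typically means rescaling each feature vector to a fixed norm. A minimal PyTorch sketch, assuming unit radius and row-wise feature vectors (both assumptions, not details taken from the paper):

import torch
import torch.nn.functional as F

def normalise_features(h: torch.Tensor, radius: float = 1.0) -> torch.Tensor:
    # Project each row (one feature vector) onto a hypersphere of the given radius.
    return radius * F.normalize(h, p=2, dim=1)

features = torch.randn(256, 512)  # a batch of penultimate-layer features
print(normalise_features(features).norm(dim=1))  # every norm equals the radius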
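For the Software Dependencies row: the paper's Equation 7 is not reproduced on this page, so the sketch below only illustrates the Pymanopt trust-region pattern on a stand-in objective (squared distance between a rotated frame and a target, over the orthogonal group via Stiefel(d, d)). The manifold, cost, and dimensions are illustrative assumptions, not the paper's formulation:

import numpy as np
import autograd.numpy as anp
import pymanopt
from pymanopt.manifolds import Stiefel
from pymanopt.optimizers import TrustRegions

d, k = 8, 4
A = np.random.randn(d, k)  # stand-in for a fixed simplex-ETF-like frame
B = np.random.randn(d, k)  # stand-in for the current classifier/features

manifold = Stiefel(d, d)  # square Stiefel manifold = orthogonal matrices

@pymanopt.function.autograd(manifold)
def cost(U):
    # Stand-in objective: how far the rotated frame U @ A is from the target B.
    return anp.linalg.norm(U @ A - B) ** 2

problem = pymanopt.Problem(manifold, cost)
result = TrustRegions(verbosity=0).run(problem)
print(result.cost)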
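For the Experiment Setup row, a short sketch gathering the quoted hyperparameters and evaluating the averaging weight α = 2/(T + 1); the key and function names are hypothetical, chosen here for illustration:

def averaging_weight(T: int) -> float:
    # Quoted schedule: alpha = 2 / (T + 1), where T is the number of iterations.
    return 2.0 / (T + 1)

config = {
    "epochs": 200,           # real-dataset training
    "batch_size": 256,
    "ufm_iterations": 2000,  # UFM analysis
    "proximal_delta": 1e-3,  # proximal coefficient δ
    "threshold": 1e-4,       # thresholding value
    "temperature_tau": 5.0,  # temperature parameter τ
}

print(averaging_weight(config["ufm_iterations"]))  # ≈ 0.0009995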