Guiding Neural Collapse: Optimising Towards the Nearest Simplex Equiangular Tight Frame
Authors: Evan Markou, Thalaiyasingam Ajanthan, Stephen Gould
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments on synthetic and real-world architectures for classification tasks demonstrate that our approach accelerates convergence and enhances training stability. ... In our experiments, we perform feature normalisation onto a hypersphere... Our method underwent rigorous evaluation across various UFM sizes and real model architectures trained on actual datasets, including CIFAR10 [37], CIFAR100 [37], STL10 [14], and ImageNet1000 [15], implemented on ResNet [29] and VGG [56] architectures. |
| Researcher Affiliation | Collaboration | Evan Markou Australian National University evan.markou@anu.edu.au Thalaiyasingam Ajanthan Australian National University & Amazon thalaiyasingam.ajanthan@anu.edu.au Stephen Gould Australian National University stephen.gould@anu.edu.au |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at https://github.com/evanmarkou/Guiding-Neural-Collapse.git. |
| Open Datasets | Yes | Our method underwent rigorous evaluation across various UFM sizes and real model architectures trained on actual datasets, including CIFAR10 [37], CIFAR100 [37], STL10 [14], and ImageNet1000 [15], implemented on ResNet [29] and VGG [56] architectures. |
| Dataset Splits | No | Our experiments on real datasets run for 200 epochs with batch size 256; for the UFM analysis, we run 2000 iterations. ... Numerical results for the top-1 train and test accuracy are reported in Tables 1 and 2, respectively. |
| Hardware Specification | Yes | All experiments were conducted using Nvidia RTX3090 and A100 GPUs. |
| Software Dependencies | No | We solve the Riemannian optimisation problem defined in Equation 7 using a Riemannian Trust-Region method [1] from pyManopt [60]. Following the authors' recommendation, we set the gain/momentum parameter to 10 to expedite convergence, aligning it with other widely used optimisers like Adam [36] and SGD. |
| Experiment Setup | Yes | Our experiments on real datasets run for 200 epochs with batch size 256; for the UFM analysis, we run 2000 iterations. ... We maintain a proximal coefficient δ set to 10^-3 consistently across all experiments. ... Specifically, we set α = 2/(T + 1), where T represents the number of iterations. Additionally, we include a thresholding value of 10^-4... Finally, in our experiments, we set the temperature parameter τ to five. |
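The Software Dependencies row notes that the paper solves a Riemannian optimisation problem with a trust-region method from pyManopt, but no code for this step is reproduced in the report. As a point of reference only, the sketch below illustrates the simplest instance of optimisation on a sphere manifold: plain Riemannian gradient descent (projected gradient plus renormalisation retraction), not the authors' trust-region method, applied to an assumed toy cost f(w) = wᵀAw over the unit sphere, whose minimiser is the eigenvector of A with the smallest eigenvalue.

```python
import numpy as np

# Illustrative sketch, NOT the paper's implementation: Riemannian
# gradient descent on the unit sphere S^{d-1}. The toy cost matrix A
# is an assumption chosen so the optimum is known in closed form.
rng = np.random.default_rng(0)
A = np.diag([1.0, 2.0, 3.0, 4.0, 5.0])  # smallest eigenvalue is 1.0

w = rng.standard_normal(5)
w /= np.linalg.norm(w)  # initialise on the sphere

for _ in range(2000):
    egrad = 2.0 * A @ w                  # Euclidean gradient of w^T A w
    rgrad = egrad - (w @ egrad) * w      # project onto the tangent space at w
    w = w - 0.01 * rgrad                 # gradient step in the tangent space
    w /= np.linalg.norm(w)               # retraction: map back onto the sphere

print(float(w @ A @ w))  # converges towards 1.0, the smallest eigenvalue
```

A trust-region method such as pyManopt's additionally uses (approximate) second-order information within a tangent-space subproblem, which is why the paper favours it for fast convergence; the first-order loop above only conveys the projection-and-retraction structure shared by both.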