Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective

Authors: Haixiang Sun, Ye Shi

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we validate our theoretical analyses through experiments in both balanced and imbalanced scenarios."
Researcher Affiliation | Academia | Haixiang Sun, ShanghaiTech University, sunhx@shanghaitech.edu.cn; Ye Shi, ShanghaiTech University, shiye@shanghaitech.edu.cn
Pseudocode | No | No structured pseudocode or algorithm blocks were found.
Open Source Code | Yes | "We will release the code once the paper is accepted."
Open Datasets | Yes | "Experimental results on CIFAR-10 and CIFAR-100 validated our theoretical findings for distinguishing the differences between DEQ and explicit neural networks."
Dataset Splits | Yes | "We conducted experiments with varying configurations of majority and minority class counts and imbalance degrees. Assume the numbers of majority and minority classes are (K_A, K_B) with corresponding sample sizes (n_A, n_B); the imbalance degree is denoted as R = n_A / n_B. We considered different setups for majority and minority class quantities, such as (3, 7), (5, 5), and (7, 3). Additionally, we varied the ratio R of sample quantities between majority and minority classes with values of 10, 50, and 100." (A sketch of constructing such a split appears below the table.)
Hardware Specification | Yes | "All experiments were implemented using PyTorch on NVIDIA Tesla A40 48GB."
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number or other software dependencies with version numbers.
Experiment Setup | Yes | "We implement the solver with a threshold ϵ set to 10^-3 and introduce an early stopping mechanism. If convergence is not achieved within T = 20 iterations, we terminate the fixed-point iteration. During training, we set the learning rate to 1×10^-4 and utilize stochastic gradient descent with a momentum of 0.9 and weight decay of 5×10^-4. Both E_W and E_H are set to 0.01. The training phase for each network consists of 100 epochs, with a batch size of 128." (A solver and optimizer sketch appears below the table.)
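
The imbalanced split described in the Dataset Splits row can be reproduced mechanically: keep K_A majority classes at a base per-class count n_A and downsample the remaining K_B classes to n_B = n_A / R. Below is a minimal sketch assuming CIFAR-10 via torchvision; the helper name `imbalanced_indices`, the fixed seed, and the base count of 5,000 (the full per-class size of CIFAR-10) are illustrative choices, not details from the paper.

```python
import numpy as np
from torchvision import datasets

def imbalanced_indices(targets, k_major, n_major, ratio, num_classes=10, seed=0):
    """Subsample per-class indices: the first k_major classes keep n_major
    samples each; the remaining classes keep n_major // ratio (n_B = n_A / R)."""
    rng = np.random.default_rng(seed)
    targets = np.asarray(targets)
    keep = []
    for c in range(num_classes):
        idx = np.where(targets == c)[0]
        n = n_major if c < k_major else max(1, n_major // ratio)
        keep.extend(rng.choice(idx, size=n, replace=False).tolist())
    return keep

train = datasets.CIFAR10(root="./data", train=True, download=True)
# e.g. (K_A, K_B) = (3, 7) with R = 10: three majority classes at 5000
# samples each, seven minority classes at 500 samples each
idx = imbalanced_indices(train.targets, k_major=3, n_major=5000, ratio=10)
# torch.utils.data.Subset(train, idx) then yields the imbalanced training set
```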
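The Experiment Setup row likewise maps onto a few lines of PyTorch. The sketch below is a toy stand-in, not the paper's architecture: a single equilibrium layer z* = tanh(Wz* + Ux) solved by naive forward iteration with the reported relative-residual threshold ϵ = 10^-3 and the T = 20 iteration cap, followed by the reported SGD configuration. Real DEQ training also backpropagates through the fixed point (e.g., via implicit differentiation), which is omitted here; `DEQLayer` and `dim` are hypothetical names.

```python
import torch
import torch.nn as nn

class DEQLayer(nn.Module):
    """Toy equilibrium layer: z* = tanh(W z* + U x)."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.U = nn.Linear(dim, dim)

    def f(self, z, x):
        return torch.tanh(self.W(z) + self.U(x))

    def forward(self, x, eps=1e-3, max_iter=20):
        z = torch.zeros_like(x)
        for _ in range(max_iter):  # early stopping: give up after T = 20 steps
            z_next = self.f(z, x)
            converged = (z_next - z).norm() / (z.norm() + 1e-8) < eps
            z = z_next
            if converged:
                break
        return z

model = DEQLayer(dim=64)
# Reported hyperparameters: lr 1e-4, momentum 0.9, weight decay 5e-4;
# the 100-epoch, batch-size-128 training loop itself is omitted
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4,
                            momentum=0.9, weight_decay=5e-4)
```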