Neural Collapse in Multi-label Learning with Pick-all-label Loss
Authors: Pengyu Li, Xiao Li, Yutong Wang, Qing Qu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first conduct a series of experiments to demonstrate and analyze the M-lab NC on different practical deep networks with various multi-label datasets. Second, we show that the geometric structure of M-lab NC can efficiently guide M-lab learning in both the training and testing stages for better performance. |
| Researcher Affiliation | Academia | 1Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, USA. |
| Pseudocode | No | The paper does not contain pseudocode or explicit algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/Heimine/NC_MLab/. |
| Open Datasets | Yes | The datasets used in our experiment are real-world M-lab SVHN (Netzer et al., 2011), along with synthetically generated M-lab MNIST (LeCun et al., 2010) and M-lab CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | No | The paper states 'The testing datasets are independently generated, each with a sample size equivalent to 20% of the training datasets.' but does not explicitly detail a separate validation split. |
| Hardware Specification | No | The paper mentions training models like ResNet18 and ResNets, and acknowledges support from 'Cloud Bank', but does not provide specific details on hardware specifications such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using 'SGD optimizer' and 'PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for software dependencies like PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | Throughout all the experiments, we use an SGD optimizer with fixed batch size 128, weight decay (λ_W, λ_H) = (5×10^-4, 5×10^-4) and momentum 0.9. The learning rate is initially set to 1×10^-1 and dynamically decays to 1×10^-3 following a Cosine Annealing learning rate scheduler... The total number of epochs is set to 200 for all experiments. (See the sketch following this table.) |
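
The Experiment Setup row above fully specifies the optimizer and schedule, so it can be translated into a minimal PyTorch sketch. The hyperparameters (batch size 128, weight decay 5×10^-4, momentum 0.9, cosine-annealed learning rate from 1e-1 to 1e-3, 200 epochs) come from the quoted text; the dataset, model, and loss below are placeholders standing in for the authors' M-lab data, ResNet18, and pick-all-label loss, not their actual code.

```python
# Hedged sketch of the quoted training configuration; data/model/loss are placeholders.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader, TensorDataset

EPOCHS = 200
BATCH_SIZE = 128

# Placeholder multi-label data and model; the paper trains ResNet18 on
# M-lab MNIST / CIFAR-10 / SVHN.
train_set = TensorDataset(
    torch.randn(1024, 3, 32, 32),                     # dummy images
    torch.randint(0, 2, (1024, 10)).float(),          # dummy multi-hot labels
)
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

# SGD with momentum 0.9 and weight decay 5e-4, as reported in the paper.
optimizer = optim.SGD(model.parameters(), lr=1e-1, momentum=0.9, weight_decay=5e-4)
# Cosine annealing from the initial LR (1e-1) down to eta_min = 1e-3 over 200 epochs.
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS, eta_min=1e-3)
# The paper uses a pick-all-label loss; BCEWithLogitsLoss is only a stand-in here.
criterion = nn.BCEWithLogitsLoss()

for epoch in range(EPOCHS):
    for inputs, targets in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```

Note that PyTorch's `weight_decay` argument applies a single penalty to all parameters, whereas the paper's (λ_W, λ_H) pair refers to separate penalties on the classifier weights and last-layer features; since both values are 5×10^-4, a single `weight_decay` is a reasonable approximation for this sketch.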