Quantifying the Variability Collapse of Neural Networks

Authors: Jing Xu, Haoxiong Liu

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments verify that VCI is indicative of the variability collapse and the transferability of pretrained neural networks.
Researcher Affiliation | Academia | Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China.
Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository for the described methodology.
Open Datasets | Yes | We evaluate the metrics on the feature layer of ResNet18 (He et al., 2016) trained on CIFAR10 (Krizhevsky et al., 2009) and ResNet50 / variants of ViT (Dosovitskiy et al., 2020) trained on ImageNet-1K with AutoAugment (Cubuk et al., 2018) for 300 epochs.
Dataset Splits | No | We use L-BFGS to train the linear classifier, with the optimal L2-penalty strength determined by searching through 97 logarithmically spaced values between 10^-6 and 10^6 on a validation set. (A linear-probe sketch follows the table.)
Hardware Specification | Yes | ResNet18s are trained on one NVIDIA GeForce RTX 3090 GPU; ResNet50s and ViT variants are trained on four GPUs.
Software Dependencies | No | The paper mentions software such as PyTorch, torchvision, and optimizers like SGD and AdamW, but does not provide specific version numbers for any of these components.
Experiment Setup | Yes | The batch size for each GPU is set to 256. The maximum learning rate is set to 0.1 × batch size / 256. We try both the cosine annealing and step-wise learning rate decay schedulers. When using a step-wise learning rate decay schedule, the learning rate is decayed by a factor of 0.975 every epoch. We also use a linear warmup procedure of 10 epochs, starting from an initial learning rate of 10^-5. The weight-decay factor is set to 8 × 10^-5. (A schedule sketch follows the table.)
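
The Dataset Splits row quotes a linear-probe evaluation: an L-BFGS-trained linear classifier with the L2-penalty strength chosen over 97 log-spaced values between 10^-6 and 10^6 on a validation set. The sketch below is one plausible reading of that setup, not the authors' released code; it assumes pre-extracted feature and label arrays and uses scikit-learn's LogisticRegression, whose C parameter is the inverse of the penalty strength.

```python
# Hypothetical linear-probe sweep; feature/label arrays are assumed inputs.
import numpy as np
from sklearn.linear_model import LogisticRegression


def linear_probe_search(train_feats, train_labels, val_feats, val_labels,
                        n_grid=97, low=1e-6, high=1e6):
    """Fit L-BFGS linear classifiers over a log-spaced L2-penalty grid and
    return the penalty strength with the best validation accuracy."""
    best_acc, best_lam = -1.0, None
    for lam in np.logspace(np.log10(low), np.log10(high), n_grid):
        # scikit-learn parameterizes the L2 penalty via its inverse, C = 1 / lambda.
        clf = LogisticRegression(solver="lbfgs", C=1.0 / lam, max_iter=1000)
        clf.fit(train_feats, train_labels)
        acc = clf.score(val_feats, val_labels)
        if acc > best_acc:
            best_acc, best_lam = acc, lam
    return best_lam, best_acc
```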
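
The Experiment Setup row lists the per-GPU batch size, the linear LR scaling rule, a 10-epoch linear warmup from 10^-5, and either cosine annealing or a per-epoch step decay of 0.975, with weight decay 8 × 10^-5. Below is a minimal PyTorch sketch of how such a schedule could be wired together; the momentum value, the specific scheduler classes, and the function name are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the reported schedule; model is a placeholder torch.nn.Module.
import torch


def build_optimizer_and_scheduler(model, batch_size=256, epochs=300,
                                  warmup_epochs=10, use_cosine=True):
    base_lr = 0.1 * batch_size / 256  # linear LR scaling rule from the quote
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr,
                                momentum=0.9,          # assumed, not stated above
                                weight_decay=8e-5)

    # Linear warmup over 10 epochs, starting from an LR of 1e-5.
    warmup = torch.optim.lr_scheduler.LinearLR(
        optimizer, start_factor=1e-5 / base_lr, total_iters=warmup_epochs)

    if use_cosine:
        decay = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=epochs - warmup_epochs)
    else:
        # Step-wise alternative: multiply the LR by 0.975 every epoch.
        decay = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.975)

    scheduler = torch.optim.lr_scheduler.SequentialLR(
        optimizer, schedulers=[warmup, decay], milestones=[warmup_epochs])
    return optimizer, scheduler
```

The returned scheduler is meant to be stepped once per epoch, matching the per-epoch decay factor quoted in the table.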