Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity

Authors: Zhanpeng Zhou, Yongyi Yang, Xiaojiang Yang, Junchi Yan, Wei Hu

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We provide comprehensive empirical evidence for LLFC across a wide range of settings, demonstrating that whenever two trained networks satisfy LMC (via either spawning or permutation methods), they also satisfy LLFC in nearly all the layers." |
| Researcher Affiliation | Academia | ¹Dept. of Computer Science and Engineering & MoE Key Lab of AI, Shanghai Jiao Tong University; ²Dept. of Electrical Engineering & Computer Science, University of Michigan |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "We released our source code at https://github.com/zzp1012/LLFC." |
| Open Datasets | Yes | "we perform our experiments on commonly used image classification datasets MNIST [18], CIFAR10 [15], and Tiny-ImageNet [17]" |
| Dataset Splits | No | The paper mentions using a "training set" and "test set" for evaluations but does not provide specific details on a validation set or its split. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions optimization algorithms like Adam and SGD but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The detailed settings and hyper-parameters are described in Appendix B.1. Examples from B.1: "Optimization is done with the Adam algorithm and a learning rate of 1.2 × 10⁻⁴. The batch size is set to 60 and the total number of training epochs is 30."; "Optimization is done using SGD with momentum (momentum set to 0.9). A weight decay of 1 × 10⁻⁴ is applied. The learning rate is initialized at 0.1 and is divided by 10 at epochs 80 and 120. The total number of epochs is 160." |
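The LLFC property quoted in the first row lends itself to a direct numerical check: interpolate the weights of two trained networks, capture each layer's features with forward hooks, and compare them against the linear interpolation of the two endpoint models' features. The sketch below is a minimal illustration of that idea, not the authors' released implementation (see the GitHub link above for that); `model_a`, `model_b`, `x`, and `layer_names` are hypothetical placeholders for two trained networks that satisfy LMC, an input batch, and the module names to probe.

```python
import copy

import torch

def interpolate_state_dicts(sd_a, sd_b, alpha):
    """Elementwise weight interpolation: alpha * sd_a + (1 - alpha) * sd_b."""
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

@torch.no_grad()
def llfc_cosine_similarities(model_a, model_b, x, alpha, layer_names):
    """For each named layer, compare the features of the weight-interpolated
    model against the linear interpolation of the endpoint models' features.
    Cosine similarity is scale-invariant, which matches LLFC's allowance for
    a per-layer positive scaling factor."""
    model_mid = copy.deepcopy(model_a)
    model_mid.load_state_dict(
        interpolate_state_dicts(model_a.state_dict(), model_b.state_dict(), alpha)
    )
    # Note: for networks with BatchNorm, the interpolated model's running
    # statistics may need to be recomputed before a meaningful comparison.

    feats = {"a": {}, "b": {}, "mid": {}}

    def make_hook(tag, name):
        def hook(module, inputs, output):
            feats[tag][name] = output.flatten(1)  # (batch, features)
        return hook

    handles = []
    for tag, model in (("a", model_a), ("b", model_b), ("mid", model_mid)):
        for name, module in model.named_modules():
            if name in layer_names:
                handles.append(module.register_forward_hook(make_hook(tag, name)))
        model.eval()
        model(x)  # forward pass populates feats[tag]
    for h in handles:
        h.remove()

    return {
        name: torch.cosine_similarity(
            feats["mid"][name],
            alpha * feats["a"][name] + (1 - alpha) * feats["b"][name],
            dim=1,
        ).mean().item()
        for name in layer_names
    }
```

Cosine similarities close to 1.0 at (nearly) every probed layer would indicate LLFC in the sense described in the quote, whereas LMC alone only constrains the final loss of the interpolated model.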
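For the "Experiment Setup" row, the quoted hyper-parameters translate directly into optimizer and scheduler configuration. A minimal PyTorch sketch, assuming a placeholder `model` (the paper does not publish this exact snippet):

```python
import torch
import torch.nn as nn

# Placeholder network; substitute the architecture under study.
model = nn.Linear(784, 10)

# Adam setting quoted above: learning rate 1.2e-4 (the batch size of 60 and
# the 30-epoch budget belong to the surrounding training loop, omitted here).
adam = torch.optim.Adam(model.parameters(), lr=1.2e-4)

# SGD setting quoted above: momentum 0.9, weight decay 1e-4, initial learning
# rate 0.1 divided by 10 at epochs 80 and 120, 160 epochs in total.
sgd = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(sgd, milestones=[80, 120], gamma=0.1)
# Call scheduler.step() once per epoch inside the training loop.
```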