Going Beyond Linear Mode Connectivity: The Layerwise Linear Feature Connectivity
Authors: Zhanpeng Zhou, Yongyi Yang, Xiaojiang Yang, Junchi Yan, Wei Hu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide comprehensive empirical evidence for LLFC across a wide range of settings, demonstrating that whenever two trained networks satisfy LMC (via either spawning or permutation methods), they also satisfy LLFC in nearly all the layers. |
| Researcher Affiliation | Academia | ¹ Dept. of Computer Science and Engineering & MoE Key Lab of AI, Shanghai Jiao Tong University; ² Dept. of Electrical Engineering & Computer Science, University of Michigan |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We released our source code at https://github.com/zzp1012/LLFC. |
| Open Datasets | Yes | we perform our experiments on commonly used image classification datasets MNIST [18], CIFAR10 [15], and Tiny-ImageNet [17] |
| Dataset Splits | No | The paper mentions using a "training set" and "test set" for evaluations but does not provide specific details on a validation set or its split. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions optimization algorithms like Adam and SGD but does not provide specific version numbers for any software dependencies (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The detailed settings and hyper-parameters are also described in Appendix B.1. Examples from B.1: "Optimization is done with the Adam algorithm and a learning rate of 1.2 × 10⁻⁴. The batch size is set to 60 and the total number of training epochs is 30."; "Optimization is done using SGD with momentum (momentum set to 0.9). A weight decay of 1 × 10⁻⁴ is applied. The learning rate is initialized at 0.1 and is dropped by 10 times at 80 and 120 epochs. The total number of epochs is 160." (a hedged sketch of these settings follows the table) |
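
The Experiment Setup row quotes two training recipes from Appendix B.1. As a rough, non-authoritative illustration, the PyTorch sketch below encodes those hyper-parameters; the `model` here is a hypothetical stand-in, since the summary does not pair each recipe with a specific architecture, and the released repository linked above remains the authoritative source.

```python
import torch
import torch.nn as nn
from torch.optim.lr_scheduler import MultiStepLR

# Hypothetical placeholder network; the summary above does not state which
# architecture goes with which recipe, so this is only a stand-in.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 512), nn.ReLU(), nn.Linear(512, 10))

# Recipe 1 (as quoted from Appendix B.1): Adam, lr 1.2e-4, batch size 60, 30 epochs.
adam_opt = torch.optim.Adam(model.parameters(), lr=1.2e-4)
adam_epochs, adam_batch_size = 30, 60

# Recipe 2 (as quoted from Appendix B.1): SGD with momentum 0.9, weight decay 1e-4,
# initial lr 0.1 dropped by a factor of 10 at epochs 80 and 120, 160 epochs total.
sgd_opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = MultiStepLR(sgd_opt, milestones=[80, 120], gamma=0.1)
sgd_epochs = 160
```

A typical training loop would call `scheduler.step()` once per epoch so that the learning-rate drops land at epochs 80 and 120, as described in the quoted setup.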
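
The Research Type row refers to LLFC, which (roughly) says that at each layer the feature map of the weight-interpolated network is, up to a positive scaling, the linear interpolation of the two endpoint networks' feature maps. Below is a minimal sketch of how one might probe this with a cosine-similarity check, assuming two trained PyTorch models of identical architecture; the helpers (`interpolate_params`, `layer_features`, `llfc_cosine`) are illustrative inventions, not the authors' released evaluation code.

```python
import copy
import torch
import torch.nn.functional as F

def interpolate_params(model_a, model_b, alpha):
    """Copy of model_a whose weights are alpha * theta_A + (1 - alpha) * theta_B.
    Note: non-float buffers (e.g., BatchNorm counters) are naively interpolated here."""
    model = copy.deepcopy(model_a)
    sd_a, sd_b = model_a.state_dict(), model_b.state_dict()
    model.load_state_dict({k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a})
    return model

def layer_features(model, x, layer_name):
    """Capture the output of a named submodule with a forward hook."""
    captured = {}
    hook = dict(model.named_modules())[layer_name].register_forward_hook(
        lambda mod, inp, out: captured.update(out=out.detach()))
    model.eval()
    with torch.no_grad():
        model(x)
    hook.remove()
    return captured["out"]

def llfc_cosine(model_a, model_b, x, layer_name, alpha=0.5):
    """Cosine similarity between the interpolated model's layer features and the
    interpolation of the endpoint models' features (near 1.0 when LLFC holds
    up to a positive scaling)."""
    feats_mid = layer_features(interpolate_params(model_a, model_b, alpha), x, layer_name)
    feats_mix = (alpha * layer_features(model_a, x, layer_name)
                 + (1 - alpha) * layer_features(model_b, x, layer_name))
    return F.cosine_similarity(feats_mid.flatten(1), feats_mix.flatten(1)).mean()
```

In practice, independently trained networks are first aligned (e.g., via the permutation methods mentioned in the Research Type row) and normalization statistics of the interpolated model are typically recomputed; the repository linked above is the authoritative reference for the paper's actual evaluation.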