On Plasticity, Invariance, and Mutually Frozen Weights in Sequential Task Learning
Authors: Julian Zilly, Alessandro Achille, Andrea Censi, Emilio Frazzoli
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We explore the ideas and connections of mutually frozen weights and invariance and their impact in a number of sequential learning settings across different datasets, network architectures, learning rates, and weight decay settings. Detailed specifications will be provided in the appendix. The following subsections are meant to give further evidence for the following statements: (1) mutually frozen weights occur and are different from weights that are zero (sparse) but not mutually frozen; (2) both sufficiently high learning rates and weight decay are essential for the occurrence of mutually frozen weights and for final test performance, as tested across two different architectures; (3) mutually frozen weights at the beginning of training can be harmful yet can be removed through a resetting intervention; (4) across a number of task changes, removing frozen weights is beneficial as long as sufficiently many retraining samples are available; (5) we provide an analysis summary relating frozen weights, invariance, and performance. (An illustrative detection-and-reset sketch follows the table.) |
| Researcher Affiliation | Academia | Julian Zilly (ETH Zürich, jzilly@ethz.ch); Alessandro Achille (Caltech, aachille@caltech.edu); Andrea Censi (ETH Zürich, acensi@ethz.ch); Emilio Frazzoli (ETH Zürich, efrazzoli@ethz.ch) |
| Pseudocode | No | The paper describes methods and interventions in text but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Yes, we include our code base, to allow others to easily reproduce our results. |
| Open Datasets | Yes | ResNet18 trained on CIFAR-10 images with weight decay λ = 1e-3 (Fig. 3). Table 2 reports task changes (columns: Task change, Model, Reset, Test acc., Test acc. Double/Triple FW), including CIFAR-10 [49] → CIFAR-100 [49] with ResNet18, MNIST [50] → Fashion-MNIST [51] with ResNet18, and ImageNet → Fashion-MNIST [51] with ResNet50. |
| Dataset Splits | No | The paper uses standard datasets like CIFAR-10 and ImageNet, which typically have predefined splits. However, it does not explicitly state the specific training, validation, and test split percentages or sample counts within the provided text, nor does it cite a specific work that defines the exact splits used. |
| Hardware Specification | No | The paper states 'We do include the use of resources but do not have a precise estimate of the total amount of hours trained.' in the checklist, but it does not provide specific details about the type of GPUs, CPUs, or other hardware models used for the experiments within the main body of the paper. |
| Software Dependencies | No | The paper mentions using 'code frameworks such as PyTorch' in the checklist but does not specify version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We explore the ideas and connections of mutually frozen weights and invariance and their impact in a number of sequential learning settings across different datasets, network architectures, learning rates, and weight decay settings. Detailed specifications will be provided in the appendix. For pretraining on blurred CIFAR-10 images and then switching to regular sharp images, we show the test loss for the ResNet-18 and All-CNN architectures across different initial learning rates and weight decay settings. (A minimal sketch of this blurred-to-sharp task change follows the table.) |
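
The "resetting intervention" quoted in the Research Type row is described only in prose. As a rough illustration of how such a check and reset could look in PyTorch, the sketch below flags weights whose accumulated gradient stays at zero over one pass through the data and re-initializes them. The criterion (zero accumulated gradient), the helper names `find_frozen_weights` and `reset_frozen_weights`, and the noise scale are assumptions for illustration, not the authors' exact procedure.

```python
import torch


def find_frozen_weights(model, loader, loss_fn, device, tol=0.0):
    """Flag weight entries whose accumulated gradient magnitude stays at
    (or below) `tol` over one pass through the data.

    Illustrative proxy only; the paper's exact criterion for "mutually
    frozen weights" may differ.
    """
    model.to(device).train()
    grad_accum = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        loss_fn(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                grad_accum[n] += p.grad.abs()
    return {n: g <= tol for n, g in grad_accum.items()}


@torch.no_grad()
def reset_frozen_weights(model, frozen_masks, std=0.01):
    """One possible 'resetting intervention': re-initialize flagged entries
    with small Gaussian noise (the paper may re-initialize differently)."""
    for n, p in model.named_parameters():
        mask = frozen_masks.get(n)
        if mask is not None and mask.any():
            p[mask] = torch.randn_like(p[mask]) * std
```

Usage would be along the lines of `masks = find_frozen_weights(model, train_loader, loss_fn, device)` followed by `reset_frozen_weights(model, masks)` before continuing training on the next task.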
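The Experiment Setup row mentions pretraining on blurred CIFAR-10 images and then switching to regular sharp images across different initial learning rates and weight decay settings. A minimal PyTorch sketch of that task change follows; the blur strength, optimizer, epoch count, batch size, and the particular (learning rate, weight decay) grid are illustrative placeholders, since the paper defers the exact specifications to its appendix.

```python
import torch
import torchvision
from torchvision import transforms

# Blur strength, epochs, batch size, and the (lr, weight-decay) grid below are
# illustrative placeholders; the paper's exact values are in its appendix.
blurred_tf = transforms.Compose([
    transforms.GaussianBlur(kernel_size=5, sigma=2.0),  # task 1: blurred CIFAR-10
    transforms.ToTensor(),
])
sharp_tf = transforms.ToTensor()  # task 2: regular sharp images

blurred_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=blurred_tf)
sharp_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=sharp_tf)
loss_fn = torch.nn.CrossEntropyLoss()
device = "cuda" if torch.cuda.is_available() else "cpu"


def train(model, dataset, lr, weight_decay, epochs=10, batch_size=128):
    loader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=weight_decay)
    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()


# Sweep over initial learning rates and weight-decay settings, as in the described setup.
for lr in (0.1, 0.01):
    for wd in (1e-3, 1e-4):
        model = torchvision.models.resnet18(num_classes=10)  # fresh network per setting
        train(model, blurred_set, lr, wd)  # pretrain on blurred images
        train(model, sharp_set, lr, wd)    # then switch to sharp images
```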