Transferring Learning Trajectories of Neural Networks

Authors: Daiki Chijiwa

ICLR 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We empirically show that the transferred parameters achieve non-trivial accuracy before any direct training, and can be trained significantly faster than training from scratch. |
| Researcher Affiliation | Industry | Daiki Chijiwa, NTT Computer and Data Science Laboratories, NTT Corporation |
| Pseudocode | Yes | Algorithm 1: Gradient Matching along Trajectory (GMT) |
| Open Source Code | No | The paper does not provide a link to open-source code for the methodology, nor does it state that the code is available. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998) is a dataset of 28×28 images of hand-written digits, which is available under the terms of the CC BY-SA 3.0 license. |
| Dataset Splits | Yes | For all datasets, we split the officially given training dataset into 9:1 for training and validation. |
| Hardware Specification | Yes | Our computing environment is a machine with 12 Intel CPUs, 140 GB CPU memory and a single A100 GPU. |
| Software Dependencies | No | The paper mentions Python 3 and the PyTorch library but does not specify their version numbers. |
| Experiment Setup | Yes | We used E = 15, B = 128, α = 0.01, λ = 0.0, µ = 0.9. |
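The 9:1 train/validation split reported above is straightforward to reproduce. The helper below is a minimal, hypothetical sketch (not code from the paper) that shuffles indices of the official training set and carves off 10% for validation, using only the Python standard library:

```python
import random

def split_train_val(dataset_size, val_fraction=0.1, seed=0):
    """Split dataset indices into train/validation sets at the given ratio.

    The paper reports a 9:1 split of the officially given training set;
    this hypothetical helper reproduces that ratio over index lists.
    """
    rng = random.Random(seed)  # fixed seed for a reproducible split
    indices = list(range(dataset_size))
    rng.shuffle(indices)
    n_val = int(dataset_size * val_fraction)  # 10% held out for validation
    return indices[n_val:], indices[:n_val]

# Example: the MNIST training set has 60,000 images.
train_idx, val_idx = split_train_val(60000)
print(len(train_idx), len(val_idx))  # 54000 6000
```

The returned index lists can then be passed to any dataset wrapper (e.g. a PyTorch `Subset`) to materialize the two splits.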
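The experiment-setup row lists E = 15, B = 128, α = 0.01, λ = 0.0, µ = 0.9 without naming the symbols. Assuming the conventional reading (α = learning rate, µ = momentum coefficient, λ = weight decay, with E epochs and batch size B), a single SGD-with-momentum update under those values would look like this sketch:

```python
def sgd_momentum_step(params, grads, velocity,
                      lr=0.01, momentum=0.9, weight_decay=0.0):
    """One SGD-with-momentum update using the reported hyperparameters.

    Symbol mapping is an assumption, not stated in the summary:
    α → lr, µ → momentum, λ → weight_decay.
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        g = g + weight_decay * p   # λ = 0.0, so this term vanishes here
        v = momentum * v + g       # accumulate velocity with µ = 0.9
        new_params.append(p - lr * v)  # step with α = 0.01
        new_velocity.append(v)
    return new_params, new_velocity

# One step on a single scalar parameter:
p, v = sgd_momentum_step([1.0], [0.5], [0.0])
print(p, v)  # [0.995] [0.5]
```

With λ = 0.0 the weight-decay term is a no-op, matching the reported setting; the update then reduces to plain momentum SGD.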