Understanding and Leveraging the Learning Phases of Neural Networks

Authors: Johannes Schneider, Mohit Prabhushankar

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically show the existence of three phases using common datasets and architectures such as ResNet and VGG: (i) near constant reconstruction loss, (ii) decrease, and (iii) increase. We also derive an empirically grounded data model and prove the existence of phases for single-layer networks." (A sketch of the reconstruction-loss probe behind these phases follows the table.)
Researcher Affiliation | Academia | Johannes Schneider (University of Liechtenstein, Vaduz, Liechtenstein); Mohit Prabhushankar (Georgia Institute of Technology, Atlanta, USA)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Code/Proofs: https://github.com/JohnTailor/LearnPhase"
Open Datasets | Yes | "We used CIFAR-10/100 (Krizhevsky and Hinton 2009), Fashion MNIST (Xiao, Rasul, and Vollgraf 2017), and MNIST (Deng 2012), all scaled to 32x32, available under the MIT (first 3 datasets) and GNU 3.0 license."
Dataset Splits | No | The paper uses CIFAR-10/100, Fashion MNIST, and MNIST, but does not state train/validation/test split percentages, sample counts, or references to predefined splits for its experiments.
Hardware Specification | No | The paper does not report the hardware used for the experiments (GPU model, CPU type, or memory).
Software Dependencies | No | The paper mentions the Adam optimizer and ReLU activations but lists no software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | "We used a fixed learning rate of 0.002 and stochastic gradient descent with batches of size 128 training for 256 epochs. [...] For each computation of the metrics, we trained the decoder for 30 epochs using the Adam optimizer with a learning rate of 0.0003." (A hedged code sketch of this setup follows the table.)
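
The quoted hyperparameters map onto a conventional PyTorch training loop. The sketch below is a minimal reconstruction of that setup under stated assumptions: the model (a torchvision ResNet-18 on 32x32 inputs), the CIFAR-10 pipeline, and the cross-entropy loss are illustrative choices, not the authors' exact code (see the linked repository for that); only the learning rate, optimizer, batch size, and epoch count are taken from the paper.

```python
# Minimal sketch of the reported training setup; only the hyperparameters
# quoted in the table above come from the paper, everything else is assumed.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# Datasets are reported as scaled to 32x32; CIFAR-10 is already 32x32.
transform = T.Compose([T.Resize(32), T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128,  # batch size from the paper
                                           shuffle=True, num_workers=2)

# Model choice is an assumption; the paper evaluates ResNet and VGG variants.
model = torchvision.models.resnet18(num_classes=10).to(device)

criterion = nn.CrossEntropyLoss()
# Fixed learning rate of 0.002 with stochastic gradient descent, as quoted.
optimizer = torch.optim.SGD(model.parameters(), lr=0.002)

for epoch in range(256):  # 256 training epochs, as quoted
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```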
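
The three phases quoted under Research Type refer to the reconstruction loss of a decoder trained on frozen intermediate representations at different points of classifier training. The paper reports training such a decoder for 30 epochs with Adam at a learning rate of 0.0003; the probe sketched below follows that recipe, but the `encoder`/`decoder` interface, the MSE objective, and the choice of which layer to probe are assumptions for illustration.

```python
# Hedged sketch of a reconstruction-loss probe: train a small decoder on a
# frozen intermediate representation and report the resulting loss. Only the
# decoder optimizer (Adam, lr 0.0003) and its 30 epochs come from the paper.
import torch
import torch.nn as nn

def reconstruction_loss(encoder, decoder, loader, device="cpu", epochs=30):
    """Train `decoder` to reconstruct inputs from `encoder(x)` and return the
    mean reconstruction loss afterwards. `encoder` stays frozen throughout."""
    encoder.eval()
    opt = torch.optim.Adam(decoder.parameters(), lr=0.0003)
    mse = nn.MSELoss()
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            with torch.no_grad():
                z = encoder(x)            # frozen features at this checkpoint
            loss = mse(decoder(z), x)     # reconstruct the input image
            opt.zero_grad()
            loss.backward()
            opt.step()
    # Evaluate once after decoder training to get the reported metric.
    total, n = 0.0, 0
    with torch.no_grad():
        for x, _ in loader:
            x = x.to(device)
            total += mse(decoder(encoder(x)), x).item() * x.size(0)
            n += x.size(0)
    return total / n
```

Tracking this value at successive checkpoints of classifier training is what exposes the three phases reported in the paper: roughly constant reconstruction loss early on, then a decrease, then an increase.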