Understanding and Leveraging the Learning Phases of Neural Networks
Authors: Johannes Schneider, Mohit Prabhushankar
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically show the existence of three phases using common datasets and architectures such as ResNet and VGG: (i) near constant reconstruction loss, (ii) decrease, and (iii) increase. We also derive an empirically grounded data model and prove the existence of phases for single-layer networks. |
| Researcher Affiliation | Academia | Johannes Schneider (University of Liechtenstein, Vaduz, Liechtenstein), Mohit Prabhushankar (Georgia Institute of Technology, Atlanta, USA) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code/Proofs: https://github.com/JohnTailor/LearnPhase |
| Open Datasets | Yes | We used CIFAR-10/100 (Krizhevsky and Hinton 2009), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), and MNIST (Deng 2012), all scaled to 32x32, available under the MIT (first 3 datasets) and GNU 3.0 license. (A hedged data-loading sketch follows the table.) |
| Dataset Splits | No | The paper mentions using CIFAR-10/100, Fashion MNIST, and MNIST, but does not explicitly provide specific train/validation/test dataset split percentages, sample counts, or direct references to predefined splits used for their experiments. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions optimizers like the Adam optimizer and activation functions like ReLU, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or specific library versions). |
| Experiment Setup | Yes | We used a fixed learning rate of 0.002 and stochastic gradient descent with batches of size 128 training for 256 epochs. [...] For each computation of the metrics, we trained the decoder for 30 epochs using the Adam optimizer with a learning rate of 0.0003. (A hedged training-configuration sketch follows the table.) |
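
The following is a minimal sketch of the dataset preparation quoted above, assuming PyTorch/torchvision since the paper does not name its framework; the helper `load_train_set` and the resize-to-32x32 transform are illustrative, not the authors' code.

```python
# Hedged sketch of the dataset setup described in the paper:
# CIFAR-10/100, Fashion-MNIST, and MNIST, all scaled to 32x32.
# torchvision is an assumption; the paper does not state its framework.
import torchvision
import torchvision.transforms as T

def load_train_set(name: str):
    """Return the training split of one of the four datasets, resized to 32x32."""
    transform = T.Compose([T.Resize((32, 32)), T.ToTensor()])
    datasets = {
        "cifar10": torchvision.datasets.CIFAR10,
        "cifar100": torchvision.datasets.CIFAR100,
        "fashion_mnist": torchvision.datasets.FashionMNIST,
        "mnist": torchvision.datasets.MNIST,
    }
    return datasets[name](root="./data", train=True, download=True, transform=transform)
```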
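Likewise, a hedged sketch of the reported training configuration (SGD with a fixed learning rate of 0.002, batch size 128, 256 epochs; decoder trained for 30 epochs with Adam at 0.0003). The `model`, `decoder`, and loss choices below are placeholders, not the authors' implementation.

```python
# Minimal sketch of the reported hyperparameters, assuming PyTorch.
import torch
from torch.utils.data import DataLoader

def train_classifier(model, train_set, device="cuda"):
    # Fixed learning rate of 0.002, SGD, batch size 128, 256 epochs (as reported).
    loader = DataLoader(train_set, batch_size=128, shuffle=True)
    opt = torch.optim.SGD(model.parameters(), lr=0.002)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(256):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

def train_decoder(decoder, features, targets):
    # Decoder trained for 30 epochs with Adam at lr 0.0003 for each metric computation;
    # for a reconstruction loss, `targets` would be the original inputs.
    opt = torch.optim.Adam(decoder.parameters(), lr=0.0003)
    loss_fn = torch.nn.MSELoss()
    for _ in range(30):
        opt.zero_grad()
        loss_fn(decoder(features), targets).backward()
        opt.step()
```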