Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Lifted Bregman Training of Neural Networks
Authors: Xiaoyu Wang, Martin Benning
JMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present several numerical results that demonstrate that these training approaches can be equally well or even better suited for the training of neural network-based classifiers and (denoising) autoencoders with sparse coding compared to more conventional training frameworks. In this section, we present numerical results for the example applications described in Section 5. |
| Researcher Affiliation | Academia | Xiaoyu Wang EMAIL Department of Applied Mathematics and Theoretical Physics University of Cambridge Cambridge, CB3 0WA, UK Martin Benning EMAIL School of Mathematical Sciences Queen Mary University of London London, E1 4NS, UK |
| Pseudocode | Yes | Algorithm 1 Implicit Stochastic Lifted Bregman Learning (Page 13) and Algorithm 2 Back-Propagation Algorithm (Appendix A, Page 23). |
| Open Source Code | Yes | Code related to this publication is made available through the University of Cambridge data repository at https://doi.org/10.17863/CAM.86729. |
| Open Datasets | Yes | We use the MNIST dataset (LeCun et al., 1998) and the Fashion-MNIST dataset (Xiao et al., 2017) for all numerical experiments. |
| Dataset Splits | Yes | For both the MNIST and Fashion-MNIST datasets, we use 60,000 images for training and 10,000 images for validation. ... We choose s = 1,000 images from the MNIST dataset at random and use it as our training dataset and use all 10,000 images from the validation dataset for validation. ... For the first scenario, we take 1,000 training images (referred to as Fashion-MNIST-1K) and validate on 10,000 images. ... For the second scenario, the training dataset consists of 10,000 images (referred to as Fashion-MNIST-10K) and the validation dataset consists of 10,000 images. |
| Hardware Specification | Yes | All results have been computed using PyTorch 3.7 on an Intel Xeon CPU E5-2630 v4. |
| Software Dependencies | Yes | All results have been computed using PyTorch 3.7 on an Intel Xeon CPU E5-2630 v4. |
| Experiment Setup | Yes | For the classification task, we follow the work of Zach and Estellers (2019) and consider a fully connected network with L = 4 layers with ReLU activation functions. More specifically, we use f(x^(l-1), Θ^l) = W^l x^(l-1) + b^l with W^l ∈ R^(m_(l+1) × m_l) and b^l ∈ R^(m_(l+1)), where m1 = 784, m2 = m3 = 64 and m4 = 10. ... In solving each mini-batch sub-problem (24), we run a maximum of N = 15 iterations with τ^k = 0.25 for all k. ... Out of the learning rates {1×10^-5, 5×10^-5, 1×10^-4, 5×10^-4, 1×10^-3, 5×10^-3, 1×10^-2, 5×10^-2, 1×10^-1, 5×10^-1}, we found that 1×10^-3 works best empirically... Network parameters in all experiments are identically initialised following Glorot and Bengio (2010). We choose batch size \|B_k\| = 100 and train the network for 100 epochs. |
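The architecture quoted in the Experiment Setup row can be sketched as follows. This is an illustrative NumPy reconstruction of the described 784-64-64-10 fully connected ReLU classifier with Glorot-uniform initialisation, not the authors' released code (which is available via the Cambridge data repository linked above); the helper names (`glorot_uniform`, `build_network`, `forward`) are ours, and the lifted Bregman training procedure itself is not reproduced here.

```python
import numpy as np

def glorot_uniform(rng, fan_in, fan_out):
    """Glorot & Bengio (2010) uniform initialisation: U(-a, a), a = sqrt(6/(fan_in+fan_out))."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

def build_network(rng, widths=(784, 64, 64, 10)):
    """Weight/bias pairs for the quoted widths m1=784, m2=m3=64, m4=10."""
    return [
        (glorot_uniform(rng, m_in, m_out), np.zeros(m_out))
        for m_in, m_out in zip(widths[:-1], widths[1:])
    ]

def forward(params, x):
    """Affine map W^l x + b^l per layer; ReLU on hidden layers, final layer linear."""
    for i, (W, b) in enumerate(params):
        x = W @ x + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU activation
    return x

rng = np.random.default_rng(0)
params = build_network(rng)
x = rng.standard_normal(784)   # one flattened 28x28 image
logits = forward(params, x)
print(logits.shape)            # (10,) — one score per class
```

The weight shapes follow the reconstructed convention W^l ∈ R^(m_(l+1) × m_l), i.e. (64, 784), (64, 64), (10, 64).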