Bregman Neural Networks
Authors: Jordan Frecon, Gilles Gasso, Massimiliano Pontil, Saverio Salzo
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments suggest that the proposed Bregman variant benefits from better learning properties and more robust prediction performance. In this section, we compare both standard multilayer perceptrons and residual networks against their proposed Bregman variants. |
| Researcher Affiliation | Academia | (1) Normandie Univ, INSA Rouen, UNIROUEN, UNIHAVRE, LITIS, Saint-Etienne-du-Rouvray, France; (2) Computational Statistics and Machine Learning, IIT, Genova, Italy; (3) Department of Computer Science, UCL, London, United Kingdom. |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. The paper provides mathematical formulations and equations for the proposed layers and problems. |
| Open Source Code | Yes | A Pytorch package is publicly available: http://github.com/JordanFrecon/BregmaNet |
| Open Datasets | Yes | The two-spiral dataset is a widely used benchmark for binary classification (Chalup & Wiklendt, 2007); We now turn to the popular MNIST dataset; We now conduct experiments on the CIFAR-10 dataset which consists of 50k training images and 10k testing images in 10 classes (Krizhevsky & Hinton, 2009). (See the loading sketch after the table.) |
| Dataset Splits | Yes | The original un-deformed training dataset is then used as a validation set to perform early stopping. Both validation and test sets are flattened and rescaled to the activation range. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running experiments are provided. The paper only mentions that a 'Pytorch package is publicly available'. |
| Software Dependencies | No | The paper mentions the use of 'Pytorch package' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The batch size, learning rate and number of epochs are set to 16, 10^-2 and 500, respectively. Concerning the optimization, we use the same setting as in the original ResNet paper (He et al., 2016), namely a learning rate of 10^-1, weight decay of 10^-4 and momentum of 0.9. In addition, the optimization is run over 182 epochs, with the learning rate decreased by a factor of 10 at the 91st and 136th epochs. All MLPs are trained using stochastic gradient descent with batch size 100 and a learning rate linearly decreasing by a factor of 10^3 over 10^3 epochs. The initial learning rate is cross-validated over {10^-3, 10^-2, 10^-1}. (A PyTorch sketch of the ResNet schedule is given after the table.) |
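
The Experiment Setup row quotes concrete hyperparameters for the CIFAR-10 ResNet comparison. Below is a minimal PyTorch sketch of that optimization schedule, assuming a placeholder `model` and standard `torch.optim` components; the paper's actual Bregman architectures come from the authors' BregmaNet package, whose API is not shown here.

```python
import torch

# Placeholder model; in the paper this would be a (Bregman) ResNet from the
# authors' BregmaNet package, not shown in this excerpt.
model = torch.nn.Linear(3 * 32 * 32, 10)

# Setting quoted from the paper (following He et al., 2016):
# learning rate 1e-1, weight decay 1e-4, momentum 0.9, 182 epochs,
# learning rate divided by 10 at epochs 91 and 136.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-1,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[91, 136], gamma=0.1)

for epoch in range(182):
    # ... one pass over the training set, calling optimizer.step() per batch ...
    scheduler.step()  # learning-rate decay applied once per epoch
```

The same pattern with `lr=1e-2`, batch size 16 and 500 epochs would cover the smaller setting quoted in the same row.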
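
The Open Datasets and Dataset Splits rows refer to MNIST and CIFAR-10. The sketch below shows how these public datasets are typically loaded with torchvision; the paper's exact preprocessing (e.g. the rescaling to the activation range and the deformation-based validation split) is not specified in the excerpt, so the transform here is only an assumption.

```python
import torchvision
import torchvision.transforms as T

# ToTensor rescales pixels to [0, 1]; the paper's rescaling to the
# activation range of each model is not detailed here.
transform = T.ToTensor()

# CIFAR-10: 50k training images and 10k test images in 10 classes,
# matching the counts quoted in the table above.
train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False,
                                        download=True, transform=transform)

# MNIST is available through the same interface.
mnist_train = torchvision.datasets.MNIST(root="./data", train=True,
                                         download=True, transform=transform)
```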