The Description Length of Deep Learning models
Authors: Léonard Blier, Yann Ollivier
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally the ability of deep neural networks to compress the training data even when accounting for parameter encoding. In this work we explicitly measure how much current deep models actually compress data. |
| Researcher Affiliation | Collaboration | Léonard Blier École Normale Supérieure Paris, France leonard.blier@normalesup.org Yann Ollivier Facebook Artificial Intelligence Research Paris, France yol@fb.com |
| Pseudocode | Yes | Algorithm 2 in Appendix D |
| Open Source Code | Yes | C. Tallec and L. Blier. Pyvarinf: Variational Inference for PyTorch, 2018. URL https://github.com/ctallec/pyvarinf. (L. Blier is an author of the paper). |
| Open Datasets | Yes | Our running example will be image classification on the MNIST (LeCun et al., 1998) and CIFAR10 (Krizhevsky, 2009) datasets. |
| Dataset Splits | Yes | Table 1: Compression bounds via Deep Learning. Compression bounds given by different codes on two datasets, MNIST and CIFAR10. ... The test accuracy of a model is the accuracy of its predictions on the test set. On MNIST, this provides a codelength of the labels (knowing the inputs) of 24.1 kbits... The corresponding model achieved 95.5% accuracy on the test set. |
| Hardware Specification | No | The paper does not specify any details about the hardware used for running the experiments (e.g., specific GPU or CPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper mentions 'Pyvarinf: Variational Inference for PyTorch' and refers to PyTorch implicitly, but it does not specify concrete version numbers for PyTorch or any other software libraries or dependencies. |
| Experiment Setup | Yes | Neural networks that give the best variational compression bounds appear to be smaller than networks trained the usual way. We tested various fully connected networks and convolutional networks (Appendix C): the models that gave the best variational compression bounds were small LeNet-like networks. On CIFAR, we tested a simple multilayer perceptron, a shallow network, a small convolutional network, and a VGG convolutional network (Simonyan and Zisserman, 2014) first without data augmentation or batch normalization (VGGa) (Ioffe and Szegedy, 2015), then with both of them (VGGb) (Appendix D). |
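
The codelength figures quoted in the table (e.g. 24.1 kbits for the MNIST labels knowing the inputs) are sums of negative log-probabilities of the labels under a trained model, to be compared against the uniform baseline of log2(10) bits per label. The following is a minimal sketch, not the authors' code, of how such a quantity could be estimated for a trained PyTorch classifier; the function name and data-loader variables are placeholders introduced here for illustration.

```python
# Hypothetical sketch: estimating the codelength (in bits) of the labels
# given the inputs, for a trained PyTorch classifier. `model` and `loader`
# are assumed placeholders, not objects from the paper's code.
import math
import torch
import torch.nn.functional as F

def label_codelength_bits(model, loader, device="cpu"):
    """Return (model codelength, uniform-code baseline) in bits."""
    model.eval()
    total_nats = 0.0
    n_labels = 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            logits = model(x)
            # Cross-entropy in nats, summed over the batch: -log p(y | x).
            total_nats += F.cross_entropy(logits, y, reduction="sum").item()
            n_labels += y.numel()
    total_bits = total_nats / math.log(2)
    uniform_bits = n_labels * math.log2(10)  # uniform code over 10 classes
    return total_bits, uniform_bits
```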
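
The "variational compression bounds" mentioned in the Experiment Setup row are codelengths of the form E_q[-log p(y | x, w)] + KL(q || prior), where q is a variational posterior over the network weights, so that the cost of encoding the parameters is counted alongside the data codelength. Below is a minimal mean-field Gaussian sketch of that bound for a single linear layer. It is an illustration under assumptions (standard normal prior, a single Monte Carlo sample of the weights), not the pyvarinf implementation used in the paper.

```python
# Minimal mean-field Gaussian sketch (an assumption for illustration, not
# pyvarinf's API nor the paper's exact setup): the variational codelength
# bound adds the parameter encoding cost KL(q || prior) to the expected
# data codelength E_q[-log p(y | x, w)].
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalLinear(nn.Module):
    def __init__(self, d_in, d_out):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(d_out, d_in))
        self.log_sigma = nn.Parameter(torch.full((d_out, d_in), -3.0))

    def forward(self, x):
        # One Monte Carlo sample of the weights (reparameterization trick).
        eps = torch.randn_like(self.mu)
        w = self.mu + self.log_sigma.exp() * eps
        return x @ w.t()

    def kl_to_standard_normal(self):
        # KL(N(mu, sigma^2) || N(0, 1)) summed over all weights, in nats.
        var = (2 * self.log_sigma).exp()
        return 0.5 * (var + self.mu ** 2 - 1.0 - 2 * self.log_sigma).sum()

def variational_codelength_bits(layer, x, y):
    """Upper bound (bits) on the codelength of labels y given inputs x."""
    nll_nats = F.cross_entropy(layer(x), y, reduction="sum")
    bound_nats = nll_nats + layer.kl_to_standard_normal()
    return bound_nats.item() / math.log(2)
```

In practice the bound is minimized by gradient descent on both the means and the log standard deviations; the paper reports that smaller, LeNet-like architectures gave the best such bounds.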