Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations
Authors: Alexander Immer, Tycho van der Ouderaa, Gunnar Rätsch, Vincent Fortuin, Mark van der Wilk
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally that our method can differentiably learn useful distributions over affine invariances, which are common data augmentations, on various versions of the image classification datasets MNIST, Fashion MNIST, and CIFAR-10, without validation data. From Section 5 (Experiments): We evaluate our method that learns invariances using Laplace approximations (LILA) by optimising affine invariances on different MNIST (LeCun and Cortes, 2010), Fashion MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al., 2009) classification tasks. (An illustrative sketch of a differentiable affine augmentation appears after the table.) |
| Researcher Affiliation | Academia | (1) Department of Computer Science, ETH Zurich, Switzerland; (2) Max Planck Institute for Intelligent Systems, Tübingen, Germany; (3) Department of Computing, Imperial College London, UK; (4) Department of Engineering, University of Cambridge, UK |
| Pseudocode | No | The paper describes the approach using textual explanations and mathematical formulations, along with diagrams, but does not include a dedicated 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | The code is available at https://github.com/tychovdo/lila |
| Open Datasets | Yes | MNIST (LeCun and Cortes, 2010), Fashion MNIST (Xiao et al., 2017), and CIFAR-10 (Krizhevsky et al., 2009) classification tasks. |
| Dataset Splits | Yes | We use the standard splits of MNIST, Fashion MNIST and CIFAR-10. For the data efficiency experiments in Sec. 5.3 we use random subsets of the training data. For these experiments, we create 3 random subsets of sizes [1000, 2000, 5000, 10000, 20000, 30000, 40000, 50000] for CIFAR-10 and [1000, 2000, 5000, 10000, 20000, 30000, 40000, 50000, 60000] for MNIST and F-MNIST. (A subset-construction sketch follows the table.) |
| Hardware Specification | No | The paper answers the checklist question 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?' with '[No]' in item 3(d) of its checklist. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba, 2015) with a learning rate of 10^-3 for all models. We train for 500 epochs with a batch size of 256. For CIFAR-10 experiments, we additionally use an early stopping callback on the marginal likelihood that terminates training after 20 epochs of no improvement, and reduce the learning rate by a factor of 10 after 10 epochs of no improvement. (A configuration sketch follows the table.) |
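
The paper's central idea, per the quotes above, is to learn distributions over affine data augmentations differentiably. As a rough illustration only, the sketch below shows one way a differentiably parameterised affine augmentation layer could look in PyTorch, with learnable rotation and translation ranges and reparameterised sampling. It is not the authors' LILA implementation (see https://github.com/tychovdo/lila for that), and it omits the Laplace-approximated marginal likelihood objective that actually drives the invariance learning.

```python
# Illustrative sketch (not the authors' code): a differentiable affine
# augmentation whose rotation/translation ranges are learnable parameters.
# Samples are drawn with the reparameterisation trick, so gradients flow
# from downstream losses back to the invariance parameters.
import torch
import torch.nn.functional as F

class LearnableAffineAugmentation(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Unconstrained parameters mapped to positive ranges via softplus.
        self.rot_logit = torch.nn.Parameter(torch.tensor(-3.0))
        self.trans_logit = torch.nn.Parameter(torch.tensor(-3.0))

    def forward(self, x, n_samples=4):
        n, c, h, w = x.shape
        rot_range = F.softplus(self.rot_logit)      # max rotation (radians)
        trans_range = F.softplus(self.trans_logit)  # max translation (fraction of image)
        copies = []
        for _ in range(n_samples):
            u = 2 * torch.rand(n, device=x.device) - 1          # U(-1, 1)
            angle = u * rot_range
            t = (2 * torch.rand(n, 2, device=x.device) - 1) * trans_range
            cos, sin = torch.cos(angle), torch.sin(angle)
            theta = torch.stack([
                torch.stack([cos, -sin, t[:, 0]], dim=-1),
                torch.stack([sin, cos, t[:, 1]], dim=-1),
            ], dim=1)                                            # (n, 2, 3) affine matrices
            grid = F.affine_grid(theta, x.shape, align_corners=False)
            copies.append(F.grid_sample(x, grid, align_corners=False))
        return torch.cat(copies)                                 # (n_samples * n, c, h, w)
```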
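For the data-efficiency splits quoted in the Dataset Splits row, a minimal sketch of how such random training subsets could be constructed is given below. The helper name `make_random_subsets` and the fixed seed are illustrative assumptions, not details taken from the paper or its repository.

```python
# Hypothetical sketch of the subset construction described in the table.
import torch
from torch.utils.data import Subset
from torchvision import datasets, transforms

def make_random_subsets(dataset, sizes, n_repeats=3, seed=0):
    """Draw `n_repeats` independent random subsets of each size in `sizes`."""
    generator = torch.Generator().manual_seed(seed)
    subsets = {}
    for size in sizes:
        subsets[size] = [
            Subset(dataset,
                   torch.randperm(len(dataset), generator=generator)[:size].tolist())
            for _ in range(n_repeats)
        ]
    return subsets

train_set = datasets.CIFAR10("data", train=True, download=True,
                             transform=transforms.ToTensor())
cifar_sizes = [1000, 2000, 5000, 10000, 20000, 30000, 40000, 50000]
cifar_subsets = make_random_subsets(train_set, cifar_sizes)  # 3 subsets per size
```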
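The Experiment Setup row translates fairly directly into an optimiser configuration. The sketch below wires up the reported settings (Adam, learning rate 1e-3, 500 epochs, batch size 256, and, for CIFAR-10, learning-rate reduction and early stopping driven by a marginal-likelihood estimate). Here `model`, `log_marginal_likelihood`, and the cross-entropy inner loss are placeholders; the paper's actual objective is its Laplace-approximated marginal likelihood and is not reproduced here.

```python
# Minimal sketch of the reported training configuration, under the
# assumptions stated above (placeholder model, loss, and marglik estimate).
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torch.utils.data import DataLoader

def train(model, train_set, log_marginal_likelihood, epochs=500):
    loader = DataLoader(train_set, batch_size=256, shuffle=True)
    optimizer = Adam(model.parameters(), lr=1e-3)
    # Reduce the learning rate by a factor of 10 after 10 epochs without improvement.
    scheduler = ReduceLROnPlateau(optimizer, mode="max", factor=0.1, patience=10)
    best, epochs_without_improvement = -float("inf"), 0
    for epoch in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(x), y)  # stand-in loss
            loss.backward()
            optimizer.step()
        marglik = log_marginal_likelihood(model)  # placeholder estimate to monitor
        scheduler.step(marglik)
        # Early stopping: terminate after 20 epochs without improvement (CIFAR-10 only).
        if marglik > best:
            best, epochs_without_improvement = marglik, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= 20:
                break
```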