Deep Linear Discriminant Analysis

Authors: Matthias Dorfer, Rainer Kelz, Gerhard Widmer

ICLR 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | For evaluation we test our approach on three different benchmark datasets (MNIST, CIFAR-10 and STL-10). Deep LDA produces competitive results on MNIST and CIFAR-10 and outperforms a network trained with categorical cross entropy (having the same architecture) on a supervised setting of STL-10. |
| Researcher Affiliation | Academia | Matthias Dorfer, Rainer Kelz & Gerhard Widmer, Department of Computational Perception, Johannes Kepler University Linz, Linz, 4040, AUT. {matthias.dorfer, rainer.kelz, gerhard.widmer}@jku.at |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found. Appendix A provides mathematical derivations but not algorithmic steps. |
| Open Source Code | No | The paper does not provide concrete access to source code for the described methodology. It mentions using 'Theano (Bergstra et al., 2010) and Lasagne (Dieleman et al., 2015)' as deep learning frameworks, but does not state that the authors' own implementation is released, nor does it provide a link. |
| Open Datasets | Yes | For evaluation we test our approach on three different benchmark datasets (MNIST, CIFAR-10 and STL-10). |
| Dataset Splits | Yes | The MNIST dataset consists of 28×28 gray scale images of handwritten digits ranging from 0 to 9. The dataset is structured into 50000 train samples, 10000 validation samples and 10000 test samples. |
| Hardware Specification | Yes | The models are trained on an NVIDIA Tesla K40 with 12GB of GPU memory. |
| Software Dependencies | No | The paper mentions the use of 'Theano (Bergstra et al., 2010) and Lasagne (Dieleman et al., 2015)' as deep learning frameworks, but does not provide version numbers for these dependencies, which reproducibility would require. |
| Experiment Setup | Yes | The initial learning rate is set to 0.1 and the momentum is fixed at 0.9 for all our models. The learning rate is then halved every 25 epochs for CIFAR-10 and STL-10 and every 10 epochs for MNIST. For further regularization we add weight decay with a weighting of 0.0001 on all trainable parameters of the models. The between-class covariance matrix regularization weight λ (see Section 3.3) is set to 0.001 and the ϵ-offset for Deep LDA to 1. The mini-batch sizes for Deep LDA were 1000 for MNIST, 1000 for CIFAR-10, and 200 for STL-10. For CCE training, a batch size of 128 is used for all datasets. |
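
For reference, the schedule and hyperparameters reported in the Experiment Setup row can be captured in a short, framework-agnostic sketch. Only the numeric values come from the paper as quoted above; the names (`HPARAMS`, `lr_at_epoch`) and the overall structure are illustrative assumptions, not the authors' implementation (which, per the Open Source Code row, is not available).

```python
# Minimal sketch of the reported training hyperparameters.
# Names and structure are assumed for illustration; only the values are from the paper.
HPARAMS = {
    "initial_lr": 0.1,            # initial learning rate for all models
    "momentum": 0.9,              # momentum, fixed for all models
    "weight_decay": 1e-4,         # weight decay on all trainable parameters
    "lda_lambda": 1e-3,           # covariance regularization weight lambda (Section 3.3 of the paper)
    "lda_epsilon": 1.0,           # epsilon-offset for the Deep LDA objective
    "lr_halving_epochs": {"mnist": 10, "cifar10": 25, "stl10": 25},
    "batch_size_deeplda": {"mnist": 1000, "cifar10": 1000, "stl10": 200},
    "batch_size_cce": 128,        # cross-entropy baseline, all datasets
}

def lr_at_epoch(epoch, dataset, hp=HPARAMS):
    """Learning rate after halving every `lr_halving_epochs[dataset]` epochs."""
    halvings = epoch // hp["lr_halving_epochs"][dataset]
    return hp["initial_lr"] * (0.5 ** halvings)

if __name__ == "__main__":
    # CIFAR-10: 0.1 for epochs 0-24, 0.05 for epochs 25-49, and so on.
    for epoch in (0, 24, 25, 49, 50):
        print(epoch, lr_at_epoch(epoch, "cifar10"))
```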