Learning Deep Parsimonious Representations

Authors: Renjie Liao, Alex Schwing, Richard Zemel, Raquel Urtasun

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of our approach on the tasks of unsupervised learning, classification, fine-grained categorization, and zero-shot learning. We demonstrate the generalization performance of our proposed method in several settings: autoencoders trained on the MNIST dataset [23], classification on CIFAR10 and CIFAR100 [20], as well as fine-grained classification and zero-shot learning on the CUB-200-2011 dataset [34]. We show that our approach leads to significant wins in all these scenarios.
Researcher Affiliation | Academia | University of Toronto, University of Illinois at Urbana-Champaign, Canadian Institute for Advanced Research
Pseudocode | Yes | Algorithm 1: Learning Parsimonious Representations (a rough sketch of the clustering regularizer behind this algorithm appears after the table)
Open Source Code | Yes | Our implementation based on TensorFlow [9] is publicly available at https://github.com/lrjconan/deep_parsimonious
Open Datasets | Yes | We demonstrate the generalization performance of our proposed method in several settings: autoencoders trained on the MNIST dataset [23], classification on CIFAR10 and CIFAR100 [20], as well as fine-grained classification and zero-shot learning on the CUB-200-2011 dataset [34].
Dataset Splits | Yes | The standard training-test split is used. We use the standard split on both datasets. We follow the dataset split provided by [34] and the common practice of cropping the image using the ground-truth bounding box annotation of the birds [8, 36]. We follow the setting in [1, 2] and use the same split where 100, 50 and 50 classes are used for training, validation and testing (unseen classes).
Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instances used for the experiments.
Software Dependencies | No | The paper mentions using 'TensorFlow [9]' but does not provide a specific version number for it or any other software dependencies.
Experiment Setup | Yes | The number of clusters and the regularization weight λ of all layers are set to 100 and 1.0e-2 respectively. Our exact parameter choices are detailed in the Appendix. Specifically, the number of cluster centers is set to 100 for all layers for both CIFAR10 and CIFAR100. λ is set to 1.0e-3 and 1.0e-2 for the first two convolutional and the remaining layers respectively in CIFAR10; for CIFAR100, λ is set to 10 and 1 for the first convolutional layer and the remaining layers respectively. The smoothness parameter α is set to 0.9 and 0.95 for CIFAR10 and CIFAR100 respectively. Based on cross-validation, the number of clusters is set to 200 for all layers. For convolutional layers, we set λ to 1.0e-5 for the first (bottom) two and use 1.0e-4 for the remaining ones. For fully connected layers, we set λ to 1.0e-3 and α is equal to 0.5. (These settings are collected into a configuration sketch after the table.)
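
The Pseudocode row refers to Algorithm 1, which this summary does not reproduce. As a rough illustration only, the sketch below implements the core idea suggested by the quoted setup: a k-means-style clustering penalty on a layer's activations, weighted by λ, with an online exponential-moving-average update of the cluster centers controlled by the smoothness parameter α. All function and variable names are hypothetical, and the details may differ from the authors' TensorFlow implementation.

```python
import numpy as np

def parsimonious_penalty(activations, centers, lam, alpha):
    """Sketch of a k-means-style clustering regularizer on one layer.

    activations: (batch, dim) hidden representations of the layer
    centers:     (k, dim) current cluster centers for that layer
    lam:         regularization weight (lambda in the quoted setup)
    alpha:       smoothness of the online center update (EMA coefficient)

    Returns the penalty to add to the task loss and the updated centers.
    Hypothetical names; this is an illustration, not the authors' code.
    """
    # Squared Euclidean distance from every activation to every center.
    dists = ((activations[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)  # hard nearest-center assignment

    # Penalty: distance of each activation to its assigned center.
    penalty = lam * dists[np.arange(len(assign)), assign].mean()

    # Online center update: move each center toward the mean of its members,
    # smoothed by alpha (alpha close to 1 keeps centers nearly fixed per step).
    new_centers = centers.copy()
    for k in range(centers.shape[0]):
        members = activations[assign == k]
        if len(members) > 0:
            new_centers[k] = alpha * centers[k] + (1.0 - alpha) * members.mean(axis=0)

    return penalty, new_centers
```

During training, such a penalty would be added to the task loss (cross-entropy or reconstruction error) before back-propagation, and the centers refreshed after each mini-batch; with the quoted CIFAR10 values this would mean 100 centers per regularized layer, λ between 1.0e-3 and 1.0e-2 depending on the layer, and α = 0.9.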
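
The Experiment Setup row lists per-dataset hyperparameters scattered across several sentences. Purely for readability, they are collected below into a hypothetical configuration dictionary; the structure and key names are invented, and the assignment of the first and last groups of values to the MNIST autoencoder and CUB-200-2011 experiments is inferred from the surrounding text rather than stated explicitly in the quote.

```python
# Hypothetical layout; only the numeric values come from the quoted setup.
PARSIMONIOUS_HPARAMS = {
    "MNIST_autoencoder": {          # dataset inferred, not named in the quote
        "num_clusters": 100,        # all layers
        "lambda": 1.0e-2,           # all layers
    },
    "CIFAR10": {
        "num_clusters": 100,        # all layers
        "lambda_first_two_conv": 1.0e-3,
        "lambda_remaining_layers": 1.0e-2,
        "alpha": 0.9,               # smoothness of the center update
    },
    "CIFAR100": {
        "num_clusters": 100,        # all layers
        "lambda_first_conv": 10.0,
        "lambda_remaining_layers": 1.0,
        "alpha": 0.95,
    },
    "CUB_200_2011": {               # dataset inferred, not named in the quote
        "num_clusters": 200,        # all layers, chosen by cross-validation
        "lambda_first_two_conv": 1.0e-5,
        "lambda_remaining_conv": 1.0e-4,
        "lambda_fully_connected": 1.0e-3,
        "alpha": 0.5,
    },
}
```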