Learning Deep Parsimonious Representations
Authors: Renjie Liao, Alex Schwing, Richard Zemel, Raquel Urtasun
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of our approach on the tasks of unsupervised learning, classification, fine grained categorization, and zero-shot learning. We demonstrate the generalization performance of our proposed method in several settings: autoencoders trained on the MNIST dataset [23], classification on CIFAR10 and CIFAR100 [20], as well as fine-grained classification and zero-shot learning on the CUB-200-2011 dataset [34]. We show that our approach leads to significant wins in all these scenarios. |
| Researcher Affiliation | Academia | University of Toronto, University of Illinois at Urbana-Champaign, Canadian Institute for Advanced Research |
| Pseudocode | Yes | Algorithm 1: Learning Parsimonious Representations (a minimal sketch of the loop this algorithm describes appears below the table). |
| Open Source Code | Yes | Our implementation based on TensorFlow [9] is publicly available: https://github.com/lrjconan/deep_parsimonious |
| Open Datasets | Yes | We demonstrate the generalization performance of our proposed method in several settings: autoencoders trained on the MNIST dataset [23], classification on CIFAR10 and CIFAR100 [20], as well as fine-grained classification and zero-shot learning on the CUB-200-2011 dataset [34]. |
| Dataset Splits | Yes | The standard training-test-split is used. We use the standard split on both datasets. We follow the dataset split provided by [34] and the common practice of cropping the image using the ground-truth bounding box annotation of the birds [8, 36]. We follow the setting in [1, 2] and use the same split where 100, 50 and 50 classes are used as training, validation and testing (unseen classes). |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models, memory, or cloud computing instances used for the experiments. |
| Software Dependencies | No | The paper mentions using TensorFlow [9] but does not provide a specific version number for it or for any other software dependency. |
| Experiment Setup | Yes | The number of clusters and the regularization weight λ of all layers are set to 100 and 1.0e-2 respectively. Our exact parameter choices are detailed in the Appendix. Specifically, the number of cluster centers is set to 100 for all layers for both CIFAR10 and CIFAR100. λ is set to 1.0e-3 and 1.0e-2 for the first two convolutional layers and the remaining layers respectively in CIFAR10; for CIFAR100, λ is set to 10 and 1 for the first convolutional layer and the remaining layers respectively. The smoothness parameter α is set to 0.9 and 0.95 for CIFAR10 and CIFAR100 respectively. Based on cross-validation, the number of clusters is set to 200 for all layers. For convolutional layers, we set λ to 1.0e-5 for the first (bottom) two and use 1.0e-4 for the remaining ones. For fully connected layers, we set λ to 1.0e-3, and α is equal to 0.5. (These values are collected in the illustrative summary below the table.) |
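
The Pseudocode row above points to Algorithm 1 (Learning Parsimonious Representations). As a minimal sketch, assuming a k-means-style sample-clustering penalty with a moving-average center update (this is not the authors' TensorFlow code, and the function names are our own), the per-layer regularizer that the algorithm alternates with back-propagation could look like:

```python
import numpy as np

def clustering_regularizer(reps, centers):
    """k-means-style sample-clustering penalty on one layer's representations.

    reps:    (N, D) mini-batch activations at the regularized layer.
    centers: (K, D) current cluster centers for that layer.
    Returns the penalty and the hard assignments used for the center update.
    """
    # Squared Euclidean distance from every representation to every center.
    dists = ((reps[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    assign = dists.argmin(axis=1)                      # nearest center per sample
    loss = 0.5 * ((reps - centers[assign]) ** 2).sum(axis=-1).mean()
    return loss, assign

def update_centers(reps, centers, assign, alpha=0.9):
    """Moving-average center update with smoothness alpha (0.9 / 0.95 in the
    CIFAR settings quoted above); centers with no assigned samples are kept."""
    new_centers = centers.copy()
    for k in range(centers.shape[0]):
        members = reps[assign == k]
        if len(members) > 0:
            new_centers[k] = alpha * centers[k] + (1.0 - alpha) * members.mean(axis=0)
    return new_centers
```

In a training step, λ times this penalty would be added to the task loss for every regularized layer before the gradient update, and the centers refreshed after each mini-batch; the paper's spatial and channel co-clustering variants differ mainly in how the activations are reshaped into `reps`.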
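
The hyperparameters quoted in the Experiment Setup row are easier to scan when gathered into one structure. The dictionary below is purely illustrative: the key names are hypothetical, and attributing the last block of settings to CUB-200-2011 is our reading of the surrounding quote rather than something it states.

```python
# Illustrative restructuring of the quoted experiment setup (values only;
# key names and the CUB-200-2011 attribution are assumptions).
PARSIMONIOUS_HYPERPARAMS = {
    "CIFAR10": {
        "num_clusters": 100,                                        # all layers
        "lambda": {"conv_1_2": 1.0e-3, "remaining_layers": 1.0e-2},
        "alpha": 0.9,                                               # center-update smoothness
    },
    "CIFAR100": {
        "num_clusters": 100,
        "lambda": {"conv_1": 10.0, "remaining_layers": 1.0},
        "alpha": 0.95,
    },
    "CUB-200-2011 (inferred)": {
        "num_clusters": 200,                                        # chosen by cross-validation
        "lambda": {"conv_1_2": 1.0e-5, "remaining_conv": 1.0e-4, "fc": 1.0e-3},
        "alpha": 0.5,
    },
}
```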