Nonparametric Bayesian Deep Networks with Local Competition
Authors: Konstantinos Panousis, Sotirios Chatzis, Sergios Theodoridis
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | As we experimentally show using benchmark datasets, our approach yields networks with less computational footprint than the state-of-the-art, and with no compromises in predictive accuracy. |
| Researcher Affiliation | Academia | (1) Dept. of Informatics & Telecommunications, National and Kapodistrian University of Athens, Greece; (2) Dept. of Electrical Eng., Computer Eng., and Informatics, Cyprus University of Technology, Limassol, Cyprus; (3) The Chinese University of Hong Kong, Shenzhen, China. |
| Pseudocode | No | The paper describes training and inference algorithms but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statements about releasing source code or provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use MNIST in these experiments. ... We train the network from scratch on the MNIST dataset... Finally, we perform experimental evaluations on a more challenging benchmark dataset, namely CIFAR-10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | The paper mentions training on MNIST and CIFAR-10 datasets and evaluating on a 'test set', but it does not provide specific details on the train/validation/test dataset splits (e.g., percentages, sample counts, or explicit reference to predefined splits). |
| Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research. |
| Software Dependencies | No | The paper mentions using ADAM for ELBO maximization but does not specify any software names with version numbers for libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | In our experiments, the stick variables are drawn from a Beta(1, 1) prior. The hyperparameters of the approximate Kumaraswamy posteriors of the sticks are initialized as follows: the a_k's are set equal to the number of LWTA blocks of their corresponding layer; the b_k's are always set equal to 1. All other initializations are random within the corresponding support sets. The employed cut-off threshold, τ, is set to 10^-2. (A minimal sketch of this initialization follows the table.) |
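The quoted setup is concrete enough to illustrate. The following is a minimal NumPy sketch, not the authors' implementation: the helper names (`init_kumaraswamy_params`, `sample_kumaraswamy`) and the 64-block layer size are assumptions, and the stick-breaking/pruning step is included only to make the roles of a_k, b_k, and τ = 10^-2 explicit.

```python
import numpy as np

def init_kumaraswamy_params(num_blocks):
    # Hypothetical helper (not from the paper's code): initialize the
    # Kumaraswamy posterior hyperparameters of one layer's stick variables
    # as quoted above: a_k = number of LWTA blocks in the layer, b_k = 1.
    a = np.full(num_blocks, float(num_blocks))
    b = np.ones(num_blocks)
    return a, b

def sample_kumaraswamy(a, b, rng):
    # Draw stick variables via the Kumaraswamy inverse CDF:
    # x = (1 - (1 - u)^(1/b))^(1/a), with u ~ Uniform(0, 1).
    u = rng.uniform(size=a.shape)
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)

rng = np.random.default_rng(0)
a, b = init_kumaraswamy_params(64)        # assumed layer width of 64 LWTA blocks
sticks = sample_kumaraswamy(a, b, rng)
utilities = np.cumprod(sticks)            # stick-breaking product, IBP-style prior construction
keep = utilities >= 1e-2                  # quoted cut-off threshold tau = 10^-2
print(f"retained components: {keep.sum()} / {keep.size}")
```

This sketch only mirrors the initialization and pruning rule quoted in the table; the paper's actual variational training (ELBO maximization with ADAM) is not reproduced here.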