Constructing Deep Neural Networks by Bayesian Network Structure Learning
Authors: Raanan Y. Rohekar, Shami Nisimov, Yaniv Gurwicz, Guy Koren, Gal Novik
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate on image classification benchmarks that the deepest layers (convolutional and dense) of common networks can be replaced by significantly smaller learned structures, while maintaining classification accuracy that is state-of-the-art on tested benchmarks. Our structure learning algorithm requires a small computational cost and runs efficiently on a standard desktop CPU. |
| Researcher Affiliation | Industry | Raanan Y. Rohekar (Intel AI Lab, raanan.yehezkel@intel.com); Shami Nisimov (Intel AI Lab, shami.nisimov@intel.com); Yaniv Gurwicz (Intel AI Lab, yaniv.gurwicz@intel.com); Guy Koren (Intel AI Lab, guy.koren@intel.com); Gal Novik (Intel AI Lab, gal.novik@intel.com) |
| Pseudocode | Yes | Algorithm 1: G ← DeepGen(g_X, X, X_ex, n) |
| Open Source Code | No | The paper states: 'Our structure learning algorithm is implemented using BNT (Murphy, 2001)', but does not provide a link or explicit statement about making their specific code open source or available. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998); SVHN (Netzer et al., 2011); CIFAR-10 (Krizhevsky & Hinton, 2009); CIFAR-100 (Krizhevsky & Hinton, 2009); ImageNet (Deng et al., 2009) |
| Dataset Splits | No | The paper states that the threshold for independence tests and the number of neurons per layer were selected using a validation set, but it does not provide specific split percentages or counts for that validation set. |
| Hardware Specification | No | The paper mentions 'runs efficiently on a standard desktop CPU', which is too general to be a specific hardware detail. |
| Software Dependencies | No | The paper mentions 'Our structure learning algorithm is implemented using BNT (Murphy, 2001)'. While BNT is named, no version number is provided for BNT itself or any other software dependencies. |
| Experiment Setup | No | The paper mentions: 'In all the experiments, we used ReLU activations, ADAM (Kingma & Ba, 2015) optimization, batch normalization (Ioffe & Szegedy, 2015), and dropout (Srivastava et al., 2014) to all the dense layers.' It describes the types of settings used but does not provide specific hyperparameter values (e.g., learning rate, dropout rate, batch size); see the hedged sketch below this table. |
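
To make the Experiment Setup gap concrete, here is a minimal sketch of the kind of training configuration the quoted sentence describes: ReLU activations, Adam optimization, batch normalization, and dropout on the dense layers. The framework (PyTorch) and every numeric value (layer widths, learning rate, dropout rate, batch size) are placeholders chosen for illustration, since those are exactly the details the paper does not report.

```python
# Hypothetical PyTorch sketch of the quoted setup: ReLU + Adam + batch norm +
# dropout on dense layers. All hyperparameter values are placeholders, not the
# paper's actual settings (which are not reported).
import torch
import torch.nn as nn


class DenseHead(nn.Module):
    """Dense classifier head with batch normalization, ReLU, and dropout."""

    def __init__(self, in_features=512, hidden=256, num_classes=10, p_drop=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),
            nn.BatchNorm1d(hidden),
            nn.ReLU(),
            nn.Dropout(p_drop),          # dropout applied to the dense layer
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):
        return self.net(x)


model = DenseHead()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # placeholder learning rate
criterion = nn.CrossEntropyLoss()

# One illustrative training step on random tensors standing in for real features.
features = torch.randn(32, 512)              # placeholder batch of feature vectors
labels = torch.randint(0, 10, (32,))         # placeholder class labels
optimizer.zero_grad()
loss = criterion(model(features), labels)
loss.backward()
optimizer.step()
```

Because none of these values appear in the paper, a reproduction attempt would need to tune them independently (e.g., selecting the dropout rate and learning rate on a held-out validation split).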