Diversity Networks
Authors: Zelda Mariet, Suvrit Sra
ICLR 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experimental results to corroborate our claims: for pruning neural networks, Divnet is seen to be notably superior to competing approaches. |
| Researcher Affiliation | Academia | Zelda Mariet and Suvrit Sra, Massachusetts Institute of Technology, Cambridge, MA 02139, USA. zelda@csail.mit.edu, suvrit@mit.edu |
| Pseudocode | No | The paper describes the methodology in prose and mathematical equations but does not present a formal pseudocode or algorithm block. |
| Open Source Code | No | The paper states it was 'Run in MATLAB, based on the code from DeepLearnToolbox (https://github.com/rasmusbergpalm/DeepLearnToolbox) and Alex Kulesza’s code for DPPs (http://web.eecs.umich.edu/~kulesza/)', which indicates use of third-party code, but there is no explicit statement or link indicating that the authors' own code for Divnet is released. |
| Open Datasets | Yes | To quantify the performance of our algorithm, we present below the results of experiments on common datasets for neural network evaluation: MNIST (LeCun and Cortes, 2010), MNIST ROT (Larochelle et al., 2007) and CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper mentions using training and test data but does not explicitly define a validation split or give partition percentages that would allow the data splits to be reproduced. |
| Hardware Specification | Yes | Run in MATLAB, based on the code from DeepLearnToolbox... on a Linux Mint system with 16GB of RAM and an i7-4710HQ CPU @ 2.50GHz. |
| Software Dependencies | No | The paper mentions 'MATLAB' and third-party code libraries ('DeepLearnToolbox', 'Alex Kulesza’s code for DPPs') but does not specify version numbers for these software components, making replication difficult. |
| Experiment Setup | Yes | All networks were trained up until a certain training error threshold, using softmax activation on the output layer and sigmoids on other layers; see Table 1 for more details. Table 1: Overview of the sets of networks used in the experiments. We train each class of networks until the first iteration of backprop for which the training error reaches a predefined threshold. MNIST: 5 instances, trained until < 1% error, architecture 784-500-500-10; MNIST ROT: 5 instances, < 1% error, 784-500-500-10; CIFAR-10: 5 instances, < 50% error, 3072-1000-1000-1000-10. To ensure strict positive definiteness of the kernel matrix L, we add a small diagonal perturbation εI to L (ε = 0.01). ...we use the fixed choice β = 10/|T|, which was experimentally seen to work well. |
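
As a companion to the Experiment Setup row, below is a minimal sketch (in Python/NumPy rather than the authors' MATLAB) of how the quoted kernel settings could be applied when building a DPP kernel over a hidden layer's neurons. Only β = 10/|T| and the diagonal perturbation ε = 0.01 come from the quoted text; the RBF-style similarity between neuron activation vectors and the function name `build_dpp_kernel` are assumptions for illustration, not the authors' released code.

```python
import numpy as np

def build_dpp_kernel(activations, eps=0.01):
    """Sketch of a DPP kernel over hidden neurons from their activations.

    `activations` is an (n_neurons, |T|) array holding each neuron's
    outputs over the training set T. The RBF-style similarity below is
    an illustrative assumption; only beta = 10 / |T| and the diagonal
    perturbation eps = 0.01 are taken from the paper's quoted setup.
    """
    n_neurons, t_size = activations.shape
    beta = 10.0 / t_size  # fixed choice beta = 10 / |T| from the paper

    # Pairwise squared distances between neuron activation vectors.
    diffs = activations[:, None, :] - activations[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Similarity kernel, then a small diagonal perturbation to ensure
    # strict positive definiteness of L.
    L = np.exp(-beta * sq_dists)
    return L + eps * np.eye(n_neurons)
```

A DPP sample drawn from such a kernel (e.g., with Kulesza's DPP code referenced above) would then select a diverse subset of neurons to retain when pruning, which is the role the kernel plays in the paper's Divnet procedure.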