On the Depth of Deep Neural Networks: A Theoretical View
Authors: Shizhao Sun, Wei Chen, Liwei Wang, Xiaoguang Liu, Tie-Yan Liu
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'Our experiments show that in this way, we achieve significantly better test performance. We have conducted extensive experiments on benchmark datasets to test the performance of LMDNN.' |
| Researcher Affiliation | Collaboration | Shizhao Sun (College of Computer and Control Engineering, Nankai University, Tianjin, 300071, P. R. China); Wei Chen (Microsoft Research, Beijing, 100080, P. R. China); Liwei Wang (Key Laboratory of Machine Perception (MOE), School of EECS, Peking University, Beijing, 100871, P. R. China); Xiaoguang Liu (College of Computer and Control Engineering, Nankai University); Tie-Yan Liu (Microsoft Research) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | 'We conducted experiments on two datasets, MNIST (Le Cun et al. 1998) and CIFAR-10 (Krizhevsky 2009).' |
| Dataset Splits | No | The paper mentions tuning hyperparameters on a 'validation set' but does not provide specific dataset split information (e.g., percentages, sample counts, or predefined splits) for training, validation, or testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'Caffe (Jia et al. 2014)' but does not provide specific version numbers for software dependencies. |
| Experiment Setup | No | The paper mentions 'well-tuned network structures', a tuned 'margin penalty coefficient λ', and that 'Each model was trained for 10 times with different initializations', but the main text gives no concrete hyperparameter values or training configurations (e.g., learning rate, batch size, or optimal λ) needed for reproduction (see the sketch following this table). |
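Because the paper reports a margin penalty coefficient λ but provides neither pseudocode nor source code, the sketch below illustrates one plausible way such a penalty could be combined with cross-entropy. This is a hypothetical PyTorch reconstruction, not the authors' implementation (which used Caffe): the hinge form of the penalty, the fixed margin of 1, the function name `large_margin_loss`, and the default `lam=0.1` are all assumptions made for illustration.

```python
# Hypothetical sketch of a margin-penalized training loss in the spirit of
# LMDNN. The paper's exact penalty is not reproduced here; this assumes a
# hinge-style penalty on the score gap between the true class and the
# strongest competing class.
import torch
import torch.nn.functional as F


def large_margin_loss(logits, targets, lam=0.1):
    """Cross-entropy plus an assumed hinge-style margin penalty.

    logits:  (batch, num_classes) raw scores from the network
    targets: (batch,) integer class labels
    lam:     margin penalty coefficient (the paper's λ; tuned on a
             validation set there, with optimal values not reported)
    """
    ce = F.cross_entropy(logits, targets)

    # Score assigned to the true class for each example.
    true_scores = logits.gather(1, targets.unsqueeze(1)).squeeze(1)

    # Largest score among the *other* classes: mask out the true class.
    mask = F.one_hot(targets, num_classes=logits.size(1)).bool()
    runner_up = logits.masked_fill(mask, float("-inf")).max(dim=1).values

    # Hinge penalty: incur a cost whenever the margin falls below 1.
    margin_penalty = F.relu(1.0 - (true_scores - runner_up)).mean()

    return ce + lam * margin_penalty


if __name__ == "__main__":
    logits = torch.randn(8, 10)           # 8 examples, 10 classes
    targets = torch.randint(0, 10, (8,))  # random labels for the demo
    print(large_margin_loss(logits, targets, lam=0.1))
```

In training, one would compute `loss = large_margin_loss(model(x), y, lam)` and tune `lam` on a validation set, as the paper describes doing for λ.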