Random deep neural networks are biased towards simple functions
Authors: Giacomo De Palma, Bobak Kiani, Seth Lloyd
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate all the theoretical results with numerical experiments on deep neural networks with ReLU activation function and two hidden layers. The experiments confirm the scalings Θ(√(n / ln n)) and Θ(n) for the Hamming distance of the closest string with a different classification and for the average random flips required to change the classification, respectively. (A NumPy sketch probing these two scalings appears after the table.) |
| Researcher Affiliation | Academia | Giacomo De Palma, Mech E & RLE, MIT, Cambridge MA 02139, USA, gdepalma@mit.edu; Bobak T. Kiani, Mech E & RLE, MIT, Cambridge MA 02139, USA, bkiani@mit.edu; Seth Lloyd, Mech E, Physics & RLE, MIT, Cambridge MA 02139, USA, slloyd@mit.edu |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper mentions using Keras and TensorFlow but does not provide a link or statement for the open-sourcing of their own specific implementation code. |
| Open Datasets | Yes | Moreover, we explore the Hamming distance to the closest bit string with a different classification on deep neural networks trained on the MNIST database [49] of hand-written digits. |
| Dataset Splits | No | The paper mentions training on MNIST and evaluating on a test set, but does not provide specific details on train/validation/test splits (e.g., percentages or exact counts for each split, or mention of a validation set). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running experiments. |
| Software Dependencies | No | Simulations were run using the Python package Keras with a backend of TensorFlow [68]. However, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | Weights for all neural networks are initialized according to a normal distribution with zero mean and variance equal to 2/n_in, where n_in is the number of input units in the weight tensor. No bias term is included in the neural networks. All networks consist of two fully connected hidden layers, each with n neurons (equal to the number of input neurons) and activation function set to the commonly used Rectified Linear Unit (ReLU). All networks contain a single output neuron with no activation function. In the notation of section 2, this choice corresponds to σ_w² = 2, σ_b² = 0, n₀ = n₁ = n₂ = n and n₃ = 1, and implies F(1) = 1. Simulations were run using the Python package Keras with a backend of TensorFlow [68]. Networks were trained for 20 epochs using the Adam optimizer [67]; average test set accuracy of 98.8% was achieved. (A Keras sketch of this configuration appears after the table.) |
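The setup described in the last row maps directly onto a few lines of Keras. The sketch below is a hedged reconstruction rather than the authors' code (which is not released): the zero-mean normal initializer with variance 2/n_in, the bias-free two-hidden-layer ReLU architecture with a single linear output, and the Adam optimizer for 20 epochs follow the quoted description, while the flattened-MNIST input size of 784, the hinge loss, and the ±1 label encoding are illustrative assumptions.

```python
# Hedged reconstruction of the reported setup; not the authors' released code.
import tensorflow as tf

n = 784  # assumption: n equals the number of flattened MNIST input pixels

# Zero-mean normal with variance 2/n_in (He-style), as described in the paper.
init = tf.keras.initializers.VarianceScaling(
    scale=2.0, mode="fan_in", distribution="untruncated_normal")

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n,)),
    tf.keras.layers.Dense(n, activation="relu", use_bias=False, kernel_initializer=init),
    tf.keras.layers.Dense(n, activation="relu", use_bias=False, kernel_initializer=init),
    tf.keras.layers.Dense(1, activation=None, use_bias=False, kernel_initializer=init),
])

# Assumed training recipe: hinge loss on +/-1 labels (classification by the sign of
# the single output neuron) is a guess; Adam and 20 epochs are as reported.
model.compile(optimizer=tf.keras.optimizers.Adam(), loss="hinge")
# model.fit(x_train, y_train, epochs=20)  # x_train, y_train are hypothetical placeholders
```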
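The two quantities in the Research Type row, the Hamming distance to the closest bit string with a different classification and the average number of random bit flips needed to change the classification, can also be probed on an untrained random network. The NumPy sketch below is an illustrative reconstruction under assumed choices (sign-of-output classification, a greedy single-bit-flip search as an upper bound on the closest-flip distance, n = 256 input bits, 20 sample strings); it is not the paper's exact experimental procedure.

```python
# Illustrative probe of the two scalings on an untrained random ReLU network;
# the greedy search and sample sizes are assumptions, not the paper's procedure.
import numpy as np

rng = np.random.default_rng(0)

def random_relu_net(n):
    """Two bias-free hidden layers of n units, weights ~ N(0, 2/n_in), linear output."""
    w1 = rng.normal(0.0, np.sqrt(2.0 / n), size=(n, n))
    w2 = rng.normal(0.0, np.sqrt(2.0 / n), size=(n, n))
    w3 = rng.normal(0.0, np.sqrt(2.0 / n), size=(n, 1))
    def f(x):  # x: (..., n) array of bits in {0, 1}
        h1 = np.maximum(x @ w1, 0.0)
        h2 = np.maximum(h1 @ w2, 0.0)
        return (h2 @ w3).squeeze(-1)
    return f

def classify(f, x):
    return np.sign(f(x))  # binary label = sign of the single output

def greedy_closest_flip_distance(f, x):
    """Upper bound on the Hamming distance to the closest differently classified
    string: repeatedly flip the not-yet-flipped bit that moves the output furthest
    toward the opposite sign, until the classification changes."""
    x = x.copy()
    target = -classify(f, x)
    remaining = list(range(len(x)))
    for step in range(1, len(x) + 1):
        candidates = np.tile(x, (len(remaining), 1))
        candidates[np.arange(len(remaining)), remaining] ^= 1  # flip each remaining bit
        best = int(np.argmax(target * f(candidates)))
        x = candidates[best]
        del remaining[best]
        if classify(f, x) == target:
            return step
    return len(x)  # classification never changed (does not happen in practice)

def random_flips_to_change(f, x):
    """Flip uniformly random distinct bits until the classification changes."""
    x = x.copy()
    start = classify(f, x)
    for k, i in enumerate(rng.permutation(len(x)), start=1):
        x[i] ^= 1
        if classify(f, x) != start:
            return k
    return len(x)

n = 256
f = random_relu_net(n)
samples = [rng.integers(0, 2, size=n) for _ in range(20)]
print("greedy closest-flip distance:",
      np.mean([greedy_closest_flip_distance(f, s) for s in samples]))  # ~ sqrt(n / ln n)
print("random flips to change class:",
      np.mean([random_flips_to_change(f, s) for s in samples]))        # ~ n
```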