Learning Parities with Neural Networks
Authors: Amit Daniely, Eran Malach
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In section 3 we showed a family of distributions F that separates linear classes from neural networks. To validate that our theoretical results apply to a more realistic setting, we perform an experiment that imitates the parity problem using the MNIST dataset. ... We compare the performance of a ReLU network with one hidden layer against various linear models. ... While the ReLU network achieves performance of almost 80% accuracy, the linear models barely perform better than chance. The results of the experiment are shown in Figure 1. (See the data-construction sketch after the table.) |
| Researcher Affiliation | Collaboration | Amit Daniely, School of Computer Science, The Hebrew University, Israel; Google Research Tel Aviv; amit.daniely@mail.huji.ac.il. Eran Malach, School of Computer Science, The Hebrew University, Israel; eran.malach@mail.huji.ac.il |
| Pseudocode | No | The paper describes the training algorithm in text and mathematical formulas but does not include a dedicated pseudocode or algorithm block. |
| Open Source Code | No | The paper does not provide any access information (e.g., a repository link) for source code implementing the described methodology. |
| Open Datasets | Yes | To validate that our theoretical results apply to a more realistic setting, we perform an experiment that imitates the parity problem using the MNIST dataset. |
| Dataset Splits | No | The paper uses the MNIST dataset but does not specify how it was split into training, validation, and test sets, nor does it refer to the standard predefined MNIST split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions the 'AdaDelta optimizer' but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | Our neural-network architecture is a one-hidden-layer network with ReLU activation and 512 neurons in the hidden layer. ... All models are trained with AdaDelta optimizer, for 20 epochs, with batch size 128. (See the training sketch after the table.) |
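
The report records only that the experiment "imitates the parity problem using the MNIST dataset"; the exact construction is not quoted above. The sketch below is one plausible instantiation, assuming each example concatenates `k` MNIST digits and is labeled by the parity of the digit-label sum. The function `make_parity_mnist`, the choice `k=3`, and the torchvision loader are illustrative assumptions, not the paper's confirmed protocol.

```python
import numpy as np
from torchvision import datasets  # assumption: torchvision is available

def make_parity_mnist(images, labels, k=3, n_samples=60000, seed=0):
    """Hypothetical parity-style task built from MNIST.

    Each example concatenates k randomly chosen MNIST digits (flattened);
    the binary label is the parity of the sum of their digit labels. This
    is one way to 'imitate the parity problem' and may differ from the
    paper's actual construction.
    """
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, len(images), size=(n_samples, k))
    X = images[idx].reshape(n_samples, -1) / 255.0  # (n_samples, k*784)
    y = labels[idx].sum(axis=1) % 2                 # parity of digit sum
    return X.astype(np.float32), y.astype(np.int64)

train = datasets.MNIST(root="data", train=True, download=True)
X, y = make_parity_mnist(train.data.numpy(), train.targets.numpy(), k=3)
```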
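
The Experiment Setup row pins down a minimal training loop: one hidden layer of 512 ReLU units, AdaDelta, 20 epochs, batch size 128. The PyTorch sketch below wires those numbers together; the framework, the two-logit cross-entropy objective, the input width, and the placeholder data (standing in for the output of `make_parity_mnist` above) are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

# Hyperparameters from the table: 512 ReLU hidden units, AdaDelta,
# 20 epochs, batch size 128. Input width assumes three concatenated
# 28x28 MNIST images, per the sketch above.
d_in = 3 * 28 * 28

# Random placeholders keep this block runnable on its own; in practice
# use X, y from make_parity_mnist() in the previous sketch.
X = np.random.rand(1024, d_in).astype(np.float32)
y = np.random.randint(0, 2, size=1024).astype(np.int64)

model = nn.Sequential(nn.Linear(d_in, 512), nn.ReLU(), nn.Linear(512, 2))
loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.from_numpy(X), torch.from_numpy(y)),
    batch_size=128, shuffle=True)

opt = torch.optim.Adadelta(model.parameters())
loss_fn = nn.CrossEntropyLoss()

for epoch in range(20):
    for xb, yb in loader:
        opt.zero_grad()
        loss_fn(model(xb), yb).backward()
        opt.step()
```

Replacing `model` with `nn.Linear(d_in, 2)` gives one of the linear baselines the quoted result refers to: per the paper, such baselines stay near chance while the ReLU network reaches almost 80% accuracy.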