Learning Parities with Neural Networks

Authors: Amit Daniely, Eran Malach

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "6 Experiment: In section 3 we showed a family of distributions F that separates linear classes from neural networks. To validate that our theoretical results apply to a more realistic setting, we perform an experiment that imitates the parity problem using the MNIST dataset. ... We compare the performance of a ReLU network with one hidden layer against various linear models. ... While the ReLU network achieves almost 80% accuracy, the linear models barely perform better than chance. The results of the experiment are shown in Figure 1."
Researcher Affiliation | Collaboration | Amit Daniely: School of Computer Science, The Hebrew University, Israel; Google Research Tel Aviv (amit.daniely@mail.huji.ac.il). Eran Malach: School of Computer Science, The Hebrew University, Israel (eran.malach@mail.huji.ac.il).
Pseudocode | No | The paper describes the training algorithm in text and mathematical formulas but does not include a dedicated pseudocode or algorithm block.
Open Source Code | No | The paper does not provide any concrete access information for open-source code related to the methodology described.
Open Datasets | Yes | "To validate that our theoretical results apply to a more realistic setting, we perform an experiment that imitates the parity problem using the MNIST dataset."
Dataset Splits | No | The paper mentions using the MNIST dataset but does not provide specific details on how it was split into training, validation, and test sets, nor does it refer to standard predefined splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions the AdaDelta optimizer but does not provide specific software names with version numbers.
Experiment Setup | Yes | "Our neural-network architecture is a one-hidden-layer network with ReLU activation and 512 neurons in the hidden layer. ... All models are trained with the AdaDelta optimizer, for 20 epochs, with batch size 128." (A hedged code sketch of this setup follows the table.)
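
Since the paper describes the setup quoted above but releases no code, the following is a minimal sketch of that configuration, assuming PyTorch, cross-entropy loss, and synthetic placeholder data. The paper's actual MNIST-based parity construction and the specific linear baselines it compares against are not given in the excerpts, so those parts (the random labels and the single linear model below) are illustrative assumptions, not the authors' method.

# Minimal sketch (not the authors' code) of the quoted setup:
# a one-hidden-layer ReLU network with 512 hidden units and an illustrative
# linear baseline, both trained with AdaDelta for 20 epochs at batch size 128.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def relu_network(input_dim: int, num_classes: int = 2) -> nn.Module:
    # One hidden layer with 512 ReLU neurons, as stated in the paper.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(input_dim, 512),
        nn.ReLU(),
        nn.Linear(512, num_classes),
    )


def linear_model(input_dim: int, num_classes: int = 2) -> nn.Module:
    # Placeholder linear baseline; the paper compares several linear models.
    return nn.Sequential(nn.Flatten(), nn.Linear(input_dim, num_classes))


def train(model: nn.Module, dataset, epochs: int = 20, batch_size: int = 128) -> nn.Module:
    loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
    optimizer = torch.optim.Adadelta(model.parameters())  # optimizer named in the paper
    loss_fn = nn.CrossEntropyLoss()                       # loss choice is an assumption
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
    return model


if __name__ == "__main__":
    # Synthetic MNIST-shaped inputs with random binary labels, standing in for
    # the paper's MNIST-based parity task (its details are not in the excerpts).
    x = torch.randn(1024, 1, 28, 28)
    y = torch.randint(0, 2, (1024,))
    data = TensorDataset(x, y)
    for net in (relu_network(28 * 28), linear_model(28 * 28)):
        train(net, data)

The sketch only fixes the architecture and optimizer hyperparameters quoted in the table; on the placeholder data it will not reproduce the reported result that the ReLU network reaches almost 80% accuracy while the linear models barely beat chance.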