Exponential Separations in Symmetric Neural Networks

Authors: Aaron Zweig, Joan Bruna

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this work we demonstrate a novel separation between symmetric neural network architectures. Specifically, we consider the Relational Network [21] architecture as a natural generalization of the Deep Sets [32] architecture, and study their representational gap. Under the restriction to analytic activation functions, we construct a symmetric function acting on sets of size N with elements in dimension D, which can be efficiently approximated by the former architecture, but provably requires width exponential in N and D for the latter. (A sketch of the two architectures appears after this table.)
Researcher Affiliation | Academia | Aaron Zweig, Courant Institute of Mathematical Sciences, New York University (az831@nyu.edu); Joan Bruna, Center for Data Science, New York University (bruna@cims.nyu.edu)
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as pseudocode or algorithm listings.
Open Source Code | No | The checklist at the end of the paper states 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'.
Open Datasets | No | The paper is theoretical and does not mention any datasets used for training. The author checklist explicitly marks 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'.
Dataset Splits | No | The paper is theoretical and does not describe any dataset splits (train, validation, test). The author checklist explicitly marks 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]'.
Hardware Specification | No | The paper is theoretical and does not mention any hardware used for experiments. The author checklist explicitly marks 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A]'.
Software Dependencies | No | The paper is theoretical and does not list any software dependencies or version numbers. The author checklist explicitly marks 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [N/A]'.
Experiment Setup | No | The paper is theoretical and does not describe any experimental setup, such as hyperparameters or training settings. The author checklist explicitly marks 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [N/A]'.
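For readers unfamiliar with the two architectures compared above, the following is a minimal sketch of their standard formulations: DeepSets computes f(X) = rho(sum_i phi(x_i)), while a Relational Network computes f(X) = rho(sum_{i,j} phi(x_i, x_j)), so its encoder acts on pairs of set elements rather than singletons. The class names, widths, and MLP shapes below are illustrative assumptions, not the paper's construction; tanh is chosen only because the paper's separation concerns analytic activations.

```python
# Minimal sketch of the standard DeepSets and Relational Network
# formulations. Widths and MLP depths are illustrative, not the
# constructions analyzed in the paper.
import torch
import torch.nn as nn


class DeepSets(nn.Module):
    """f(X) = rho( sum_i phi(x_i) ): encoder sees single elements."""

    def __init__(self, dim_d: int, width: int = 64, out: int = 1):
        super().__init__()
        # tanh is an analytic activation, matching the paper's setting
        self.phi = nn.Sequential(nn.Linear(dim_d, width), nn.Tanh(), nn.Linear(width, width))
        self.rho = nn.Sequential(nn.Linear(width, width), nn.Tanh(), nn.Linear(width, out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, N, D)
        # Sum-pooling over the set dimension makes the output permutation-invariant.
        return self.rho(self.phi(x).sum(dim=1))


class RelationalNetwork(nn.Module):
    """f(X) = rho( sum_{i,j} phi(x_i, x_j) ): encoder sees pairs of elements."""

    def __init__(self, dim_d: int, width: int = 64, out: int = 1):
        super().__init__()
        self.phi = nn.Sequential(nn.Linear(2 * dim_d, width), nn.Tanh(), nn.Linear(width, width))
        self.rho = nn.Sequential(nn.Linear(width, width), nn.Tanh(), nn.Linear(width, out))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, N, D)
        n = x.shape[1]
        xi = x.unsqueeze(2).expand(-1, -1, n, -1)   # (batch, N, N, D)
        xj = x.unsqueeze(1).expand(-1, n, -1, -1)   # (batch, N, N, D)
        pairs = torch.cat([xi, xj], dim=-1)         # (batch, N, N, 2D): all ordered pairs
        return self.rho(self.phi(pairs).sum(dim=(1, 2)))


if __name__ == "__main__":
    x = torch.randn(8, 10, 3)  # batch of 8 sets, N=10 elements, D=3
    print(DeepSets(3)(x).shape, RelationalNetwork(3)(x).shape)
```

The pairwise expansion is the only structural difference between the two models, and it is what the paper's separation result isolates: the extra expressive power of pair encoders comes at an O(N^2) cost per set, whereas matching it with a singleton encoder provably requires width exponential in N and D.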