A Functional Perspective on Learning Symmetric Functions with Neural Networks
Authors: Aaron Zweig, Joan Bruna
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 'We detail several experiments with the finite instantiation of measure networks in Section 6.' (Roadmap section); see also Section 6 (Experiments), Section 6.1 (Symmetric Function Approximation), Figure 1 (test error for d = 10 on the neural architectures of Section 3.1), and Table 2 (mean squared test error for robust mean estimation among the finite model instantiations and baselines). |
| Researcher Affiliation | Academia | (1) Courant Institute of Mathematical Sciences, New York University, New York; (2) Center for Data Science, New York University, New York. |
| Pseudocode | No | The paper describes network architectures and formulations mathematically (e.g., equations 4, 5, and the finite-width implementations), but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks, nor structured, code-like steps for a procedure. |
| Open Source Code | No | The paper does not provide any statement about making the source code available, nor does it include a link to a code repository. |
| Open Datasets | Yes | 'We additionally consider an applied experiment on a variant of MNIST to observe how the finite-width implementations perform on real-world data, by first mapping images to point clouds.' (One possible image-to-point-cloud mapping is sketched after the table.) |
| Dataset Splits | No | The paper states 'We choose to train with N = 4, i.e. all networks train on input sets of size 4, and test on sets of varying size.' (Section 6.1). It defines the training and testing sets but does not specify a separate validation split or its details. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or specific computing environments) used for running the experiments. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python, specific machine learning libraries like PyTorch or TensorFlow, along with their versions) that would be needed for reproducibility. |
| Experiment Setup | Yes | Experimental Setup: We instantiate our three function classes in the finite network setting, as outlined in Table 1. We use input dimension d = 10. For the finite realization of S1, we use first hidden layer size m = 100 and second hidden layer size h = 100. Crucially, after fixing the finite architecture representing S1, we scale up the width by 10 for the models with frozen weights. That is, the first hidden layer in S2, and both hidden layers in S3, have width equal to 1000. Increasing the width makes the S2 and S3 models strictly more powerful, and this setup allows us to inspect whether a larger number of random kernel features can compensate for a smaller trained width in approximation. For each model, we use its associated functional norm for regularization. |
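
The experiment setup quoted above describes a DeepSets-style finite instantiation: a first hidden layer of width m applied per point, mean pooling over the set, a second hidden layer of width h, and frozen (random) layers for S2 and S3 at 10x the width. The sketch below is a minimal, hedged reconstruction of that setup in PyTorch; the class name, the ReLU nonlinearity, the choice of mean pooling, and the training details (the functional-norm regularization is omitted) are assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn


class FiniteMeasureNet(nn.Module):
    """Symmetric network: per-point features, mean pooling, then a readout."""

    def __init__(self, d=10, m=100, h=100, freeze_first=False, freeze_second=False):
        super().__init__()
        self.phi = nn.Linear(d, m)   # first hidden layer, applied per point
        self.rho = nn.Linear(m, h)   # second hidden layer, after pooling
        self.out = nn.Linear(h, 1)   # scalar readout
        if freeze_first:             # S2 and S3 keep random first-layer weights
            for p in self.phi.parameters():
                p.requires_grad_(False)
        if freeze_second:            # S3 also keeps random second-layer weights
            for p in self.rho.parameters():
                p.requires_grad_(False)

    def forward(self, x):
        # x: (batch, N, d) point clouds; averaging over N makes the output
        # symmetric in the input points and independent of the set size.
        z = torch.relu(self.phi(x)).mean(dim=1)
        return self.out(torch.relu(self.rho(z))).squeeze(-1)


# Widths follow the quoted setup: S1 trains both layers at width 100, while
# the frozen layers in S2 and S3 are scaled up to width 1000.
s1 = FiniteMeasureNet(m=100, h=100)
s2 = FiniteMeasureNet(m=1000, h=100, freeze_first=True)
s3 = FiniteMeasureNet(m=1000, h=1000, freeze_first=True, freeze_second=True)

# Train on sets of size N = 4, then evaluate on a different set size.
train_batch = torch.randn(32, 4, 10)
test_batch = torch.randn(32, 16, 10)
print(s1(train_batch).shape, s1(test_batch).shape)  # torch.Size([32]) twice
```

Because the pooling step averages over the set dimension, a model trained on sets of size N = 4 can be evaluated on sets of other sizes, matching the protocol quoted in the Dataset Splits row.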
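
The Open Datasets row mentions mapping MNIST images to point clouds before feeding them to the finite-width models. The quoted excerpt does not specify the mapping, so the following is only a plausible sketch: threshold pixel intensities and keep the normalized (row, column) coordinates of the active pixels, resampled to a fixed set size. The threshold, set size, and normalization are assumptions.

```python
import numpy as np


def image_to_point_cloud(img, threshold=0.5, n_points=100, rng=None):
    """Map a (28, 28) grayscale image in [0, 1] to an (n_points, 2) point set."""
    rng = rng if rng is not None else np.random.default_rng(0)
    rows, cols = np.nonzero(img > threshold)
    coords = np.stack([rows, cols], axis=1).astype(np.float32) / 27.0
    # Resample to a fixed set size (with replacement if too few pixels are active).
    idx = rng.choice(len(coords), size=n_points, replace=len(coords) < n_points)
    return coords[idx]


# Usage with a synthetic image; a real pipeline would load MNIST digits instead.
dummy = (np.random.rand(28, 28) > 0.8).astype(np.float32)
cloud = image_to_point_cloud(dummy)
print(cloud.shape)  # (100, 2)
```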