Spurious Valleys and Clustering Behavior of Neural Networks
Authors: Samuele Pollaci
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using SageMath (Stein et al., 2020), the authors compute the polynomial defining the cut loss landscape H_{T_δ}. They then use Gröbner bases and elimination in Magma (Bosma et al., 1997) to find the polynomial e defining the projection of Z(f_x, f_y, f_z) onto the error axis; the roots of e are the points of S_{T_δ}. When plotting the points of S_{T_δ} (as in Figures 4 and 5), absolute values are taken to obtain real positive numbers. After multiple trials with different values of δ and the other parameter, several empirical observations are formulated. The data were collected by conducting 21 trials in SageMath, using the NN architecture described in Example 4.1, with N = 10, = 100, and δ(n) := 10n+2 for n ∈ [0, 20] ∩ Z. |
| Researcher Affiliation | Academia | ¹Department of Mathematics, University of Bonn, Bonn, Germany; ²Department of Computer Science, Vrije Universiteit Brussel, Brussels, Belgium. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide access to source code for its methodology. It mentions having 'adapted the code used in (Li et al., 2018)' but does not state that its own implementation is open-source, nor does it provide a link. |
| Open Datasets | No | The paper describes generating its own training datasets (e.g., input points of T_δ are taken uniformly at random from the real interval [−δ, δ]) but does not state that the data are publicly available, provide a link, or cite a well-known public dataset with access information. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for reproduction. It refers to a 'training dataset' but no specific splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Sage Math (Stein et al., 2020)' and 'Magma (Bosma et al., 1997)' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The data were collected by conducting 21 trials in SageMath (Stein et al., 2020), using the NN architecture described in Example 4.1, with N = 10, = 100, and δ(n) := 10n+2 for n ∈ [0, 20] ∩ Z. |
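The elimination step described in the table — computing a Gröbner basis and projecting a variety Z(f_x, f_y, f_z) onto the error axis — can be sketched in standard Python using SymPy in place of SageMath/Magma. The polynomials below are hypothetical stand-ins (the paper's actual system defining H_{T_δ} is not reproduced here); only the elimination technique itself is illustrated.

```python
# Sketch of Groebner-basis elimination, assuming a toy system in (x, y, e)
# rather than the paper's actual cut-loss polynomials.
import sympy as sp

x, y, e = sp.symbols('x y e')

# Hypothetical polynomials cutting out a variety Z(f, g, h) in (x, y, e)-space,
# where e plays the role of the error coordinate.
f = x - 1
g = y + 1
h = x**2 + y**2 + e - 3

# A lexicographic Groebner basis with x > y > e performs elimination:
# any basis element involving only e defines the projection of the
# variety onto the error axis.
G = sp.groebner([f, g, h], x, y, e, order='lex')
elim = [p for p in G.exprs if p.free_symbols <= {e}]
print(elim)                    # → [e - 1]

# The roots of the eliminated polynomial are the candidate error values.
roots = sp.solve(elim[0], e)
print(roots)                   # → [1]
```

With real coefficients one would then keep the real roots (and, as in the table above, apply an absolute value when plotting). Magma's elimination ideals follow the same principle, just with a dedicated `EliminationIdeal` machinery instead of a lex ordering chosen by hand.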