A Combinatorial Perspective on the Optimization of Shallow ReLU Networks
Authors: Michael S. Matena, Colin A. Raffel
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We ran experiments comparing batch gradient descent to solving (3) for a randomly chosen vertex on some toy datasets. We present some of our results in fig. 2. |
| Researcher Affiliation | Academia | Michael Matena, Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, mmatena@cs.unc.edu; Colin Raffel, Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, craffel@cs.unc.edu |
| Pseudocode | Yes | Algorithm 1 Exact ERM (Arora et al., 2016) ... Algorithm 2 Greedy Local Search (GLS) Heuristic |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] We have included this in the supplemental material. |
| Open Datasets | Yes | We also created toy binary classification datasets from MNIST (LeCun et al., 2010) and Fashion MNIST (Xiao et al., 2017). |
| Dataset Splits | No | The paper mentions 'training sets' but does not explicitly provide details about train/validation/test splits or mention a 'validation' set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or detailed cluster specifications) used for running experiments. The authors' ethics statement acknowledges: 'Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No]' |
| Software Dependencies | No | The paper mentions software libraries like 'CVXPY', 'ECOS', 'scikit-learn', 'PyTorch', and 'NumPy', but does not specify their version numbers. |
| Experiment Setup | Yes | See appendix H for details of the training procedures and for results on more d, mgen and d, N pairs. ... We used a batch size of 128 and a learning rate of 10⁻³. We trained for 1000 epochs using the Adam optimizer. (A minimal training sketch follows this table.) |
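The quoted setup (batch size 128, learning rate 10⁻³, 1000 epochs, Adam) is enough to sketch the gradient-descent baseline in PyTorch. The sketch below is a hypothetical reconstruction, not the authors' released code: the binary class pair (0 vs. 1), the hidden width of 32, and the binary cross-entropy loss are assumptions; only the optimizer, batch size, learning rate, epoch count, and the MNIST-derived toy task come from the paper's description.

```python
# Hypothetical sketch of the gradient-descent baseline described above.
# Class pair, hidden width, and loss are assumptions; optimizer, batch size,
# learning rate, and epoch count follow the quoted experiment setup.
import torch
from torch import nn
from torchvision import datasets

# Toy binary classification set built from MNIST (assumed class pair: 0 vs. 1).
mnist = datasets.MNIST("data", train=True, download=True)
mask = (mnist.targets == 0) | (mnist.targets == 1)
x = mnist.data[mask].float().view(-1, 784) / 255.0
y = (mnist.targets[mask] == 1).float()

loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(x, y), batch_size=128, shuffle=True)

# Shallow (one-hidden-layer) ReLU network; the width of 32 is an assumption.
model = nn.Sequential(nn.Linear(784, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()  # assumed loss for the binary task

for epoch in range(1000):  # 1000 epochs, as stated in the setup
    for xb, yb in loader:
        opt.zero_grad()
        loss = loss_fn(model(xb).squeeze(-1), yb)
        loss.backward()
        opt.step()
```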