Nonparametric Neural Networks
Authors: George Philipp, Jaime G. Carbonell
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our framework using three standard benchmark datasets: the MNIST dataset, the rectangles images dataset and the convex dataset (Bergstra & Bengio, 2012). We started by training nonparametric networks. Through preliminary experiments, we determined a good starting angular step size for all datasets. We chose to start with αφ = 30 and repeatedly divided αφ by 3 when the validation error stopped improving. By varying the random seed, we trained 10 nets each for several values of the regularization parameter λ per dataset and then chose a typical representative from among those 10 trained nets. Results are shown in black in figure 2. |
| Researcher Affiliation | Academia | George Philipp, Jaime G. Carbonell. Carnegie Mellon University, Pittsburgh, PA 15213, USA. george.philipp@email.de; jgc@cs.cmu.edu |
| Pseudocode | Yes | Algorithm 1: AdaRad with ℓ2 fan-in regularizer and the unit addition / removal scheme used in this paper in its most instructive (but not fastest) order of computation. |
| Open Source Code | No | The paper does not include an explicit statement or link for open-source code related to the described methodology. |
| Open Datasets | Yes | We evaluated our framework using three standard benchmark datasets: the MNIST dataset, the rectangles images dataset and the convex dataset (Bergstra & Bengio, 2012). ... This was the poker dataset http://www.openml.org/d/354. |
| Dataset Splits | Yes | train-valid split (MNIST): 50,000 / 10,000; train-valid split (rectangles images): 10,000 / 2,000; train-valid split (convex): 7,000 / 1,000; train-valid-test split (poker): 800,000 / 125,010 / 100,000 |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | Table 3: Hyperparameters and related choices. ... number of hidden layers (not poker): 2 ... αr: radial step size for AdaRad (not poker): 1/(50λ) ... ν: unit addition rate for AdaRad: 1 ... batch size: 1000 |
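
The angular step size schedule quoted under Research Type (start at αφ = 30, divide by 3 when the validation error stops improving) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `step_size_schedule` and the `patience` parameter are hypothetical, and treating "stopped improving" as `patience` consecutive epochs without a new best validation error is an assumption, since the quoted text does not define the criterion.

```python
from typing import List, Sequence


def step_size_schedule(val_errors: Sequence[float],
                       alpha0: float = 30.0,
                       factor: float = 3.0,
                       patience: int = 2) -> List[float]:
    """Return the step size in effect at each epoch.

    alpha0 and factor follow the quoted setup (30 and 3).
    patience is an ASSUMPTION: the source does not say how
    "stopped improving" was detected.
    """
    alpha = alpha0
    best = float("inf")   # best validation error seen so far
    stale = 0             # consecutive epochs without improvement
    schedule = []
    for err in val_errors:
        schedule.append(alpha)  # step size used during this epoch
        if err < best:
            best = err
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                alpha /= factor  # divide the step size by 3
                stale = 0
    return schedule


# Usage with made-up validation errors (not results from the paper).
errors = [0.5, 0.4, 0.4, 0.4, 0.3, 0.3, 0.3]
print(step_size_schedule(errors))
```

With these illustrative inputs the step size stays at 30 until two epochs pass without improvement, then drops to 10.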