Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?
Authors: Kaiqi Zhang, Yu-Xiang Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 EXPERIMENT We empirically compare a parallel neural network (PNN) and a vanilla ReLU neural network (NN) with smoothing spline, trend filtering (TF) (Tibshirani, 2014), and wavelet denoising. [...] The results are shown in Figure 3. |
| Researcher Affiliation | Academia | Kaiqi Zhang Department of Electrical and Computer Engineering University of California, Santa Barbara kzhang70@ucsb.edu Yu-Xiang Wang Department of Computer Science University of California, Santa Barbara yuxiangw@cs.ucsb.edu |
| Pseudocode | No | No explicit pseudocode or algorithm block was found in the paper. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use two target functions: a Doppler function whose frequency is decreasing (Figure 3(a)-(c)(h)), and a combination of a piecewise linear function and a piecewise cubic function, or "vary" function (Figure 3(d)-(f)(i)). H.1 TARGET FUNCTIONS The doppler function used in Figure 3(d)-(f) is f(x) = sin(4/(x + 0.01)) + 1.5. The vary function used in Figure 3(g)-(i) is f(x) = M1(x/0.01) + M1((x − 0.02)/0.02) + M1((x − 0.06)/0.03) + M1((x − 0.12)/0.04) + M3((x − 0.2)/0.02) + M3((x − 0.28)/0.04) + M3((x − 0.44)/0.06) + M3((x − 0.68)/0.08), where M1 and M3 are the first- and third-order Cardinal B-spline basis functions, respectively. |
| Dataset Splits | No | The paper describes using a 'training dataset' (Dn) and calculating MSE, but it does not specify explicit train/validation/test splits, nor does it mention a validation set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as CVXPY, MOSEK, and R (for smooth.spline), but it does not specify version numbers, which would be needed to reproduce the software environment. |
| Experiment Setup | Yes | In the piecewise polynomial function ("vary") experiment, the depth of the PNN is L = 10, the width of each subnetwork is w = 10, and the model contains M = 500 subnetworks. [...] We used the Adam optimizer with a learning rate of 10⁻³. We first train the neural network layer by layer without weight decay. |
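The target functions quoted above are fully specified, so they can be reconstructed directly. The sketch below is a minimal implementation, assuming the standard convention that the "first-order" and "third-order" Cardinal B-spline bases `M1` and `M3` are the degree-1 hat function supported on [0, 2] and the degree-3 cubic B-spline supported on [0, 4]; the paper does not state the convention, so this interpretation is an assumption.

```python
import numpy as np

def doppler(x):
    """Doppler target from the paper: f(x) = sin(4/(x + 0.01)) + 1.5."""
    return np.sin(4.0 / (x + 0.01)) + 1.5

def M1(x):
    """Assumed degree-1 Cardinal B-spline: hat function on [0, 2], peak 1 at x = 1."""
    x = np.asarray(x, dtype=float)
    return np.maximum(0.0, 1.0 - np.abs(x - 1.0))

def M3(x):
    """Assumed degree-3 Cardinal B-spline on [0, 4], peak 2/3 at x = 2."""
    x = np.asarray(x, dtype=float)
    conds = [(0 <= x) & (x < 1), (1 <= x) & (x < 2),
             (2 <= x) & (x < 3), (3 <= x) & (x <= 4)]
    funcs = [lambda t: t**3 / 6,
             lambda t: (-3*t**3 + 12*t**2 - 12*t + 4) / 6,
             lambda t: (3*t**3 - 24*t**2 + 60*t - 44) / 6,
             lambda t: (4 - t)**3 / 6]
    return np.piecewise(x, conds, funcs + [0.0])  # 0 outside [0, 4]

def vary(x):
    """Piecewise linear + piecewise cubic target, as quoted from the paper."""
    return (M1(x / 0.01) + M1((x - 0.02) / 0.02)
            + M1((x - 0.06) / 0.03) + M1((x - 0.12) / 0.04)
            + M3((x - 0.2) / 0.02) + M3((x - 0.28) / 0.04)
            + M3((x - 0.44) / 0.06) + M3((x - 0.68) / 0.08))
```

Evaluating both functions on a grid over [0, 1] reproduces the heterogeneous smoothness that motivates the paper's local-adaptivity comparison: the Doppler signal oscillates faster near 0, while `vary` mixes linear and cubic bumps at increasing scales.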
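The setup row reports a parallel neural network with M = 500 subnetworks, each of depth L = 10 and width w = 10. As a structural illustration only, the numpy sketch below builds a forward pass for such a PNN, assuming each subnetwork is a plain ReLU MLP mapping a scalar input to a scalar output and that the PNN output is the sum of subnetwork outputs; the initialization, activation placement, and aggregation rule are assumptions, not details confirmed by the quoted text, and the weight-decayed Adam training loop is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_subnet(L, w, d_in=1):
    """Random He-style weights for one depth-L, width-w ReLU subnetwork (assumed form)."""
    sizes = [d_in] + [w] * (L - 1) + [1]
    return [(rng.normal(0.0, np.sqrt(2.0 / m), (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def subnet_forward(params, x):
    """Forward pass of one subnetwork; ReLU on hidden layers, linear output."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)
    return h

def pnn_forward(subnets, x):
    """PNN output: sum of the M parallel subnetwork outputs (assumed aggregation)."""
    return sum(subnet_forward(p, x) for p in subnets)

# Sizes reported in the paper's "vary" experiment: M = 500, L = 10, w = 10.
M, L, w = 500, 10, 10
subnets = [init_subnet(L, w) for _ in range(M)]
x = np.linspace(0.0, 1.0, 8).reshape(-1, 1)  # column of scalar inputs
y = pnn_forward(subnets, x)
```

In the paper's experiment this architecture would then be trained with Adam at learning rate 10⁻³, first layer by layer without weight decay, with weight decay applied afterwards; those stages are not reproduced here.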