Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Are deep ResNets provably better than linear predictors?
Authors: Chulhee Yun, Suvrit Sra, Ali Jadbabaie
NeurIPS 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The first example shows that there exists a family of datasets on which the squared-error loss attained by a fully-connected neural network is at best that of the linear least-squares model, whereas a ResNet attains a strictly better loss than the linear model. This highlights that the guarantee on the risk value of local minima is indeed special to residual networks. ... Consider the following dataset with six data points, where ρ > 0 is a fixed constant: X = [0 1 2 3 4 5], Y = [ρ 1 ρ 2+ρ 3 ρ 4+ρ 5+ρ]. ... Using the optimal w and c, a straightforward calculation gives R₂(θ₂) = ρ²(12ρ² + 82ρ + 215) / (21ρ² + 156ρ + 420), which is strictly smaller than 8ρ²/15 for ρ ∈ (0, √…). ... (Section 3.2, "Representations by residual block outputs do not improve monotonically") Consider a dataset X = [1 2.5 3] and Y = [1 3 2]. |
| Researcher Affiliation | Academia | Chulhee Yun (MIT, Cambridge, MA 02139); Suvrit Sra (MIT, Cambridge, MA 02139); Ali Jadbabaie (MIT, Cambridge, MA 02139) |
| Pseudocode | No | The paper describes mathematical formulations and network architectures but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any links to source code or explicitly state that code for the methodology is being released. |
| Open Datasets | No | The paper defines small synthetic datasets for its motivating examples (e.g., 'Consider the following dataset with six data points...', 'Consider a dataset X = [1 2.5 3] and Y = [1 3 2]'), but these are custom-built for illustration within the paper; they are not publicly available via a link or DOI, nor are they established benchmarks. |
| Dataset Splits | No | The paper presents theoretical analysis with illustrative examples using small, custom-defined datasets, but it does not specify training, validation, or test dataset splits typically used in machine learning experiments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for any computations or simulations. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or programming languages used. |
| Experiment Setup | Yes | For the motivating examples, the paper explicitly sets specific parameter values for the networks being analyzed: 'Choose v = 0.5ρ, u = 1, and b = 3.' and 'v1 = 1, u1 = 1, b1 = 2, v2 = 4, u2 = 1, b2 = 3.5, w = 1, c = 0.' These values define the specific configuration of the models used in their demonstrations. |
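The evidence above quotes a fully specified three-point dataset (X = [1 2.5 3], Y = [1 3 2]) from the paper's Section 3.2 example. As a hedged illustration of the linear least-squares baseline that the paper's residual-network examples are compared against, the sketch below computes the closed-form simple-regression fit y ≈ w·x + c and its squared-error risk on that dataset. This code is this report's own sketch, not code from the paper.

```python
# Closed-form least-squares linear fit on the dataset quoted in the
# report (Section 3.2 example): X = [1, 2.5, 3], Y = [1, 3, 2].
X = [1.0, 2.5, 3.0]
Y = [1.0, 3.0, 2.0]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Optimal slope and intercept for min_{w,c} sum_i (w*x_i + c - y_i)^2.
w = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / \
    sum((x - x_bar) ** 2 for x in X)
c = y_bar - w * x_bar

# Residual sum of squares attained by the best linear predictor.
risk = sum((w * x + c - y) ** 2 for x, y in zip(X, Y))
print(w, c, risk)  # w = 9/13 ≈ 0.692, c = 0.5, risk = 25/26 ≈ 0.962
```

Any predictor that fits this dataset with squared error below 25/26 therefore strictly beats every linear model, which is the kind of separation the paper's examples establish analytically.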