Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
Authors: Jingfeng Wu, Difan Zou, Zixiang Chen, Vladimir Braverman, Quanquan Gu, Sham M. Kakade
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations. Furthermore, we empirically compare the performance of (GLM-tron) and (SGD) for ReLU regression with symmetric Bernoulli data. Simulation results are presented in Figure 1. In the well-specified setting, Figures 1(a) and 1(b) show that the excess risk of (GLM-tron) is no worse than that of (SGD), even when both algorithms are tuned with their respective hyperparameters (initial stepsizes). This verifies our Theorem 6.2. In the noiseless setting, Figure 1(c) clearly illustrates that (SGD) can converge to a critical point with constant risk, while (GLM-tron) successfully recovers the true parameter w*. This verifies our Theorem 6.3. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Johns Hopkins University 2Department of Computer Science, The University of Hong Kong 3Department of Computer Science, University of California, Los Angeles 4Department of Computer Science, Rice University 5Department of Computer Science and Department of Statistics, Harvard University. |
| Pseudocode | No | The paper describes the SGD and GLM-tron update rules with mathematical equations but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or links to code repositories for the described methodology. |
| Open Datasets | No | The paper uses synthetic data models such as the "symmetric Bernoulli distribution" and the "Gaussian distribution" for its simulations, specifying how data is generated (e.g., "P{x = e_i} = P{x = -e_i} = λ_i/2"). It does not use or provide access to a pre-existing public dataset. |
| Dataset Splits | No | The paper does not explicitly mention training, validation, or test dataset splits. It discusses using N i.i.d. samples for algorithms and evaluating excess risk. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (CPU, GPU models, memory, etc.) used to run the experiments or simulations. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions or other libraries). |
| Experiment Setup | Yes | For each algorithm and each sample size, we do a grid search on the initial stepsize γ0 ∈ {0.5, 0.25, 0.1, 0.075, 0.05, 0.025, 0.01} and report the best excess risk. The plots are averaged over 20 independent runs. |
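Since the paper states the update rules only as equations and releases no code, the simulation setup can be reconstructed as a sketch. The following is a minimal, hedged reconstruction (not the authors' implementation) of the symmetric Bernoulli data model P{x = e_i} = P{x = -e_i} = λ_i/2 and the two single-step updates: GLM-tron omits the ReLU derivative in its update, while SGD on the squared loss includes it, which is the mechanism behind SGD stalling at a critical point in the noiseless setting.

```python
import numpy as np

def relu(z):
    """ReLU activation."""
    return np.maximum(z, 0.0)

def sample_symmetric_bernoulli(lam, n, rng):
    """Draw n samples with P{x = e_i} = P{x = -e_i} = lam[i] / 2,
    where lam is a probability vector over the d coordinates."""
    d = len(lam)
    idx = rng.choice(d, size=n, p=lam)        # which basis vector
    signs = rng.choice([-1.0, 1.0], size=n)   # symmetric sign flip
    X = np.zeros((n, d))
    X[np.arange(n), idx] = signs
    return X

def glmtron_step(w, x, y, gamma):
    """GLM-tron update: no derivative factor on the link function."""
    return w + gamma * (y - relu(w @ x)) * x

def sgd_step(w, x, y, gamma):
    """SGD on the squared loss: includes the factor relu'(w @ x),
    which is zero whenever w @ x <= 0."""
    grad_factor = (relu(w @ x) - y) * float(w @ x > 0)
    return w - gamma * grad_factor * x
```

Starting from w = 0, `sgd_step` leaves the iterate unchanged (the ReLU derivative vanishes), whereas `glmtron_step` still makes progress toward the target, illustrating the qualitative gap the paper's Figure 1(c) reports.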
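The stated experiment setup (grid search over initial stepsizes, best excess risk reported, 20 independent runs) can be sketched as follows. `run_algorithm` is a hypothetical callable standing in for one training run of either GLM-tron or SGD; the averaging-then-best order is one reasonable reading of the paper's description, not a confirmed detail.

```python
import numpy as np

# Grid of initial stepsizes from the paper's experiment setup
STEPSIZE_GRID = [0.5, 0.25, 0.1, 0.075, 0.05, 0.025, 0.01]
N_RUNS = 20  # independent runs per configuration

def tune_and_evaluate(run_algorithm, sample_size, rng):
    """For one algorithm and one sample size, grid-search the initial
    stepsize and return the best mean excess risk over N_RUNS runs.

    run_algorithm(n, gamma0, seed) is a hypothetical callable that
    returns the excess risk of a single independent run."""
    best = np.inf
    for gamma0 in STEPSIZE_GRID:
        risks = [run_algorithm(sample_size, gamma0, int(rng.integers(2**31)))
                 for _ in range(N_RUNS)]
        best = min(best, float(np.mean(risks)))
    return best
```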