Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Authors: Mo Zhou, Rong Ge
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper describes synthetic experiments: "In this section, we run synthetic experiments to verify our theoretical results. We choose d from 100 to 10^6 and set n = 3 sqrt(d)." |
| Researcher Affiliation | Academia | Department of Computer Science, Duke University, Durham, NC, US. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete statement or link regarding the availability of source code for the methodology described. |
| Open Datasets | No | The paper states: "data x_i ~ N(0, I) sampled from Gaussian distribution". This indicates synthetic data generation, but no link, DOI, or formal citation for a publicly available dataset is provided. |
| Dataset Splits | No | The paper describes synthetic data generation and mentions "training loss" and "test loss" but does not specify explicit training, validation, or test dataset splits (e.g., percentages, counts, or predefined splits). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions "Lasso (implemented in sklearn)" but does not provide specific version numbers for sklearn or any other software dependencies, which is necessary for reproducibility. |
| Experiment Setup | Yes | The paper states: "We set lambda = 100 d/sigma * n log(n) (sqrt(log(d)/n) + sqrt(n/d)) and run gradient descent with step size eta = 10^-6 until the training loss reaches 10^-4." A minimal sketch of this setup appears after the table. |
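
The sketch below illustrates the setup quoted in the table: x_i ~ N(0, I), n = 3 sqrt(d), Lasso via sklearn, and gradient descent with step size 10^-6 stopped once the training loss falls to 10^-4. Everything else is an assumption made only for illustration and is not taken from the paper: the sparsity level `k`, noise level `sigma`, ground-truth magnitudes, the Hadamard reparameterization theta = u*u - v*v, the initialization scale, the iteration budget, and the Lasso `alpha`. The role of the quoted lambda is not made explicit in this section, so it is only mirrored, not used.

```python
# Minimal sketch of the synthetic experiment described in the table above.
# Quantities marked "assumed" are illustrative choices, not the authors' values.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Dimensions: the paper sweeps d from 100 to 10^6 and sets n = 3*sqrt(d);
# d is kept small here so the sketch runs quickly.
d = 1_000
n = int(3 * np.sqrt(d))

# Assumed quantities (not given in this section).
k, sigma, init, max_iters = 5, 0.5, 1e-3, 200_000

# Synthetic data as quoted: x_i ~ N(0, I), responses from a k-sparse ground truth.
theta_star = np.zeros(d)
theta_star[rng.choice(d, size=k, replace=False)] = 1.0
X = rng.standard_normal((n, d))
y = X @ theta_star + sigma * rng.standard_normal(n)
X_test = rng.standard_normal((n, d))
y_test = X_test @ theta_star + sigma * rng.standard_normal(n)

# Lambda as quoted in the table, grouping read literally; its exact role in the
# pipeline is not specified in this section, so it is only computed for reference.
lam = 100 * d / sigma * n * np.log(n) * (np.sqrt(np.log(d) / n) + np.sqrt(n / d))

# Baseline: Lasso as implemented in sklearn. The alpha below is a conventional
# theoretical rate used as a placeholder, not the authors' exact choice.
lasso = Lasso(alpha=sigma * np.sqrt(np.log(d) / n), fit_intercept=False, max_iter=10_000)
lasso.fit(X, y)

# Gradient descent with step size 1e-6, stopped once the training loss is <= 1e-4,
# on an assumed Hadamard reparameterization theta = u*u - v*v (a common way to get
# a sparse implicit bias; the paper's exact parameterization may differ).
# The iteration budget and step size may need tuning to actually hit the threshold.
eta, tol = 1e-6, 1e-4
u = np.full(d, init)
v = np.full(d, init)
for _ in range(max_iters):
    theta = u * u - v * v
    residual = X @ theta - y
    train_loss = 0.5 * np.mean(residual ** 2)
    if train_loss <= tol:
        break
    grad_theta = X.T @ residual / n   # gradient of the squared loss w.r.t. theta
    u -= eta * 2 * u * grad_theta     # chain rule through theta = u*u - v*v
    v += eta * 2 * v * grad_theta

gd_test = np.mean((X_test @ theta - y_test) ** 2)
lasso_test = np.mean((X_test @ lasso.coef_ - y_test) ** 2)
print(f"train loss {train_loss:.3e} | GD test loss {gd_test:.3e} | Lasso test loss {lasso_test:.3e}")
```

Reproducing the paper's plots would wrap this in a loop over d from 100 to 10^6 and compare the two test losses at each scale; the Hadamard reparameterization is included only because it is a standard way for plain gradient descent to exhibit an l1-like implicit bias in sparse linear regression.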