Characterizing Implicit Bias in Terms of Optimization Geometry

Authors: Suriya Gunasekar, Jason Lee, Daniel Soudry, Nathan Srebro

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The empirical results for this problem in Figure 1c clearly show that even for ℓp norms where ‖·‖p² is smooth and strongly convex, the corresponding steepest descent converges to a global minimum that depends on the step size. Figure 1: Dependence of implicit bias on step size and momentum: In (a)–(c)... (a) Mirror descent with primal momentum (Example 2): the global minimum that eq. (8) converges to depends on the momentum parameters; the sub-plots contain the trajectories of eq. (8) for different choices of βt = β and γt = γ; (b) Natural gradient descent (Example 3): for different step sizes ηt = η, eq. (9) converges to different global minima. Here, η was chosen to be small enough to ensure w(t) ∈ dom(ψ).
Researcher Affiliation | Academia | ¹TTI Chicago, USA; ²USC, Los Angeles, USA; ³Technion, Israel. Correspondence to: Suriya Gunasekar <suriya@ttic.edu>, Jason Lee <jasonlee@marshall.usc.edu>, Daniel Soudry <daniel.soudry@gmail.com>, Nathan Srebro <nati@ttic.edu>.
Pseudocode | No | The paper describes algorithms using mathematical equations (e.g., eq. 3, 4, 5, 9, 11, 13) but does not include any blocks explicitly labeled as "Pseudocode" or "Algorithm".
Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the methodology, nor a link to a code repository.
Open Datasets | No | The paper uses simple, illustrative datasets for its examples, such as "dataset {(x1 = [1, 2], y1 = 1)}" and "dataset {(x1 = [1, 1, 1], y1 = 1), (x2 = [1, 2, 0], y2 = 10)}", but does not provide concrete access information (link, DOI, formal citation) for any publicly available or open dataset.
Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or citations to predefined splits) for training, validation, or testing; the examples use small, custom-defined data points for theoretical demonstration.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory) used to run its examples or experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with version numbers) needed to replicate the demonstrations or experiments.
Experiment Setup | Yes | Figure 1: Dependence of implicit bias on step size and momentum: In (a)–(c), the blue line denotes the set G of global minima for the respective examples... (a) Mirror descent with primal momentum (Example 2): the global minimum that eq. (8) converges to depends on the momentum parameters; the sub-plots contain the trajectories of eq. (8) for different choices of βt = β and γt = γ; (b) Natural gradient descent (Example 3): for different step sizes ηt = η, eq. (9) converges to different global minima. Here, η was chosen to be small enough to ensure w(t) ∈ dom(ψ). (c) Steepest descent w.r.t. ‖·‖4/3 (Example 4): the global minimum to which eq. (11) converges depends on η. Here w(0) = [0, 0, 0]...
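The step-size dependence quoted above (Example 4, steepest descent w.r.t. ‖·‖4/3) is easy to reproduce in a few lines. The sketch below is an illustrative reconstruction, not the paper's code: it applies the standard closed-form ℓp steepest-descent step, Δw = −η‖g‖q·u with q = p/(p−1) the dual exponent and u the unit-‖·‖p maximizer of ⟨g, u⟩, to the two-point dataset quoted in the Open Datasets row, starting from w(0) = [0, 0, 0] as in Figure 1c. The choice of step sizes (0.01 vs. 0.15) and iteration count are assumptions for the demo.

```python
import numpy as np

def lp_steepest_step(g, p, eta):
    """Steepest-descent step w.r.t. the l_p norm: the direction u maximizing
    <g, u> over ||u||_p <= 1, scaled by the dual norm ||g||_q, q = p/(p-1)."""
    q = p / (p - 1.0)
    gq = np.linalg.norm(g, ord=q)
    if gq == 0.0:
        return np.zeros_like(g)
    u = np.sign(g) * np.abs(g) ** (q - 1) / gq ** (q - 1)
    return -eta * gq * u

def run(eta, p=4 / 3, iters=50_000):
    # Two-point underdetermined dataset quoted in the report (Example 4).
    X = np.array([[1.0, 1.0, 1.0], [1.0, 2.0, 0.0]])
    y = np.array([1.0, 10.0])
    w = np.zeros(3)                      # w(0) = [0, 0, 0]
    for _ in range(iters):
        g = X.T @ (X @ w - y)            # gradient of 0.5 * ||Xw - y||^2
        w = w + lp_steepest_step(g, p, eta)
    return w

w_small, w_large = run(eta=0.01), run(eta=0.15)
# Both runs drive the loss to (numerical) zero, i.e. both reach global
# minima, yet the limit points differ: the implicit bias depends on eta.
```

Both limits satisfy Xw = y to numerical precision, but they are different points on the solution set G, mirroring Figure 1c's observation that for ℓp steepest descent the global minimum reached depends on the step size.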