reproducibilityindex.ai

Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems

Authors: Itay Safran, Ohad Shamir

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we empirically verify the phenomenon that our theory predicts by running simulations on the constructions used in Equations (1) and (2).5 Our lower bounds show that to beat withreplacement on those problem instances, the number of epochs must be in the order of magnitude of the condition number. However, since our analysis is asymptotic in its nature, ignoring constants and logarithmic terms, it is not clear to what extent this phenomenon can be observed in practice. To investigate this, we averaged the performance of 100 SGD instantiations over various values of k ranging from k = 40 up to k = 2, 000, where for each value a suitable step size of η = log(nk) / (λnk) was chosen. Our problem parameters were chosen to satisfy n = 500, G = 1, λ = 1 and λmax = 200. Note that this implies that the condition number equals 200 in the constructed problem instances. Each SGD instantiation was initialized from x0 = G/(2λ), G/(2λmax), which satisfies both Assumptions 1 and 2. We used Python 3.6 in our code, which is freely available at https://github.com/Itay Safran/SGD_condition_number. Our code was run on a single machine with an Intel Core i7-8550U 1.80GHz processor.
Researcher Affiliation	Academia	Itay Safran Princeton University isafran@princeton.edu Ohad Shamir Weizmann Institute of Science ohad.shamir@weizmann.ac.il
Pseudocode	No	The paper describes algorithms and mathematical formulations but does not include a distinct pseudocode block or algorithm listing.
Open Source Code	Yes	We used Python 3.6 in our code, which is freely available at https://github.com/Itay Safran/SGD_condition_number.
Open Datasets	No	The paper does not use traditional datasets but rather mathematical constructions for its experiments, based on equations (1) and (2). Thus, there's no externally available dataset to link.
Dataset Splits	No	The paper describes the number of SGD instantiations and parameter choices for its simulations but does not specify training, validation, or test data splits, as it uses constructed functions rather than empirical datasets with explicit splits.
Hardware Specification	Yes	Our code was run on a single machine with an Intel Core i7-8550U 1.80GHz processor.
Software Dependencies	Yes	We used Python 3.6 in our code
Experiment Setup	Yes	To investigate this, we averaged the performance of 100 SGD instantiations over various values of k ranging from k = 40 up to k = 2, 000, where for each value a suitable step size of η = log(nk) / (λnk) was chosen. Our problem parameters were chosen to satisfy n = 500, G = 1, λ = 1 and λmax = 200. Each SGD instantiation was initialized from x0 = G/(2λ), G/(2λmax).