Permutation-Based SGD: Is Random Optimal?

Authors: Shashank Rajput, Kangwook Lee, Dimitris Papailiopoulos

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We summarize FLIPFLOP's convergence rates in Table 1 and report the results of numerical verification in Section 6.2.
Researcher Affiliation | Academia | Shashank Rajput, Kangwook Lee, Dimitris Papailiopoulos (University of Wisconsin-Madison)
Pseudocode | Yes | Algorithm 1: Permutation-based SGD variants
Open Source Code | Yes | The code for all the experiments can be found at https://github.com/shashankrajput/flipflop.
Open Datasets | No | We randomly sample n = 800 points from a 100-dimensional sphere. Let the points be $x_i$ for $i = 1, \dots, n$. Then, their mean is the solution to the quadratic problem $\arg\min_x F(x) = \frac{1}{n}\sum_{i=1}^{n} \|x - x_i\|^2$, which we solve using the given algorithms.
Dataset Splits | No | The paper does not provide details about training, validation, or test splits; the experiments optimize analytically defined functions rather than supervised learning tasks with predefined splits.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We set n = 800, so that $n \ll K$ and hence the higher-order terms of K dominate the convergence rates. Note that both axes are in logarithmic scale. [...] with step size $\alpha = \frac{10 \log(nK)}{\mu n K}$ (Theorem 4).
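The Pseudocode row above refers to Algorithm 1, the paper's permutation-based SGD variants. As a minimal sketch of what such variants look like, the snippet below implements three epoch-ordering schemes: a fixed incremental order, random reshuffling (a fresh permutation per epoch), and a FlipFlop-style scheme that follows each random permutation with its reverse in the next epoch. The function names, the gradient-oracle interface, and the exact FlipFlop schedule are assumptions for illustration; the repository linked in the table contains the authors' actual implementation.

```python
import numpy as np

def epoch_order(scheme, epoch, n, rng, prev_perm=None):
    """Index order for one epoch under a given permutation scheme.

    "IG": fixed incremental order 0..n-1 every epoch.
    "RR": a fresh random permutation every epoch (random reshuffling).
    "FF": FlipFlop-style -- a random permutation on even epochs and its
          reverse on the following odd epoch (assumed schedule).
    """
    if scheme == "IG":
        return np.arange(n)
    if scheme == "RR":
        return rng.permutation(n)
    if scheme == "FF":
        return rng.permutation(n) if epoch % 2 == 0 else prev_perm[::-1]
    raise ValueError(f"unknown scheme: {scheme}")

def permutation_sgd(grad_i, x0, n, epochs, lr, scheme="FF", seed=0):
    """One pass over all n component gradients per epoch, visited in the scheme's order."""
    rng = np.random.default_rng(seed)
    x, perm = np.array(x0, dtype=float), None
    for k in range(epochs):
        perm = epoch_order(scheme, k, n, rng, prev_perm=perm)
        for i in perm:
            x = x - lr * grad_i(x, i)
    return x
```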
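The Open Datasets and Experiment Setup rows describe the synthetic quadratic used in the paper's numerical verification: n = 800 points sampled from a 100-dimensional sphere, whose mean minimizes $F(x) = \frac{1}{n}\sum_{i=1}^{n}\|x - x_i\|^2$, with step size $\alpha = \frac{10 \log(nK)}{\mu n K}$. The usage example below reuses the `permutation_sgd` helper from the previous sketch; the number of epochs K and the value µ = 2 (the strong-convexity constant of each component $\|x - x_i\|^2$) are illustrative assumptions, not values taken from the paper's code.

```python
import numpy as np

# Synthetic setup: n = 800 points on a 100-dimensional sphere, whose mean
# minimizes F(x) = (1/n) sum_i ||x - x_i||^2.
n, d = 800, 100
K = 2000                               # number of epochs; an illustrative choice
rng = np.random.default_rng(0)
pts = rng.normal(size=(n, d))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

# Step size alpha = 10 log(nK) / (mu n K) (Theorem 4). mu = 2 is assumed here,
# since each component f_i(x) = ||x - x_i||^2 is 2-strongly convex.
mu = 2.0
alpha = 10 * np.log(n * K) / (mu * n * K)

grad_i = lambda x, i: 2.0 * (x - pts[i])   # gradient of f_i

x_hat = permutation_sgd(grad_i, np.zeros(d), n, K, alpha, scheme="FF")
print("distance to the true mean:", np.linalg.norm(x_hat - pts.mean(axis=0)))
```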