Can Gaussian Sketching Converge Faster on a Preconditioned Landscape?

Authors: Yilong Wang, Haishan Ye, Guang Dai, Ivor Tsang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, our experimental results substantiate the effectiveness and efficiency of our algorithm. This section is dedicated to the empirical validation of our algorithm's effectiveness and superiority. Our experiments will focus on the quadratic minimization problem, whose objective function adheres to the form delineated in Eq. (1), characterized by $\min_{x \in \mathbb{R}^d} F(x) = \frac{1}{2} x^\top M x - b^\top x + \varphi(x)$
Researcher Affiliation | Collaboration | Yilong Wang (1, 2), Haishan Ye (1, 3), Guang Dai (3), Ivor W. Tsang (4, 5). (1) Center for Intelligent Decision-Making and Machine Learning, School of Management, Xi'an Jiaotong University, China. (2) This work was completed during the internship at SGIT AI Lab, State Grid Corporation of China. (3) SGIT AI Lab, State Grid Corporation of China. (4) CFAR and IHPC, Agency for Science, Technology and Research (A*STAR), Singapore. (5) College of Computing and Data Science, NTU, Singapore.
Pseudocode | Yes | Algorithm 1 GSGD: Gaussian Sketched Gradient Descent. Initialize: $x_0$, $h_0 \sim N(0, I_d)$, stepsize $\eta > 0$. for $k = 0, 1, \dots$ do: sample $u \sim N(0, I_d)$; $g_k = h_k + u u^\top (\nabla f(x_k) - h_k)$; $h_{k+1} = h_k + \frac{u u^\top}{d+2} (\nabla f(x_k) - h_k)$; $x_{k+1} = \mathrm{Prox}_{\alpha\varphi}(x_k - \eta g_k)$; end for.
Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology.
Open Datasets | No | The paper mentions using the 'Appliances Energy Prediction' dataset but does not provide a direct link, DOI, repository name, or formal citation for accessing it.
Dataset Splits | No | The paper does not specify exact train/validation/test split percentages or sample counts for the datasets used in the experiments.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU/CPU models or memory specifications.
Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or solvers used in the experiments.
Experiment Setup | No | The paper mentions 'properly choose the step sizes' and that they 'should be proportional to O(1/tr(M)) and O(1/(dλ(M)))' but does not provide specific numerical values for hyperparameters or other detailed training configurations.
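
Since no official code is available, the following is a minimal, unofficial sketch of the Algorithm 1 update quoted in the Pseudocode row, applied to the quadratic objective from the Research Type row. Everything beyond the three update formulas is an assumption made for illustration: the regularizer is taken to be φ(x) = lam * ||x||_1 (so the proximal step is soft-thresholding), the prox parameter α is set equal to η, M is a small diagonal matrix, and the step size is a hand-picked small constant rather than a value from the paper. The full gradient is computed for simplicity even though the update only uses the scalar u^T(∇f(x_k) - h_k). The sketch uses only the Python standard library.

```python
import random


def matvec(M, x):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


def dot(a, b):
    return sum(a_i * b_i for a_i, b_i in zip(a, b))


def objective(M, b, lam, x):
    """F(x) = 1/2 x^T M x - b^T x + lam * ||x||_1 (the L1 term is an assumed phi)."""
    return 0.5 * dot(x, matvec(M, x)) - dot(b, x) + lam * sum(abs(x_i) for x_i in x)


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1, applied coordinate-wise."""
    return [max(abs(z_i) - t, 0.0) * (1.0 if z_i > 0 else -1.0) for z_i in z]


def gsgd(M, b, lam, x0, eta, num_iters, seed=0):
    """Sketch of the GSGD iteration; assumes alpha = eta in the prox step."""
    rng = random.Random(seed)
    d = len(x0)
    x = list(x0)
    h = [rng.gauss(0.0, 1.0) for _ in range(d)]  # h_0 ~ N(0, I_d)
    for _ in range(num_iters):
        u = [rng.gauss(0.0, 1.0) for _ in range(d)]  # u ~ N(0, I_d)
        grad = [mx_i - b_i for mx_i, b_i in zip(matvec(M, x), b)]  # grad f = Mx - b
        # Scalar u^T (grad f(x_k) - h_k); u u^T v equals u * (u^T v).
        s = dot(u, [g_i - h_i for g_i, h_i in zip(grad, h)])
        g = [h_i + s * u_i for h_i, u_i in zip(h, u)]            # g_k
        h = [h_i + s * u_i / (d + 2) for h_i, u_i in zip(h, u)]  # h_{k+1}
        x = soft_threshold([x_i - eta * g_i for x_i, g_i in zip(x, g)], eta * lam)
    return x


# Hypothetical toy problem: M = diag(1, ..., 5), b = all ones.
d = 5
M = [[float(i + 1) if i == j else 0.0 for j in range(d)] for i in range(d)]
b = [1.0] * d
lam = 0.1
x0 = [5.0] * d
eta = 0.005  # hand-picked; small relative to 1/tr(M) and 1/(d * max eigenvalue)
x_final = gsgd(M, b, lam, x0, eta, num_iters=5000)
print("F(x0) =", objective(M, b, lam, x0))
print("F(x_final) =", objective(M, b, lam, x_final))
```

Comparing the two printed objective values gives a quick sanity check that the iteration makes progress on this toy instance; it says nothing about the paper's reported experiments, whose hyperparameters are not specified.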