Accelerating SGD for Highly Ill-Conditioned Huge-Scale Online Matrix Completion

Authors: Jialun Zhang, Hong-Ming Chiu, Richard Y. Zhang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments, we observe a similar acceleration for item-item collaborative filtering on the MovieLens 25M dataset via a pairwise ranking loss, with 100 million training pairs and 10 million testing pairs.
Researcher Affiliation | Academia | Gavin Zhang, University of Illinois at Urbana-Champaign, jialun2@illinois.edu; Hong-Ming Chiu, University of Illinois at Urbana-Champaign, hmchiu2@illinois.edu; Richard Y. Zhang, University of Illinois at Urbana-Champaign, ryz@illinois.edu
Pseudocode | No | The paper describes the update equations for SGD and ScaledSGD (equations 1 and 2) but does not provide formal pseudocode blocks or algorithms labeled as such.
Open Source Code | Yes | See supporting code at https://github.com/Hong-Ming/ScaledSGD. The code for all experiments is available at https://github.com/Hong-Ming/ScaledSGD.
Open Datasets | Yes | In our experiments, we observe a similar acceleration for item-item collaborative filtering on the MovieLens 25M dataset via a pairwise ranking loss, with 100 million training pairs and 10 million testing pairs.
Dataset Splits | No | The paper specifies training and testing sets (100 million training pairs and 10 million testing pairs for the MovieLens 25M dataset) but does not mention a separate validation set or its size.
Hardware Specification | No | The paper does not describe the hardware (e.g., specific GPU/CPU models, memory) used to run the experiments; it only discusses the computational complexity of the algorithms.
Software Dependencies | No | The paper does not specify any software names with version numbers for reproducibility (e.g., Python, PyTorch, TensorFlow, specific libraries, or solvers).
Experiment Setup | Yes | All of our experiments use random Gaussian initializations and an initial P = σ²I. Matrix completion with RMSE loss... Step size = 0.3. The CF model is trained using the Bayesian Personalized Ranking (BPR) loss [1] on a training set, which consists of 100 million pairwise samples in M.
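Although the paper gives no formal pseudocode, the per-entry SGD and ScaledSGD updates it refers to can be sketched as below. This is a hedged reconstruction, not the authors' implementation: it assumes a symmetric model M ≈ XXᵀ with RMSE loss on single observed entries, and the ScaledGD-style right-preconditioner P ≈ (XᵀX)⁻¹; the function names sgd_step and scaled_sgd_step are illustrative.

```python
import numpy as np

def sgd_step(X, i, j, M_ij, lr):
    """Plain SGD on one observed entry (i, j) of a symmetric model M ≈ X X^T.
    Updates only rows i and j of the n×r factor X (gradient up to a constant)."""
    resid = X[i] @ X[j] - M_ij          # scalar residual on this entry
    gi = resid * X[j]                   # gradient w.r.t. row i
    gj = resid * X[i]                   # gradient w.r.t. row j
    X[i] -= lr * gi
    X[j] -= lr * gj
    return X

def scaled_sgd_step(X, P, i, j, M_ij, lr):
    """ScaledSGD-style step (sketch): right-precondition each row gradient
    by an r×r matrix P ≈ (X^T X)^{-1}, so the extra per-step cost is O(r^2),
    independent of the number of rows n."""
    resid = X[i] @ X[j] - M_ij
    gi = resid * X[j]
    gj = resid * X[i]
    X[i] -= lr * (gi @ P)               # preconditioned update for row i
    X[j] -= lr * (gj @ P)               # preconditioned update for row j
    return X
```

The point of the preconditioner is that only two rows of X change per step, so the O(r²) cost of applying P keeps the method viable at huge scale; in an online setting P itself can be maintained cheaply rather than recomputed from scratch.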