A Comprehensively Tight Analysis of Gradient Descent for PCA

Authors: Zhiqiang Xu, Ping Li

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper states "Experiments are conducted to confirm our findings as well." and "The purpose of the experimental study for corroborating our findings in above sections is twofold."
Researcher Affiliation | Industry | Zhiqiang Xu, Ping Li; Cognitive Computing Lab, Baidu Research; No. 10 Xibeiwang East Road, Beijing 100193, China; 10900 NE 8th St., Bellevue, Washington 98004, USA; {xuzhiqiang04,liping11}@baidu.com
Pseudocode | Yes | Algorithm 1: VR-PCA (Oja); Algorithm 2: VR-PCA (Krasulina). (An illustrative sketch of the Oja-style variance-reduced update appears at the end of this section.)
Open Source Code | No | The paper states "We implemented the PGD with η = 1/ρ, η_t = 0.6/(x_tᵀAx_t), 1/(x_tᵀAx_t), 1.6/(x_tᵀAx_t), and RGD with step-size schemes η = 1/(λ_1 - λ_n), η_t = 0.6/(x_tᵀAx_t), 1/(x_tᵀAx_t), 1.6/(x_tᵀAx_t), in MATLAB." but does not provide a public link or an explicit statement about releasing the code for the described methodology.
Open Datasets | Yes | The paper experiments on two real datasets, Schenk (used in [19, 5]) and GHS_indef, from https://sparse.tamu.edu/ (footnote 4), as well as the common PCA datasets summarized in Table 4 (MMILL, JW11, MNIST).
Dataset Splits | No | The paper does not provide training/validation/test dataset splits; no split percentages or counts are reported in the text.
Hardware Specification | Yes | Experiments were done on a laptop (dual-core 2.30GHz CPU and 8GB RAM).
Software Dependencies | No | The paper states "We implemented the PGD... in MATLAB." but does not provide a specific version number for MATLAB or any other software dependencies.
Experiment Setup | Yes | "We implemented the PGD with η = 1/ρ, η_t = 0.6/(x_tᵀAx_t), 1/(x_tᵀAx_t), 1.6/(x_tᵀAx_t), and RGD with step-size schemes η = 1/(λ_1 - λ_n), η_t = 0.6/(x_tᵀAx_t), 1/(x_tᵀAx_t), 1.6/(x_tᵀAx_t), in MATLAB. All the methods start from the same random initial point x_0 and run for T = 100 iterations. We use b = 100. Note that we only update the learning rate at the epoch level and keep it unchanged within each epoch, similar to the case of the computation of the full gradient." (An illustrative sketch of the adaptive step-size scheme appears right after the table.)
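
As a rough illustration of the step-size schemes quoted in the Experiment Setup row, the sketch below runs projected gradient descent (PGD) on the unit sphere to maximize xᵀAx, using the adaptive step size η_t = c/(x_tᵀAx_t) with c in {0.6, 1, 1.6}. It is a minimal NumPy sketch under assumed choices (the matrix A, the constant c, the random seed, and the usage example are not from the paper); the authors' implementation is in MATLAB and its details may differ.

```python
import numpy as np

def pgd_pca(A, c=1.0, T=100, seed=0):
    """Projected gradient ascent for the leading eigenvector of a symmetric
    matrix A, i.e. maximizing x^T A x over the unit sphere, with the adaptive
    step size eta_t = c / (x_t^T A x_t) (c in {0.6, 1, 1.6} in the quoted
    setup). Initialization and iteration count are illustrative choices."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(A.shape[0])
    x /= np.linalg.norm(x)          # random unit-norm start x_0
    for _ in range(T):
        Ax = A @ x
        eta = c / (x @ Ax)          # eta_t = c / (x_t^T A x_t)
        x = x + eta * Ax            # ascent step along A x_t (gradient of x^T A x up to a factor of 2)
        x /= np.linalg.norm(x)      # project back onto the unit sphere
    return x

# Illustrative usage on a small synthetic covariance matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((200, 50))
A = M.T @ M / 200
x = pgd_pca(A, c=1.0, T=100)
print("Rayleigh quotient:", x @ A @ x, "top eigenvalue:", np.linalg.eigvalsh(A)[-1])
```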
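
For the pseudocode row ("Algorithm 1 VR-PCA (Oja)"), the following is a minimal sketch of an Oja-style VR-PCA epoch structure, i.e. SVRG-like variance reduction for the leading eigenvector of the sample covariance A = (1/n) Σ_i a_i a_iᵀ. The step size eta, the epoch length m, the number of epochs, and the data layout are assumptions for illustration; the paper's Algorithm 1 may differ in its exact update and parameter choices.

```python
import numpy as np

def vr_pca_oja(data, eta, epochs=10, m=None, seed=0):
    """Oja-style VR-PCA sketch: data is an (n, d) array whose rows a_i define
    the covariance A = (1/n) data^T data. The full product A @ w_tilde is
    recomputed once per epoch; inner steps use a variance-reduced stochastic
    direction and renormalize after every update."""
    n, d = data.shape
    m = m or n                              # inner iterations per epoch (assumption)
    rng = np.random.default_rng(seed)
    w_tilde = rng.standard_normal(d)
    w_tilde /= np.linalg.norm(w_tilde)      # random unit-norm snapshot
    for _ in range(epochs):
        u = data.T @ (data @ w_tilde) / n   # full "gradient" A @ w_tilde, once per epoch
        w = w_tilde.copy()
        for _ in range(m):
            i = rng.integers(n)
            a = data[i]
            # variance-reduced direction: a_i a_i^T (w - w_tilde) + A w_tilde
            g = a * (a @ (w - w_tilde)) + u
            w = w + eta * g                 # Oja-style step
            w /= np.linalg.norm(w)          # renormalize to the unit sphere
        w_tilde = w                         # snapshot for the next epoch
    return w_tilde
```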