
Single Pass PCA of Matrix Products

Authors: Shanshan Wu, Srinadh Bhojanapalli, Sujay Sanghavi, Alexandros G. Dimakis

NeurIPS 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In addition, we also provide results from an Apache Spark implementation that shows better computational and statistical performance on real-world and synthetic evaluation datasets.
Researcher Affiliation	Academia	Shanshan Wu The University of Texas at Austin shanshan@utexas.edu Srinadh Bhojanapalli Toyota Technological Institute at Chicago srinadh@ttic.edu Sujay Sanghavi The University of Texas at Austin sanghavi@mail.utexas.edu Alexandros G. Dimakis The University of Texas at Austin dimakis@austin.utexas.edu
Pseudocode	Yes	Algorithm 1 SMP-PCA: Streaming Matrix Product PCA
Open Source Code	Yes	The source code is available at [18]. [18] S. Wu, S. Bhojanapalli, S. Sanghavi, and A. Dimakis. Github repository for "single-pass pca of matrix products". https://github.com/wushanshan/MatrixProductPCA, 2016.
Open Datasets	Yes	We test our algorithm on synthetic datasets and three real datasets: SIFT10K [9], NIPS-BW [11], and URL-reputation [12].
Dataset Splits	No	The paper mentions using specific datasets (SIFT10K, NIPS-BW, URL-reputation) but does not provide explicit details on how these datasets were split into training, validation, or test sets, nor does it specify proportions or sample counts for each split.
Hardware Specification	Yes	using a 150GB synthetic dataset on m3.2xlarge Amazon EC2 instances. ... (Footnote 6) Each machine has 8 cores, 30GB memory, and 2 80GB SSD.
Software Dependencies	Yes	We implement our SMP-PCA in Apache Spark 1.6.2 [19].
Experiment Setup	Yes	For all rest experiments, unless otherwise specified, we set r = 5, T = 10, and m as 4nr log n.
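
To make the quoted default setup concrete, the following is a minimal sketch that evaluates m = 4nr log n for a given n, with r = 5 and T = 10 as defaults. The function name default_setup is hypothetical and not taken from the paper's repository. The base of the logarithm is not stated in this summary; the natural log is assumed here, and the result is rounded up to an integer, which is also an assumption.

```python
import math


def default_setup(n: int, r: int = 5, T: int = 10) -> dict:
    """Return the default parameters quoted in the Experiment Setup row.

    Hypothetical helper for illustration only.
    """
    if n < 2:
        # log n must be positive for m to be a meaningful count.
        raise ValueError("n must be at least 2")
    # m = 4 * n * r * log n; natural log and rounding up are assumptions.
    m = math.ceil(4 * n * r * math.log(n))
    return {"r": r, "T": T, "m": m}


if __name__ == "__main__":
    # Example: the value of m for n = 1000 with the default r and T.
    print(default_setup(1000))
```

Under these assumptions m grows as n log n for fixed r, so it stays well below n squared for large n. If the paper intends a different log base, m changes by a constant factor.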