reproducibilityindex.ai

Principal Component Projection Without Principal Component Analysis

Authors: Roy Frostig, Cameron Musco, Christopher Musco, Aaron Sidford

ICML 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conclude with an empirical evaluation of PC-PROJ and RIDGE-PCR (Algorithms 1 and 2). Since PCR has already been justified as a statistical technique, we focus on showing that, with few iterations, our algorithm recovers an accurate approximation to A_λ^† b and P_{A_λ} y. We begin with synthetic data, which lets us control the spectral gap γ that dominates our iteration bounds (see Theorem 3.2). Data is generated randomly... As apparent in Figure 2, our algorithm performs very well for regression, even for small γ.
Researcher Affiliation	Collaboration	Roy Frostig (RF@CS.STANFORD.EDU), Stanford University; Cameron Musco (CNMUSCO@MIT.EDU) and Christopher Musco (CPMUSCO@MIT.EDU), MIT; Aaron Sidford (ASID@MICROSOFT.COM), Microsoft Research, New England
Pseudocode	Yes	Algorithm 1 (PC-PROJ) Principal component projection; Algorithm 2 (RIDGE-PCR) Ridge regression-based PCR
Open Source Code	No	The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets	Yes	Finally, we consider a 60K-point regression problem constructed from MNIST classification data (LeCun et al., 2015).
Dataset Splits	No	The paper mentions 'synthetic data' and 'MNIST classification data' but does not specify how these datasets were split into training, validation, or test sets for experiments.
Hardware Specification	No	The paper does not specify any hardware details (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments.
Software Dependencies	No	The paper does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions) used for the experiments.
Experiment Setup	Yes	Data is generated randomly by drawing top singular values uniformly from the range [.5(1 + γ), 1] and tail singular values from [0, .5(1 − γ)]. λ is set to .5 and A (500 rows, 200 columns) is formed via the SVD UΣV^T where U and V are random bases and Σ contains our random singular values. ... The MNIST principal component regression was run with λ = .01σ_1^2.
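
The synthetic setup quoted in the Experiment Setup row can be sketched in NumPy as below. This is an illustrative reconstruction, not the authors' code (the paper releases none). The excerpt does not say how many singular values count as "top", so `n_top` is a hypothetical parameter; drawing the random bases from the QR factorization of Gaussian matrices, the fixed seed, and the helper names are likewise assumptions. The last helper assumes σ_1 in the MNIST setting denotes the largest singular value of the data matrix.

```python
import numpy as np


def random_orthonormal(rows, cols, rng):
    """Return a rows x cols matrix with orthonormal columns (assumed construction)."""
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q


def make_synthetic(gamma, n_rows=500, n_cols=200, n_top=50, seed=0):
    """Build A = U diag(sigma) V^T with a spectral gap controlled by gamma.

    n_top is hypothetical: the quoted excerpt does not state how many
    singular values are drawn from the top range.
    """
    rng = np.random.default_rng(seed)
    # Top singular values: uniform on [.5(1 + gamma), 1].
    top = rng.uniform(0.5 * (1 + gamma), 1.0, size=n_top)
    # Tail singular values: uniform on [0, .5(1 - gamma)].
    tail = rng.uniform(0.0, 0.5 * (1 - gamma), size=n_cols - n_top)
    sigma = np.sort(np.concatenate([top, tail]))[::-1]
    U = random_orthonormal(n_rows, n_cols, rng)
    V = random_orthonormal(n_cols, n_cols, rng)
    # Scale the columns of U by sigma, then rotate by V^T.
    A = (U * sigma) @ V.T
    return A, sigma


def lambda_from_top_singular_value(A, frac=0.01):
    """Compute frac * sigma_1^2, with sigma_1 taken as the spectral norm of A."""
    sigma_1 = np.linalg.norm(A, 2)
    return frac * sigma_1 ** 2


A, sigma = make_synthetic(gamma=0.1)
lam_example = lambda_from_top_singular_value(A)
```

With `gamma=0.1`, no singular value of `A` falls in the open interval (0.45, 0.55), which is the gap the iteration bounds depend on.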