Rich Component Analysis

Authors: Rong Ge, James Zou

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We show how to integrate RCA with stochastic gradient descent into a meta-algorithm for learning general models, and demonstrate substantial improvement in accuracy on several synthetic and real datasets in both supervised and unsupervised tasks." From Section 5 (Experiments): "In the experiments, we focus on the contrastive learning setting where we are given observations of U = S1 + S2 and V = A S2 + S3. The goal is to estimate the parameters for the S1 distribution. Our approach can also learn the shared component S2 as well as S3. We tested our method in five settings, where S1 corresponds to: low-rank Gaussian (PCA), linear regression, mixture of Gaussians (GMM), logistic regression and the Ising model." (See the data-generation sketch after the table.)
Researcher Affiliation | Collaboration | "Rong Ge, RONGGE@CS.DUKE.EDU, Duke University, Computer Science Department, 308 Research Dr, Durham NC 27708; James Zou, JAMESYZOU@GMAIL.COM, Microsoft Research, One Memorial Dr, Cambridge MA 01239"
Pseudocode | Yes | "Algorithm 1 Find Linear" (see the linear-transform sketch after the table)
Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We applied RCA to a real dataset of DNA methylation biomarkers. Twenty biomarkers (10 test and 10 control) measured the DNA methylation level (a real number between 0 and 1) at twenty genomic loci across 686 individuals (Zou et al., 2014)."
Dataset Splits | No | The paper describes the datasets used and some experimental settings (e.g., "10 dimensional logistic model", "5-by-5 Ising model") but does not specify explicit training, validation, or test dataset splits (e.g., percentages or exact counts) for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud computing instance types) used for running the experiments.
Software Dependencies | No | The paper mentions general algorithms and methods like "stochastic gradient descent" and "EM" but does not specify any software dependencies or libraries with version numbers required for reproduction.
Experiment Setup | Yes | "In all five settings, we let S3 be sampled uniformly from [−1, 1]^d, where d is the dimension of S3. ... S1 was set to have a principal component along direction v1, i.e. s1 ∼ N(0, v1 v1^T + σ^2 I). S2 was sampled from Unif([−1, 1]^d) + v2 v2^T ... S1 is a mixture of d spherical Gaussians in R^d ... We use the 4th-order Chebyshev polynomial approximation to the SGD of logistic regression as in Section 4.2." (See the Chebyshev sketch after the table.)
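As context for the Research Type and Experiment Setup rows above, here is a minimal sketch of the quoted contrastive data-generation setup (U = S1 + S2, V = A S2 + S3) in NumPy. The dimension d, sample size n, noise level sigma, and the random mixing matrix A are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, sigma = 10, 5000, 0.5  # illustrative sizes, not from the paper

# Target component S1: Gaussian with one principal direction v1,
# i.e. s1 ~ N(0, v1 v1^T + sigma^2 I), as quoted in the setup row.
v1 = rng.standard_normal(d)
v1 /= np.linalg.norm(v1)
cov_s1 = np.outer(v1, v1) + sigma**2 * np.eye(d)
S1 = rng.multivariate_normal(np.zeros(d), cov_s1, size=n)

# Shared component S2 and independent noise S3; both drawn from
# Unif([-1, 1]^d) here (the paper also gives S2 a principal direction v2,
# which this sketch omits for brevity).
S2 = rng.uniform(-1.0, 1.0, size=(n, d))
S3 = rng.uniform(-1.0, 1.0, size=(n, d))

# Unknown linear transform A applied to the shared component in the
# second view (a hypothetical random A for illustration).
A = rng.standard_normal((d, d))

U = S1 + S2        # first view: target signal plus shared component
V = S2 @ A.T + S3  # second view: transformed shared component plus noise
```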
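The Pseudocode row cites "Algorithm 1 Find Linear", whose details the table does not reproduce. The sketch below is not the paper's algorithm; it only illustrates the simpler observation that, with independent zero-mean components, Cov(U, V) = Cov(S2) A^T, so A is recoverable when Cov(S2) is assumed known. The paper's actual procedure does not rely on that assumption.

```python
import numpy as np

def estimate_linear_transform(U, V, cov_s2):
    """Recover A from the cross-covariance of the two views.

    With U = S1 + S2 and V = A S2 + S3 and mutually independent,
    zero-mean components, Cov(U, V) = Cov(S2) @ A.T.  This is an
    illustration under the assumption that Cov(S2) is known, which
    the paper's Find Linear does not require.
    """
    n = U.shape[0]
    Uc = U - U.mean(axis=0)
    Vc = V - V.mean(axis=0)
    cross_cov = Uc.T @ Vc / n            # empirical Cov(U, V)
    A_T = np.linalg.solve(cov_s2, cross_cov)  # solve Cov(S2) A.T = Cov(U, V)
    return A_T.T

# Usage with the synthetic views above: S2 ~ Unif([-1, 1]^d) has
# covariance I/3, so:
#   A_hat = estimate_linear_transform(U, V, np.eye(d) / 3.0)
```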
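The Experiment Setup row mentions a 4th-order Chebyshev polynomial approximation to the SGD of logistic regression. Below is a sketch of that idea under assumed details: the sigmoid is fit by a degree-4 Chebyshev polynomial on an assumed working interval [−4, 4], and the polynomial surrogate replaces the sigmoid in the gradient. The interval, learning rate, and helper names are hypothetical, not from the paper.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Degree-4 Chebyshev least-squares fit of the sigmoid on an assumed
# working interval [-4, 4] (illustrative choice).
ts = np.linspace(-4.0, 4.0, 401)
cheb_sigmoid = C.Chebyshev.fit(ts, sigmoid(ts), deg=4)

def sgd_step(w, x, y, lr=0.1):
    """One logistic-regression SGD step using the polynomial surrogate.

    Because the surrogate is polynomial in w @ x, its expected gradient
    involves only low-order moments of x, which is what makes a
    moment-based correction of the gradient possible.
    """
    margin = w @ x
    grad = (cheb_sigmoid(margin) - y) * x  # surrogate of (sigmoid - y) * x
    return w - lr * grad
```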