Rich Component Analysis
Authors: Rong Ge, James Zou
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show how to integrate RCA with stochastic gradient descent into a meta-algorithm for learning general models, and demonstrate substantial improvement in accuracy on several synthetic and real datasets in both supervised and unsupervised tasks. (Section 5, Experiments:) In the experiments, we focus on the contrastive learning setting where we are given observations of U = S1 + S2 and V = AS2 + S3. The goal is to estimate the parameters for the S1 distribution. Our approach can also learn the shared component S2 as well as S3. We tested our method in five settings, where S1 corresponds to: low rank Gaussian (PCA), linear regression, mixture of Gaussians (GMM), logistic regression and the Ising model. |
| Researcher Affiliation | Collaboration | Rong Ge (RONGGE@CS.DUKE.EDU), Duke University, Computer Science Department, 308 Research Dr, Durham NC 27708; James Zou (JAMESYZOU@GMAIL.COM), Microsoft Research, One Memorial Dr, Cambridge MA 01239 |
| Pseudocode | Yes | Algorithm 1 Find Linear |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We applied RCA to a real dataset of DNA methylation biomarkers. Twenty biomarkers (10 test and 10 control) measured the DNA methylation level (a real number between 0 and 1) at twenty genomic loci across 686 individuals (Zou et al., 2014). |
| Dataset Splits | No | The paper describes the datasets used and some experimental settings (e.g., '10 dimensional logistic model', '5-by-5 Ising model') but does not specify explicit training, validation, or test dataset splits (e.g., percentages or exact counts) for reproduction. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory specifications, or cloud computing instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions general algorithms and methods like 'stochastic gradient descent' and 'EM' but does not specify any software dependencies or libraries with version numbers required for reproduction. |
| Experiment Setup | Yes | In all five settings, we let S3 be sampled uniformly from [−1, 1]^d, where d is the dimension of S3. ... S1 was set to have a principal component along direction v1, i.e. s1 ∼ N(0, v1v1^T + σ²I). S2 was sampled from Unif([−1, 1]^d) + v2v2^T... S1 is a mixture of d spherical Gaussians in R^d... We use the 4-th order Chebychev polynomial approximation to the SGD of logistic regression as in Section 4.2. |
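The contrastive setting quoted above (observations U = S1 + S2 and V = AS2 + S3, with S1 a low-rank Gaussian in the PCA case and S3 uniform on [−1, 1]^d) can be sketched as a data-generation script. This is a minimal illustration of the setup described in the Research Type and Experiment Setup rows, not the authors' code: the dimension, sample count, noise level σ, and mixing matrix A are all illustrative assumptions, and the S2 distribution is simplified to plain Unif([−1, 1]^d).

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 10, 1000    # dimension and number of samples (illustrative)
sigma = 0.1        # noise level for the low-rank Gaussian (illustrative)

# S1: low-rank Gaussian (the PCA setting), s1 ~ N(0, v1 v1^T + sigma^2 I)
v1 = rng.standard_normal(d)
v1 /= np.linalg.norm(v1)
cov1 = np.outer(v1, v1) + sigma**2 * np.eye(d)
S1 = rng.multivariate_normal(np.zeros(d), cov1, size=n)

# S2: shared component (simplified here to uniform noise);
# S3: sampled uniformly from [-1, 1]^d, as stated in the paper
S2 = rng.uniform(-1, 1, size=(n, d))
S3 = rng.uniform(-1, 1, size=(n, d))

# A: an arbitrary linear mixing matrix, unknown to the learner
A = rng.standard_normal((d, d))

# Observed datasets in the contrastive setting: U = S1 + S2, V = A S2 + S3
U = S1 + S2
V = S2 @ A.T + S3

print(U.shape, V.shape)  # both (1000, 10)
```

The learner sees only U and V; the goal stated in the paper is to recover the parameters of the S1 distribution (here, the direction v1) despite the shared structured noise S2.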