Approximating Sparse PCA from Incomplete Data
Authors: Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our algorithms extensively on image, text, biological and financial data. |
| Researcher Affiliation | Academia | Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY (all three authors): kundua2@rpi.edu, drinep@cs.rpi.edu, magdon@cs.rpi.edu. |
| Pseudocode | Yes | Algorithm 1 Hybrid (ℓ1, ℓ2)-Element Sampling |
| Open Source Code | No | No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found. |
| Open Datasets | Yes | Digit Data (m = 2313, n = 256): We use the [7] handwritten zip-code digit images (300 pixels/inch in 8-bit gray scale). Tech TC Data (m = 139, n = 15170): We use the Technion Repository of Text Categorization Dataset (Tech TC, see [6]) from the Open Directory Project (ODP). Gene Expression Data (m = 107, n = 22215): We use GSE10072 gene expression data for lung cancer from the NCBI Gene Expression Omnibus database. |
| Dataset Splits | No | No specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit mention of validation sets) were provided. |
| Hardware Specification | No | No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments were provided. |
| Software Dependencies | No | The paper mentions using the 'Spasm toolbox' but does not provide specific version numbers for Spasm or any other software dependencies like Matlab. |
| Experiment Setup | Yes | We sample approximately 7% of the elements from the centered data using (ℓ1, ℓ2)-sampling, as well as uniform sampling. The performance for small r is shown in Table 1, including the running time τ. For this data, f(Gmax,r)/f(Gsp,r) ≈ 0.23 (r = 10). We sample approximately 5% of the elements from the centered data using our (ℓ1, ℓ2)-sampling, as well as uniform sampling. |
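
The table refers to Algorithm 1 (Hybrid (ℓ1, ℓ2)-Element Sampling) and to sampling roughly 5-7% of the elements of the centered data, with uniform sampling as a baseline. This summary does not reproduce the algorithm, so the following is only a minimal sketch of one common form of hybrid element sampling, not the authors' implementation. It assumes that each entry is drawn with probability `alpha * |A_ij| / ||A||_1 + (1 - alpha) * A_ij^2 / ||A||_F^2`, that the `s` draws are i.i.d. with replacement, and that each sampled entry is rescaled by `1 / (s * p_ij)` so the sketch is unbiased. The function names, the mixing parameter `alpha`, and the `seed` argument are hypothetical.

```python
import numpy as np


def hybrid_l1_l2_sample(A, s, alpha=0.5, seed=0):
    """Sketch A by drawing s entries with a mix of l1 and l2 probabilities.

    Assumed (not taken from the summary above): the mixing weight `alpha`,
    sampling with replacement, and the 1 / (s * p_ij) rescaling.
    """
    A = np.asarray(A, dtype=float)
    abs_a = np.abs(A).ravel()
    sq_a = (A ** 2).ravel()
    l1_norm = abs_a.sum()
    fro_sq = sq_a.sum()
    if l1_norm == 0.0:
        raise ValueError("cannot sample from an all-zero matrix")

    # Convex combination of the l1 and l2 element distributions.
    p = alpha * abs_a / l1_norm + (1.0 - alpha) * sq_a / fro_sq
    p /= p.sum()  # guard against floating-point drift

    rng = np.random.default_rng(seed)
    idx = rng.choice(p.size, size=s, replace=True, p=p)

    # Accumulate rescaled entries; repeated indices add up.
    sketch = np.zeros(A.size)
    np.add.at(sketch, idx, A.ravel()[idx] / (s * p[idx]))
    return sketch.reshape(A.shape)


def uniform_sample(A, s, seed=0):
    """Baseline: draw s entries uniformly at random with replacement."""
    A = np.asarray(A, dtype=float)
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, A.size, size=s)
    sketch = np.zeros(A.size)
    # Each entry has probability 1 / A.size, so the rescaling is A.size / s.
    np.add.at(sketch, idx, A.ravel()[idx] * A.size / s)
    return sketch.reshape(A.shape)


if __name__ == "__main__":
    # Synthetic stand-in data; rows are samples, columns are features.
    X = np.random.default_rng(42).normal(size=(100, 20))
    X_centered = X - X.mean(axis=0)      # center each feature
    budget = int(0.07 * X_centered.size)  # about 7% of the elements
    S_hybrid = hybrid_l1_l2_sample(X_centered, budget, alpha=0.5)
    S_uniform = uniform_sample(X_centered, budget)
    print(np.count_nonzero(S_hybrid), np.count_nonzero(S_uniform))
```

Zero entries receive probability zero under the hybrid distribution, so they are never drawn and the division by `p[idx]` is always safe. A sparse PCA solver (the paper mentions the Spasm toolbox) would then be run on the sketch instead of the full matrix; that step is not shown here.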