reproducibilityindex.ai

Approximating Sparse PCA from Incomplete Data

Authors: Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate our algorithms extensively on image, text, biological and financial data.
Researcher Affiliation	Academia	Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, kundua2@rpi.edu. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, drinep@cs.rpi.edu. Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, magdon@cs.rpi.edu.
Pseudocode	Yes	Algorithm 1 Hybrid (ℓ1, ℓ2)-Element Sampling
Open Source Code	No	No explicit statement or link providing concrete access to the source code for the methodology described in this paper was found.
Open Datasets	Yes	Digit Data (m = 2313, n = 256): We use the [7] handwritten zip-code digit images (300 pixels/inch in 8-bit gray scale). TechTC Data (m = 139, n = 15170): We use the Technion Repository of Text Categorization Dataset (TechTC, see [6]) from the Open Directory Project (ODP). Gene Expression Data (m = 107, n = 22215): We use GSE10072 gene expression data for lung cancer from the NCBI Gene Expression Omnibus database.
Dataset Splits	No	No specific details on training, validation, or test dataset splits (e.g., percentages, sample counts, or explicit mention of validation sets) were provided.
Hardware Specification	No	No specific hardware details (such as GPU/CPU models, memory, or cloud instance types) used for running the experiments were provided.
Software Dependencies	No	The paper mentions using the 'Spasm toolbox' but gives no version numbers for Spasm or for any other software dependency, such as MATLAB.
Experiment Setup	Yes	We sample approximately 7% of the elements from the centered data using (ℓ1, ℓ2)-sampling, as well as uniform sampling. The performance for small r is shown in Table 1, including the running time τ. For this data, f(Gmax,r)/f(Gsp,r) ≈ 0.23 (r = 10). We sample approximately 5% of the elements from the centered data using our (ℓ1, ℓ2)-sampling, as well as uniform sampling.
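
The Experiment Setup row compares (ℓ1, ℓ2)-element sampling with uniform sampling on centered data. The Python sketch below illustrates that comparison under stated assumptions and is not the authors' code. It assumes the hybrid probabilities take the form p_ij = α|A_ij|/‖A‖₁ + (1 − α)A_ij²/‖A‖_F², that s entries are drawn i.i.d. with replacement, and that each drawn entry is rescaled by 1/(s·p_ij). The function name `element_sample`, the mixing weight `alpha`, the column-wise centering, and the synthetic matrix are all hypothetical choices made for illustration.

```python
import numpy as np


def element_sample(A, s, alpha=None, seed=0):
    """Return a sparse sketch of A built from s sampled entries.

    alpha=None      -> uniform sampling over all entries.
    alpha in [0, 1] -> assumed hybrid (l1, l2) probabilities:
                       alpha * |a| / sum|a| + (1 - alpha) * a^2 / sum(a^2).
    """
    rng = np.random.default_rng(seed)
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    flat = A.ravel()

    if alpha is None:
        p = np.full(m * n, 1.0 / (m * n))
    else:
        abs_a = np.abs(flat)
        sq_a = abs_a ** 2
        p = alpha * abs_a / abs_a.sum() + (1.0 - alpha) * sq_a / sq_a.sum()

    # Draw s entry indices i.i.d. with replacement.
    idx = rng.choice(m * n, size=s, replace=True, p=p)

    # Rescale each drawn entry by 1 / (s * p) so the sketch is unbiased;
    # np.add.at accumulates correctly when an index is drawn more than once.
    sketch = np.zeros(m * n)
    np.add.at(sketch, idx, flat[idx] / (s * p[idx]))
    return sketch.reshape(m, n)


# Synthetic stand-in for a data matrix (rows = samples, columns = features).
X = np.random.default_rng(1).standard_normal((50, 20))
Xc = X - X.mean(axis=0)          # center each column (an assumption)
s = int(0.07 * Xc.size)          # roughly 7% of the entries

S_hybrid = element_sample(Xc, s, alpha=0.5)
S_uniform = element_sample(Xc, s)
```

Because sampling is with replacement, the number of distinct nonzero entries in either sketch can be smaller than s. A sparse PCA solver would then be run on the sketch in place of the full centered matrix.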