Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Optimal Sparse Linear Encoders and Sparse PCA
Authors: Malik Magdon-Ismail, Christos Boutsidis
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We answer both questions by providing the first polynomial-time algorithms to construct optimal sparse linear auto-encoders; additionally, we demonstrate the performance of our algorithms on real data. Our experiments are not exhaustive, but their role is modest: to motivate minimizing loss as the right machine learning objective for sparse encoders (Problem 1). We empirically demonstrate our algorithms against existing state-of-the-art sparse PCA methods. |
| Researcher Affiliation | Academia | Malik Magdon-Ismail Rensselaer Polytechnic Institute, Troy, NY 12211 EMAIL Christos Boutsidis New York, NY EMAIL |
| Pseudocode | Yes | Blackbox algorithm to compute encoder from CSSP. Batch Sparse Linear Encoder Algorithm. Iterative Sparse Linear Encoder Algorithm. |
| Open Source Code | No | The paper does not provide explicit links or statements about the availability of its own source code for the described methodology. |
| Open Datasets | Yes | We use the same data sets used by these prior algorithms (all available in [23]): Pit Props ($X \in \mathbb{R}^{13 \times 13}$); Colon ($X \in \mathbb{R}^{500 \times 500}$); Lymphoma ($X \in \mathbb{R}^{500 \times 500}$). |
| Dataset Splits | Yes | The table below compares the 10-fold cross-validation error $E_{\text{out}}$ for an SVM classifier using features from popular variance maximizing sparse-PCA encoders and our loss minimizing sparse-encoder (k = 6 and r = 7). |
| Hardware Specification | No | The paper does not specify any hardware used for running the experiments. |
| Software Dependencies | No | The paper discusses various algorithms and methods but does not list specific software dependencies with version numbers (e.g., Python, PyTorch versions). |
| Experiment Setup | Yes | The table below compares the 10-fold cross-validation error $E_{\text{out}}$ for an SVM classifier using features from popular variance maximizing sparse-PCA encoders and our loss minimizing sparse-encoder (k = 6 and r = 7). The inputs are $X \in \mathbb{R}^{n \times d}$, the number of components k and the sparsity parameter r. We only show k = 2 in Figure 1. |
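
The 10-fold cross-validation error quoted in the Dataset Splits and Experiment Setup rows can be illustrated with a minimal sketch. This is not the authors' code: the helper names (`k_fold_indices`, `cross_val_error`), the toy data, and the nearest-class-mean classifier standing in for the SVM are all assumptions made for illustration. In the paper's setting, the feature vectors would be the outputs of a sparse encoder rather than the toy points used here.

```python
import random

def k_fold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and split into k disjoint, nearly equal folds."""
    if n < k:
        raise ValueError("need at least k samples")
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_val_error(X, y, fit, predict, k=10, seed=0):
    """Average held-out misclassification rate over k folds."""
    errors = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(model, X[i]) != y[i] for i in held_out)
        errors.append(wrong / len(held_out))
    return sum(errors) / k

# Hypothetical stand-in classifier (NOT an SVM): nearest class mean.
def fit_nearest_mean(X_train, y_train):
    sums, counts = {}, {}
    for x, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def predict_nearest_mean(model, x):
    # Pick the class whose mean is closest in squared Euclidean distance.
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, model[c])))

# Toy, well-separated two-class data (assumed, not from the paper).
X_toy = [[i * 0.01, 0.0] for i in range(20)] + [[10 + i * 0.01, 10.0] for i in range(20)]
y_toy = [0] * 20 + [1] * 20
err = cross_val_error(X_toy, y_toy, fit_nearest_mean, predict_nearest_mean, k=10)
print(err)
```

Passing `fit` and `predict` as callables keeps the fold logic independent of the classifier, so an SVM from any library could be substituted without changing the splitting code.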