Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Rethinking PCA Through Duality
Authors: Jan Quan, Johan Suykens, Panagiotis Patrinos
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experiments We compare several simple gradient-based methods on various formulations from Theorem 3.1 and Proposition 3.3. All experiments are implemented in Julia 1.11.1 on a machine with an AMD Ryzen 7 Pro 5850U processor and 32 GB RAM. The timings are taken using BenchmarkTools.jl [17]. Our results are presented in Table 1. The code for reproducing all experiments is publicly available². For the constrained problems, we apply both the proximal gradient (PG) method and ZeroFPR [70] using the implementation from [65]. ... Additional experiments showcasing the performance of these first-order methods on various formulations as well as toy experiments for our robust kernel PCA can be found in Appendix F. |
| Researcher Affiliation | Academia | Jan Quan ESAT-STADIUS & Leuven.AI KU Leuven, Belgium EMAIL Johan Suykens ESAT-STADIUS & Leuven.AI KU Leuven, Belgium EMAIL Panagiotis Patrinos ESAT-STADIUS & Leuven.AI KU Leuven, Belgium EMAIL |
| Pseudocode | Yes | Algorithm 1 (DCA) for (d)... Algorithm 2 (DCA) for (g)... Algorithm 3 (DCA) for Proposition 3.3 (primal)... Algorithm 4 (DCA) for Proposition 3.3 (dual)... Algorithm 5 (DCA) for Proposition 4.1 (primal)... Algorithm 6 (DCA) for Proposition 4.1 (dual)... |
| Open Source Code | Yes | The code for reproducing all experiments is publicly available². ²https://github.com/JanQ/pca-duality |
| Open Datasets | Yes | Table 7 shows the timing results of the best performing methods on the MNIST dataset [22] (N = 60000, d = 784) with a tolerance of ε = 10⁻³. We observe the same behavior as in the synthetic experiments: the generic first-order methods are faster than the classical eigensolvers in this setting. Table 8 shows the timing results of the best performing methods on the 100k top words from the 2024 Wikipedia + Gigaword 5, 50d GloVe word embedding dataset [57, 16] (N = 100000, d = 50) with a tolerance of ε = 10⁻³. |
| Dataset Splits | Yes | Robust (kernel) PCA Consider the MNIST dataset [22] with a train-test split of 80-20. To verify the robustness properties of the robust PCA formulation in Section 4, we contaminate 15% of the training data with heavy Gaussian noise (σ = 15) and leave the test set untouched. |
| Hardware Specification | Yes | All experiments are implemented in Julia 1.11.1 on a machine with an AMD Ryzen 7 Pro 5850U processor and 32 GB RAM. |
| Software Dependencies | Yes | All experiments are implemented in Julia 1.11.1 on a machine with an AMD Ryzen 7 Pro 5850U processor and 32 GB RAM. The timings are taken using BenchmarkTools.jl [17]. |
| Experiment Setup | Yes | Table 1: Timing results for various methods applied to PCA formulations. The problem setting (N, d, s, ε) denotes a data matrix X ∈ ℝ^(N×d) with entries sampled from a standard normal distribution. s denotes the computed number of principal components and ε the stopping criterion tolerance. All timings are in milliseconds, and timings longer than 5 seconds are not displayed. Robust (kernel) PCA Consider the MNIST dataset [22] with a train-test split of 80-20. To verify the robustness properties of the robust PCA formulation in Section 4, we contaminate 15% of the training data with heavy Gaussian noise (σ = 15) and leave the test set untouched. ... For each of these settings, we evaluate the reconstruction error on the noncontaminated test set. To this end, for each of the settings, we train a small multilayer perceptron classifier (1 hidden layer with 20 neurons) on these extracted features. |
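
The robust PCA setup quoted in the table (an 80-20 train-test split, with 15% of the training samples contaminated by Gaussian noise of σ = 15 and the test set left untouched) can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the authors' Julia code. The function name `split_and_contaminate`, the seed handling, and the choice to add the noise to the existing features (rather than replace them) are assumptions; the quoted text does not specify these details.

```python
import random


def split_and_contaminate(data, train_frac=0.8, contam_frac=0.15,
                          sigma=15.0, seed=0):
    """Split `data` (a list of feature vectors) into train and test sets,
    then add Gaussian noise to a random subset of the training rows.

    Assumption: contamination is additive, zero-mean noise with standard
    deviation `sigma` applied to every feature of a selected row.
    """
    rng = random.Random(seed)

    # Shuffle indices so the split is random but reproducible.
    idx = list(range(len(data)))
    rng.shuffle(idx)

    n_train = int(round(train_frac * len(data)))
    train = [list(data[i]) for i in idx[:n_train]]
    test = [list(data[i]) for i in idx[n_train:]]  # left untouched

    # Pick which training rows to contaminate.
    n_contam = int(round(contam_frac * n_train))
    contaminated = sorted(rng.sample(range(n_train), n_contam))

    for i in contaminated:
        train[i] = [x + rng.gauss(0.0, sigma) for x in train[i]]

    return train, test, contaminated


if __name__ == "__main__":
    # Synthetic stand-in for a dataset: 100 samples with 4 features each.
    demo_rng = random.Random(1)
    data = [[demo_rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(100)]
    train, test, contaminated = split_and_contaminate(data)
    print(len(train), len(test), len(contaminated))
```

Returning the contaminated indices makes it possible to check afterwards how a robust formulation treats the corrupted rows compared with the clean ones.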