Fast PCA in 1-D Wasserstein Spaces via B-splines Representation and Metric Projection

Authors: Matteo Pegoraro, Mario Beraha | Pages 9342-9349

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive simulation studies, we show how our PCA performs similarly to the ones already proposed in the literature while retaining a much smaller computational cost. We apply our method to a real dataset of mortality rates due to Covid-19 in the US, concluding that our analyses are consistent with the current scientific consensus on the disease.
Researcher Affiliation | Academia | 1 MOX Department of Mathematics, Politecnico di Milano; 2 Department of Mathematics, Politecnico di Milano; 3 Department of Computer Science, Università di Bologna
Pseudocode | No | The paper describes mathematical formulations and optimization problems but does not include any explicit pseudocode blocks or algorithms.
Open Source Code | No | The paper mentions a public repository link: “The code is publicly available at https://github.com/ecazelles/2017-GPCA-vs-LogPCA-Wasserstein”. However, this link points to code for an existing method (Cazelles et al. 2017) that the authors used for comparison, not to code for the novel projected PCA methodology described in the paper.
Open Datasets | Yes | Data are freely available at https://data.cdc.gov/NCHS/Provisional-COVID-19-Death-Counts-by-Sex-Age-and-S/9bhg-hcku.
Dataset Splits | Yes | Each result displays the average 10-fold cross validation accuracy, averaged again over 20 repetitions, ± one standard deviation.
Hardware Specification | Yes | All experiments were performed on a laptop equipped with a 8-core Intel i7-7700HQ CPU 2.80GHz and 16Gb of RAM.
Software Dependencies | Yes | The main numerical libraries employed consist of the Python packages numpy, scipy and qpsolvers (v 1.1) and of the optimization library Ipopt (v 3.12.12) interfaced with the Python package pyomo.
Experiment Setup | Yes | In the following, we will always center the PCA in the barycenter of the data, i.e. $a_0 = n^{-1} \sum_{i=1}^{n} a_i$. Moreover, we consider the spline basis $\{\psi_j\}_{j=1}^{J}$ with $J = 20$ and equispaced knots in $[0, 1]$... After performing a PCA, a Support Vector Machine (SVM) classifier is fit, with parameters $C = 1.0$, radial basis function kernel and default value for the parameter $\gamma$.
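The centering step quoted in the Experiment Setup row can be illustrated with a minimal numpy sketch. It is not the authors' projected PCA: it assumes each distribution is already summarised by a vector of J = 20 spline coefficients, uses randomly generated stand-in data, and runs an ordinary Euclidean PCA (plain SVD, identity inner product) on the centered coefficients. The names `coeffs`, `a0`, `centered`, `directions` and `scores` are hypothetical.

```python
import numpy as np

# Hypothetical stand-in data: n = 50 observations, each represented by
# J = 20 spline coefficients. These are random numbers, not the paper's data.
rng = np.random.default_rng(0)
n, J = 50, 20
coeffs = rng.normal(size=(n, J))

# Barycenter of the coefficient vectors: a_0 = (1/n) * sum_i a_i.
a0 = coeffs.mean(axis=0)

# Center the data at the barycenter before extracting directions.
centered = coeffs - a0

# Ordinary Euclidean PCA via SVD (an assumption made for illustration;
# the paper's method additionally involves a metric projection).
U, S, Vt = np.linalg.svd(centered, full_matrices=False)

k = 2                                  # number of components kept (arbitrary)
directions = Vt[:k]                    # principal directions, shape (k, J)
scores = centered @ directions.T       # component scores, shape (n, k)
explained = S**2 / np.sum(S**2)        # fraction of variance per component
```

The `scores` array is the kind of low-dimensional representation that could then be passed to a classifier such as the SVM mentioned in the setup.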