A Simple and Provable Algorithm for Sparse Diagonal CCA

Authors: Megasthenis Asteris, Anastasios Kyrillidis, Oluwasanmi Koyejo, Russell Poldrack

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically evaluate the proposed scheme and apply it on a real neuroimaging dataset to investigate associations between brain activity and behavior measurements. (Section 4, Experiments) We empirically evaluate our algorithm on two real datasets: i) a publicly available breast cancer dataset (Chin et al., 2006), also used in the evaluation of (Witten et al., 2009), and ii) a neuroimaging dataset obtained from the Human Connectome Project (Van Essen et al., 2013) on which we investigate associations between brain activation and behavior measurements.
Researcher Affiliation | Academia | Megasthenis Asteris (MEGAS@UTEXAS.EDU) and Anastasios Kyrillidis (ANASTASIOS@UTEXAS.EDU), Department of Electrical and Computer Engineering, The University of Texas at Austin; Oluwasanmi Koyejo (SANMI@ILLINOIS.EDU), Stanford University & University of Illinois at Urbana-Champaign; Russell Poldrack (POLDRACK@STANFORD.EDU), Department of Psychology, Stanford University
Pseudocode | Yes | Algorithm 1: Span CCA
Open Source Code | No | The paper mentions 'our prototypical Python implementation of Span CCA' but neither links to a repository nor states that the code is publicly available.
Open Datasets | Yes | We empirically evaluate our algorithm on two real datasets: i) a publicly available breast cancer dataset (Chin et al., 2006)... and ii) a neuroimaging dataset obtained from the Human Connectome Project (Van Essen et al., 2013)...
Dataset Splits | No | The paper describes the datasets used (breast cancer and HCP) and their dimensions, but it does not specify any training, validation, or test splits. The experiments apply the algorithm directly to the full datasets rather than training a model over splits.
Hardware Specification | Yes | To demonstrate the parallelizability of our algorithm, we run Span CCA for the aforementioned task on the brain imaging data for various values of the number N of workers on a single server with 36 physical processing cores (footnote 7: Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz) and approximately 250Gb of main memory.
Software Dependencies | No | The paper mentions 'our prototypical Python implementation' and the use of the 'nilearn python package' but does not specify version numbers for Python or any of its libraries.
Experiment Setup | Yes | Subsequently, we run Span CCA (Alg. 1) with parameters T = 10^4, r = 3, and target sparsity equal to that of the former PMD output. ... We apply our Span CCA algorithm on the HCP data with arbitrarily selected parameters T = 10^6 and r = 5. We set the target sparsity at 15% for each canonical vector.
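To make the roles of the parameters T, r, and target sparsity in the Experiment Setup row concrete, the following is a minimal Python sketch of a randomized, low-rank search for sparse canonical vectors. It is not the authors' implementation (the paper's code is not public). It rests on several assumptions: the input is a cross-covariance matrix A, r is the rank of its truncated SVD, T is the number of random samples drawn from the unit sphere, and sparsity is enforced by keeping the largest-magnitude entries. The names span_cca_sketch, top_k_unit, s_x, and s_y are hypothetical.

```python
import numpy as np


def top_k_unit(v, k):
    """Keep the k largest-magnitude entries of v, zero the rest, scale to unit l2 norm."""
    out = np.zeros(v.shape, dtype=float)
    idx = np.argpartition(np.abs(v), -k)[-k:]  # indices of the k largest |v_i|
    out[idx] = v[idx]
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out


def span_cca_sketch(A, r, T, s_x, s_y, seed=0):
    """Illustrative search for sparse unit vectors x, y with large x^T A_r y.

    Assumptions (not taken from the paper's code):
      A        : (m, n) cross-covariance matrix between the two views
      r        : rank of the truncated SVD used in place of A
      T        : number of random directions sampled
      s_x, s_y : number of nonzero entries allowed in x and y
    """
    rng = np.random.default_rng(seed)
    U, S, Vt = np.linalg.svd(A, full_matrices=False)
    U, S, Vt = U[:, :r], S[:r], Vt[:r, :]  # rank-r approximation A_r = U diag(S) Vt

    best_val, best_x, best_y = -np.inf, None, None
    for _ in range(T):
        c = rng.standard_normal(r)
        c /= np.linalg.norm(c)             # random point on the unit sphere in R^r
        a = U @ (S * c)                    # a vector in the column span of A_r
        x = top_k_unit(a, s_x)             # sparse unit vector best aligned with a
        b = Vt.T @ (S * (U.T @ x))         # b = A_r^T x
        y = top_k_unit(b, s_y)             # sparse unit vector best aligned with b
        val = float(x @ (U @ (S * (Vt @ y))))  # objective x^T A_r y
        if val > best_val:
            best_val, best_x, best_y = val, x, y
    return best_x, best_y, best_val
```

Under this sketch, a target sparsity of 15% for an m-dimensional canonical vector would correspond to something like s_x = max(1, int(0.15 * m)). Because the T iterations are independent, they could in principle be divided among N workers and the best candidate kept, which is in line with the parallel runs described in the Hardware Specification row.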