Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

On Sparse Canonical Correlation Analysis

Authors: Yongchun Li, Santanu Dey, Weijun Xie

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Section 5 numerically tests the proposed formulations and algorithms. |
| Researcher Affiliation | Academia | Yongchun Li (University of Tennessee); Santanu S. Dey (Georgia Tech); Weijun Xie (Georgia Tech) |
| Pseudocode | Yes | Algorithm 1 An exact algorithm for SCCA (1) when s1 r and s2 r̂ |
| Open Source Code | Yes | The codes and data used in our experiments are available at https://github.com/yongchunli-13/SCCA.git. |
| Open Datasets | Yes | The codes and data used in our experiments are available at https://github.com/yongchunli-13/SCCA.git. Also, the paper cites UCI datasets [5] and breast cancer dataset [8], which are commonly publicly available. |
| Dataset Splits | No | The paper states "The dataset is split into the first n variables and the remaining m variables to construct the sample covariance matrices A, B, C." in Section 5.1. This refers to a partitioning of variables, not a train/validation/test split of data samples of the kind typically used in model evaluation. |
| Hardware Specification | Yes | All the experiments are conducted in Python 3.6 with calls to Gurobi 9.5.2 and MOSEK 10.0.29 on a PC with 10-core CPU, 16-core GPU, and 16GB of memory. |
| Software Dependencies | Yes | All the experiments are conducted in Python 3.6 with calls to Gurobi 9.5.2 and MOSEK 10.0.29 on a PC with 10-core CPU, 16-core GPU, and 16GB of memory. |
| Experiment Setup | Yes | Section 5.1 "Experimental setup" details the generation of synthetic data, including parameters (n, m, s1, s2) and sampling N=5,000 data samples. It also states the time limit for experiments: "the time limit is one hour". |
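
The variable partitioning quoted under Dataset Splits can be illustrated with a minimal sketch. It assumes that A is the cross-covariance between the two variable blocks and that B and C are the within-block covariances; the quoted sentence does not say which matrix is which. The function name `split_covariances`, the dimensions n=4 and m=3, and the standard Gaussian data are hypothetical choices for illustration and are not taken from the paper or its repository. Only N=5,000 comes from the reported setup.

```python
import numpy as np


def split_covariances(data, n):
    """Split the columns of `data` (N samples by n+m variables) into the first
    n variables and the remaining m, and return sample covariance blocks.

    Assumed naming (not confirmed by the source): A is the n-by-m
    cross-covariance, B the n-by-n covariance of the first block,
    C the m-by-m covariance of the second block.
    """
    num_samples = data.shape[0]
    centered = data - data.mean(axis=0)        # remove per-variable means
    x, y = centered[:, :n], centered[:, n:]    # partition variables, not samples
    A = x.T @ y / (num_samples - 1)
    B = x.T @ x / (num_samples - 1)
    C = y.T @ y / (num_samples - 1)
    return A, B, C


# Hypothetical synthetic data: N = 5000 samples as in the reported setup,
# with illustrative dimensions n = 4 and m = 3.
rng = np.random.default_rng(0)
n, m, N = 4, 3, 5000
data = rng.standard_normal((N, n + m))
A, B, C = split_covariances(data, n)
print(A.shape, B.shape, C.shape)
```

Every sample contributes to all three matrices here, which is consistent with the table's assessment that this is a partition of variables rather than a train/validation/test split.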