Large-Scale Sparse Kernel Canonical Correlation Analysis

Authors: Viivi Uurtio, Sahely Bhadra, Juho Rousu

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our empirical experiments demonstrate that gradKCCA outperforms state-of-the-art CCA methods in terms of speed and robustness to noise both in simulated and real-world datasets." See also Section 5, Experiments.
Researcher Affiliation | Academia | 1) Department of Computer Science, Aalto University, Espoo, Finland; 2) Helsinki Institute for Information Technology, Helsinki, Finland; 3) Computer Science and Engineering, Indian Institute of Technology Palakkad, Palakkad, India.
Pseudocode | Yes | Algorithm 1: gradKCCA.
Open Source Code | Yes | "The code is available on https://github.com/aalto-ics-kepaco/gradKCCA."
Open Datasets | Yes | MNIST handwritten digits: "The MNIST handwritten digits dataset contains 60 000 training and 10 000 testing images of handwritten digits. As in (Andrew et al., 2013), we analyze the relations of the left and right halves of images of the handwritten digits. Every image consists of a 28 × 28 matrix of pixels. Every pixel represents a grey-scale value ranging from one to 256. The 14 left and right columns are separated to form the two views, making 392 features in each view." Media Mill: "The Media Mill dataset consists of 43907 examples that correspond to keyframes taken from video shots. Every keyframe is represented by 120 visual features and 101 text annotations, or labels, respectively." (Citations Le Cun, 1998 and Snoek et al., 2006 are provided in the bibliography.)
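The two-view MNIST construction quoted above (left and right 14 columns of each 28 × 28 image, giving 392 features per view) can be sketched as follows. This is an illustrative reconstruction, not the authors' code; `split_views` is a hypothetical helper name, and random data stands in for the real MNIST images.

```python
import numpy as np

def split_views(images):
    """Split 28x28 images into left/right halves, as in the paper's setup.

    images: array of shape (n, 28, 28) with grey-scale pixel values.
    Returns two views of shape (n, 392) each: 14 columns x 28 rows per view.
    """
    n = len(images)
    left = images[:, :, :14].reshape(n, -1)   # view X: left 14 columns
    right = images[:, :, 14:].reshape(n, -1)  # view Y: right 14 columns
    return left, right

# Example with random data standing in for MNIST images.
X, Y = split_views(np.random.rand(10, 28, 28))
```

Each view ends up with 14 × 28 = 392 features, matching the count quoted from the paper.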
Dataset Splits | Yes | "In all of the experiments, we randomly select 500 examples from the simulated training dataset for hyperparameter tuning by repeated nested cross-validation." and "The MNIST handwritten digits dataset contains 60 000 training and 10 000 testing images of handwritten digits."
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9') used for the experiments.
Experiment Setup | Yes | "In the experiments, where the linear kernels are tested for gradKCCA and KCCA, we apply the following monotone algebraic relations. We generate polynomial relations of the form f(x) = x^d, where x ∈ R+ and d = 1, 2, 3, that is, linear, quadratic, and cubic relations, respectively. We also generate a transcendental relation, the exponential relation, of the form f(x) = exp(x), where x ∈ R+. We consider a two-to-two relation, that is, the first two columns of Y and X satisfy y_1 + y_2 = f(x_1 + x_2) + ξ, where ξ ∼ N(0, 0.05) denotes a vector of normally distributed noise." Hyperparameter settings: "In all of the experiments, we randomly select 500 examples from the simulated training dataset for hyperparameter tuning by repeated nested cross-validation. For gradKCCA and SCCA-HSIC, we tune the ℓ1 sparsity constraints. For KCCA, we tune the regularization hyperparameters. For DCCA, we tune the neural network structure, that is, the number of layers and the number of hidden units. For RCCA and KNOI, we tune the number of random Fourier features."
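The simulated two-to-two relation quoted above, y_1 + y_2 = f(x_1 + x_2) + ξ with ξ ∼ N(0, 0.05), can be sketched as below. This is a hedged reconstruction, not the authors' generator: the function name `simulate_relation`, the uniform sampling of the remaining columns, and the column-adjustment trick used to enforce the constraint are all illustrative assumptions.

```python
import numpy as np

def simulate_relation(n, p, f, noise_std=0.05, seed=0):
    """Simulate two views whose first two columns follow the stated relation.

    Enforces y_1 + y_2 = f(x_1 + x_2) + xi with xi ~ N(0, noise_std),
    where x is sampled from R+ (here: Uniform(0, 1), an assumption).
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, p))  # positive inputs, x in R+
    Y = rng.uniform(0.0, 1.0, size=(n, p))  # remaining columns are noise
    target = f(X[:, 0] + X[:, 1]) + rng.normal(0.0, noise_std, size=n)
    # Adjust the second column so that y_1 + y_2 hits the noisy target.
    Y[:, 1] = target - Y[:, 0]
    return X, Y

# Example: the cubic relation f(x) = x^3 from the paper's setup.
X, Y = simulate_relation(200, 10, lambda x: x ** 3)
```

The same generator covers the linear, quadratic, and exponential cases by passing `lambda x: x`, `lambda x: x ** 2`, or `np.exp` as `f`.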