Large-Scale Approximate Kernel Canonical Correlation Analysis
Authors: Weiran Wang, Karen Livescu
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the KNOI algorithm on two large-scale problems and compare it to several alternatives: CCA, solved exactly by SVD; FKCCA, a low-rank approximation of KCCA using random Fourier features, with the CCA step solved exactly by SVD; and NKCCA, a low-rank approximation of KCCA using the Nyström method, with the CCA step solved exactly by SVD. We implement KNOI in MATLAB with GPU support. In the first set of experiments, we demonstrate the scalability and efficiency of KNOI on the MNIST8M dataset (Loosli et al., 2007). The dataset consists of 8.1 million 28×28 grayscale images of the digits 0-9. We divide each image into the left and right halves and use them as the two views in KCCA, so the input dimensionality is 392 for both views. The dataset is randomly split into training/test sets of size 8M/0.1M. The task is to learn L = 50 dimensional projections using KCCA, and the evaluation criterion is the total canonical correlation achieved on the test set (upper-bounded by 50). The total canonical correlations achieved by each algorithm on the test set, together with the run times measured on a workstation with six 3.6 GHz CPUs and 64 GB main memory, are reported in Table 1. (An illustrative sketch of the FKCCA pipeline follows this table.) |
| Researcher Affiliation | Academia | Weiran Wang & Karen Livescu, Toyota Technological Institute at Chicago, 6045 S. Kenwood Ave., Chicago, IL 60637. Email: {weiranwang,klivescu}@ttic.edu |
| Pseudocode | Yes | Algorithm 1 KNOI: Stochastic optimization for approximate KCCA. |
| Open Source Code | Yes | Our MATLAB implementation is available at http://ttic.uchicago.edu/~wwang5/knoi.html |
| Open Datasets | Yes | In the first set of experiments, we demonstrate the scalability and efficiency of KNOI on the MNIST8M dataset (Loosli et al., 2007). The dataset consists of 8.1 million 28×28 grayscale images of the digits 0-9. We use the Wisconsin X-ray microbeam (XRMB) corpus (Westbury, 1994) of simultaneously recorded speech and articulatory measurements from 47 American English speakers. |
| Dataset Splits | Yes | The dataset is randomly split into training/test sets of size 8M/0.1M. For KNOI, we tune hyperparameters on a rough grid based on total canonical correlation obtained on a random subset of the training set with 0.1M samples, and set the minibatch size b = 2500, time constant ρ = 0, learning rate η = 0.01, and momentum µ = 0.995. The XRMB speakers are split into disjoint sets of 35/8/2/2 speakers for feature learning/recognizer training/tuning/testing. The 35 speakers for feature learning are fixed; the remaining 12 are used in a 6-fold experiment (recognizer training on 8 speakers, tuning on 2 speakers, and testing on the remaining 2 speakers). For each fold, we select the hyperparameters based on recognition accuracy on the tuning set. |
| Hardware Specification | Yes | We run KNOI on an NVIDIA Tesla K40 GPU with 12 GB memory, and report the run times in parentheses in Table 1; the GPU provides a speedup of more than 12 times. The total canonical correlations achieved by each algorithm on the test set, together with the run times measured on a workstation with six 3.6 GHz CPUs and 64 GB main memory, are reported in Table 1. |
| Software Dependencies | No | The paper mentions MATLAB and the Kaldi speech recognition toolkit, but provides no version numbers for these or any other software. |
| Experiment Setup | Yes | For KNOI, we tune hyperparameters on a rough grid based on total canonical correlation obtained on a random subset of the training set with 0.1M samples, and set the minibatch size b = 2500, time constant ρ = 0, learning rate η = 0.01, and momentum µ = 0.995. For KNOI, we set M = 100000 and tune the optimization parameters on a rough grid. The tuned KNOI uses minibatch size b = 2500, time constant ρ = 0, fixed learning rate η = 0.01, and momentum µ = 0.995. (A sketch of how these optimization hyperparameters enter a momentum-based stochastic update follows this table.) |
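The FKCCA baseline quoted in the Research Type row (random Fourier features followed by an exact CCA step via SVD) is standard enough to sketch. Below is a minimal Python/NumPy sketch, not the authors' MATLAB implementation; it assumes an RBF kernel, and the names `rff`, `linear_cca`, and the regularizer `reg` are illustrative choices, not from the paper.

```python
import numpy as np

def rff(X, n_features, gamma, rng):
    """Random Fourier features approximating the RBF kernel exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def linear_cca(Px, Py, L, reg=1e-4):
    """Exact CCA on feature matrices Px, Py (n x D) via SVD of the whitened cross-covariance."""
    Px = Px - Px.mean(axis=0)
    Py = Py - Py.mean(axis=0)
    n = Px.shape[0]
    Cxx = Px.T @ Px / n + reg * np.eye(Px.shape[1])
    Cyy = Py.T @ Py / n + reg * np.eye(Py.shape[1])
    Cxy = Px.T @ Py / n

    def inv_sqrt(C):  # symmetric inverse square root via eigendecomposition
        vals, vecs = np.linalg.eigh(C)
        return vecs @ np.diag(vals ** -0.5) @ vecs.T

    Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
    U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)
    # The sum of the top-L singular values is the total canonical correlation,
    # the evaluation criterion quoted above (upper-bounded by L).
    return Wx @ U[:, :L], Wy @ Vt[:L].T, s[:L].sum()
```

Under these assumptions, FKCCA on the split MNIST digits would look roughly like `linear_cca(rff(X_left, M, gamma, rng), rff(X_right, M, gamma, rng), L=50)`, and NKCCA would substitute Nyström features for `rff`.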
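The Experiment Setup row reports the same four KNOI optimization hyperparameters for both experiments. As a reading aid, here is a hedged sketch of how such hyperparameters typically enter a minibatch optimizer with momentum; the actual per-iteration update is Algorithm 1 in the paper, and the interpretation of the time constant ρ as an exponential-forgetting factor for running covariance estimates (so ρ = 0 uses only the current minibatch) is an assumption here.

```python
import numpy as np

# Reported KNOI settings: minibatch size, time constant, learning rate, momentum.
b, rho, eta, mu = 2500, 0.0, 0.01, 0.995

def update_running_cov(C_run, P_batch):
    """Exponentially weighted covariance estimate; with rho = 0 only the
    current minibatch contributes (this reading of rho is an assumption)."""
    C_batch = P_batch.T @ P_batch / P_batch.shape[0]
    return rho * C_run + (1.0 - rho) * C_batch

def momentum_step(U, vel, grad):
    """Classical momentum: decay the velocity, add the scaled negative gradient, step."""
    vel = mu * vel - eta * grad
    return U + vel, vel
```

With µ = 0.995 the velocity averages gradients over roughly 1/(1-µ) = 200 minibatches, which is consistent with using a small fixed learning rate of η = 0.01.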