Beyond CCA: Moment Matching for Multi-View Models
Authors: Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien
ICML 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate performance of the proposed models and estimation techniques on experiments with both synthetic and real datasets. |
| Researcher Affiliation | Academia | Anastasia Podosinnikova ANASTASIA.PODOSINNIKOVA@INRIA.FR Francis Bach FRANCIS.BACH@INRIA.FR Simon Lacoste-Julien FIRSTNAME.LASTNAME@INRIA.FR INRIA École normale supérieure, Paris |
| Pseudocode | No | The paper describes algorithmic steps in paragraph form, but does not include structured pseudocode or an explicitly labeled algorithm block within the provided text. |
| Open Source Code | Yes | The (Matlab/C++) code for reproducing the experiments of this paper is available at https://github.com/anastasia-podosinnikova/cca. |
| Open Datasets | Yes | Following Vinokourov et al. (2002), we illustrate the performance of DCCA by extracting bilingual topics from the Hansard collection (Vinokourov & Girolami, 2002) with aligned English and French proceedings of the 36-th Canadian Parliament. |
| Dataset Splits | No | We sample synthetic data to have ground truth information for comparison. We sample from linear DCCA which extends linear CCA (7) such that each view is $x_j \sim \mathrm{Poisson}(D_j \alpha + F_j \beta_j)$. ... For each experiment, $D_j$ and $F_j$, for $j = 1, 2$, are sampled once and, then, the x-observations are sampled for different sample sizes N = {500, 1,000, 2,000, 5,000, 10,000}, 5 times for each N. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Matlab/C++ code' but does not provide specific software dependencies with version numbers (e.g., library names, frameworks, or solvers). |
| Experiment Setup | Yes | We sample from linear DCCA which extends linear CCA (7) such that each view is $x_j \sim \mathrm{Poisson}(D_j \alpha + F_j \beta_j)$. The sources $\alpha \sim \mathrm{Gamma}(c, b)$ and the noise sources $\beta_j \sim \mathrm{Gamma}(c_j, b_j)$, for $j = 1, 2$, are sampled from the gamma distribution (where $b$ is the rate parameter). Let $s_j \sim \mathrm{Poisson}(D_j \alpha)$ be the part of the sample due to the sources and $n_j \sim \mathrm{Poisson}(F_j \beta_j)$ be the part of the sample due to the noise (i.e., $x_j = s_j + n_j$). Then we define the expected sample length due to the sources and noise, respectively, as $L_{js} := \mathbb{E}[\sum_m s_{jm}]$ and $L_{jn} := \mathbb{E}[\sum_m n_{jm}]$. For sampling, the target values $L_s = L_{1s} = L_{2s}$ and $L_n = L_{1n} = L_{2n}$ are fixed and the parameters $b$ and $b_j$ are accordingly set to ensure these values: $b = Kc/L_s$ and $b_j = K_j c_j/L_n$ (see Appendix B.2 of Podosinnikova et al. (2015)). For the larger dimensional example (Fig. 2, right), each column of the matrices $D_j$ and $F_j$, for $j = 1, 2$, is sampled from the symmetric Dirichlet distribution with the concentration parameter equal to 0.5. For the smaller 2D example (Fig. 2, left), they are fixed: $D_1 = D_2$ with $[D_1]_1 = [D_1]_2 = 0.5$ and $F_1 = F_2$ with $[F_1]_{11} = [F_1]_{22} = 0.9$ and $[F_1]_{12} = [F_1]_{21} = 0.1$. |
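
The synthetic sampling procedure quoted in the Experiment Setup row can be sketched in Python with NumPy. This is an illustrative sketch only, not the authors' Matlab/C++ code. The function names `dirichlet_columns` and `sample_dcca` are hypothetical. The numeric values for c, c_j, L_s, L_n and the matrix dimensions are placeholders, because the quoted text does not state them. The sketch relies on every column of D_j and F_j summing to one, which is what makes b = Kc/L_s yield an expected source length of L_s.

```python
import numpy as np


def dirichlet_columns(M, K, rng, concentration=0.5):
    """Return an M x K matrix whose columns are symmetric Dirichlet draws."""
    return rng.dirichlet(np.full(M, concentration), size=K).T


def sample_dcca(D, F, c, c_noise, L_s, L_n, N, rng):
    """Draw N samples per view with x_j ~ Poisson(D_j alpha + F_j beta_j).

    D, F    : lists of per-view matrices (columns assumed to sum to one)
    c       : gamma shape of the shared sources alpha
    c_noise : list of gamma shapes c_j of the view-specific noise sources
    L_s, L_n: target expected sample lengths due to sources and noise
    """
    K = D[0].shape[1]
    b = K * c / L_s  # rate chosen so that E[sum_m s_jm] = L_s
    # NumPy parameterizes the gamma by scale, i.e. 1 / rate.
    alpha = rng.gamma(shape=c, scale=1.0 / b, size=(K, N))
    views = []
    for D_j, F_j, c_j in zip(D, F, c_noise):
        K_j = F_j.shape[1]
        b_j = K_j * c_j / L_n  # rate chosen so that E[sum_m n_jm] = L_n
        beta_j = rng.gamma(shape=c_j, scale=1.0 / b_j, size=(K_j, N))
        s_j = rng.poisson(D_j @ alpha)   # counts due to the shared sources
        n_j = rng.poisson(F_j @ beta_j)  # counts due to the noise
        views.append(s_j + n_j)          # x_j = s_j + n_j
    return views


rng = np.random.default_rng(0)

# Smaller 2D example: fixed matrices as given in the quoted setup.
D_small = np.array([[0.5], [0.5]])
F_small = np.array([[0.9, 0.1], [0.1, 0.9]])
x1, x2 = sample_dcca(
    [D_small, D_small], [F_small, F_small],
    c=1.0, c_noise=[1.0, 1.0],  # placeholder shapes (assumption)
    L_s=50.0, L_n=50.0,         # placeholder target lengths (assumption)
    N=1000, rng=rng,
)

# Larger example: Dirichlet columns; M = 20 and K = 5 are placeholder sizes.
D_big = [dirichlet_columns(20, 5, rng) for _ in range(2)]
F_big = [dirichlet_columns(20, 5, rng) for _ in range(2)]
y1, y2 = sample_dcca(D_big, F_big, c=1.0, c_noise=[1.0, 1.0],
                     L_s=50.0, L_n=50.0, N=1000, rng=rng)
```

Each returned array has one column per sample, so `x1[:, i]` and `x2[:, i]` form an aligned pair of views generated from the same draw of the sources.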