Beyond CCA: Moment Matching for Multi-View Models

Authors: Anastasia Podosinnikova, Francis Bach, Simon Lacoste-Julien

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate performance of the proposed models and estimation techniques on experiments with both synthetic and real datasets.
Researcher Affiliation | Academia | Anastasia Podosinnikova ANASTASIA.PODOSINNIKOVA@INRIA.FR, Francis Bach FRANCIS.BACH@INRIA.FR, Simon Lacoste-Julien FIRSTNAME.LASTNAME@INRIA.FR; INRIA École normale supérieure, Paris
Pseudocode | No | The paper describes algorithmic steps in paragraph form, but does not include structured pseudocode or an explicitly labeled algorithm block within the provided text.
Open Source Code | Yes | The (Matlab/C++) code for reproducing the experiments of this paper is available at https://github.com/anastasia-podosinnikova/cca.
Open Datasets | Yes | Following Vinokourov et al. (2002), we illustrate the performance of DCCA by extracting bilingual topics from the Hansard collection (Vinokourov & Girolami, 2002) with aligned English and French proceedings of the 36-th Canadian Parliament.
Dataset Splits | No | We sample synthetic data to have ground truth information for comparison. We sample from linear DCCA which extends linear CCA (7) such that each view is $x_j \sim \mathrm{Poisson}(D_j \alpha + F_j \beta_j)$. ... For each experiment, $D_j$ and $F_j$, for $j = 1, 2$, are sampled once and, then, the x-observations are sampled for different sample sizes $N = \{500, 1{,}000, 2{,}000, 5{,}000, 10{,}000\}$, 5 times for each $N$.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'Matlab/C++ code' but does not provide specific software dependencies with version numbers (e.g., library names, frameworks, or solvers).
Experiment Setup | Yes | We sample from linear DCCA which extends linear CCA (7) such that each view is $x_j \sim \mathrm{Poisson}(D_j \alpha + F_j \beta_j)$. The sources $\alpha \sim \mathrm{Gamma}(c, b)$ and the noise sources $\beta_j \sim \mathrm{Gamma}(c_j, b_j)$, for $j = 1, 2$, are sampled from the gamma distribution (where $b$ is the rate parameter). Let $s_j \sim \mathrm{Poisson}(D_j \alpha)$ be the part of the sample due to the sources and $n_j \sim \mathrm{Poisson}(F_j \beta_j)$ be the part of the sample due to the noise (i.e., $x_j = s_j + n_j$). Then we define the expected sample length due to the sources and noise, respectively, as $L_{js} := \mathbb{E}[\sum_m s_{jm}]$ and $L_{jn} := \mathbb{E}[\sum_m n_{jm}]$. For sampling, the target values $L_s = L_{1s} = L_{2s}$ and $L_n = L_{1n} = L_{2n}$ are fixed and the parameters $b$ and $b_j$ are accordingly set to ensure these values: $b = Kc/L_s$ and $b_j = K_j c_j/L_n$ (see Appendix B.2 of Podosinnikova et al. (2015)). For the larger dimensional example (Fig. 2, right), each column of the matrices $D_j$ and $F_j$, for $j = 1, 2$, is sampled from the symmetric Dirichlet distribution with the concentration parameter equal to 0.5. For the smaller 2D example (Fig. 2, left), they are fixed: $D_1 = D_2$ with $[D_1]_1 = [D_1]_2 = 0.5$ and $F_1 = F_2$ with $[F_1]_{11} = [F_1]_{22} = 0.9$ and $[F_1]_{12} = [F_1]_{21} = 0.1$.
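
The synthetic sampling scheme quoted in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' Matlab/C++ code: the function names (sample_loadings, sample_views), the dimensions, the numbers of sources, the shape parameters, and the target lengths below are all placeholder assumptions, since the quoted text does not state them. The rate formulas follow from the quoted setup: each Dirichlet-sampled column sums to one and a Gamma(c, b) variable with rate b has mean c/b, so the expected length due to the sources is Kc/b, and fixing it to L_s gives b = Kc/L_s (likewise b_j = K_j c_j / L_n).

```python
import numpy as np


def sample_loadings(rng, dims, k, k_noise, concentration=0.5):
    """Draw D_j (M_j x K) and F_j (M_j x K_j), one pair per view.

    Each column comes from a symmetric Dirichlet, so it sums to one.
    """
    D = [rng.dirichlet(np.full(m, concentration), size=k).T for m in dims]
    F = [rng.dirichlet(np.full(m, concentration), size=kj).T
         for m, kj in zip(dims, k_noise)]
    return D, F


def sample_views(rng, D, F, n, c, c_noise, len_src, len_noise):
    """Draw n observations x_j = s_j + n_j for every view j."""
    k = D[0].shape[1]
    b = k * c / len_src  # rate chosen so that E[sum_m s_jm] = len_src
    # NumPy parameterizes the gamma by shape and scale, with scale = 1 / rate.
    alpha = rng.gamma(shape=c, scale=1.0 / b, size=(k, n))  # shared sources
    views = []
    for D_j, F_j, c_j in zip(D, F, c_noise):
        k_j = F_j.shape[1]
        b_j = k_j * c_j / len_noise  # rate so that E[sum_m n_jm] = len_noise
        beta_j = rng.gamma(shape=c_j, scale=1.0 / b_j, size=(k_j, n))
        s_j = rng.poisson(D_j @ alpha)   # counts due to the common sources
        n_j = rng.poisson(F_j @ beta_j)  # counts due to view-specific noise
        views.append(s_j + n_j)
    return views


# Placeholder settings (assumptions, not values taken from the paper).
rng = np.random.default_rng(0)
D, F = sample_loadings(rng, dims=(20, 20), k=10, k_noise=(5, 5))

# Loadings are sampled once; observations are redrawn 5 times per sample size.
datasets = {
    n: [sample_views(rng, D, F, n, c=1.0, c_noise=(1.0, 1.0),
                     len_src=100.0, len_noise=10.0) for _ in range(5)]
    for n in (500, 1000, 2000, 5000, 10000)
}
```

With these placeholder settings, each view is a 20 x N count matrix whose columns have an expected total of len_src + len_noise, which gives a quick sanity check that the rate parameters were set as intended.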