Unsupervised Learning with Contrastive Latent Variable Models

Authors: Kristen A. Severson, Soumya Ghosh, Kenney Ng (pp. 4862-4869)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Contrastive latent variable models have applications in subgroup discovery, feature selection, and de-noising, each of which is demonstrated here leveraging different modeling choices. We use examples from Abid et al. to highlight the similarities and differences between the two approaches. The results of cLVM as applied to synthetic datasets can be found in the supplemental information. Fig. 1 shows the latent representation using cPCA and robust cLVM for the naturally occurring missing level and for 25%, 50%, and 75% missing data. Fig. 3c presents the results of this experiment. The latent projections for the cVAE cluster according to the digit labels. The VAE, on the other hand, confounds the digits with the background and fails to recover meaningful latent projections.
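As background for this row, a minimal generative sketch of the linear cLVM structure may help: target samples combine shared factors (also present in the background) with target-specific factors that contrastive methods aim to isolate. The dimensions, noise scale, and variable names below are illustrative assumptions, not the paper's settings.

```python
# Minimal generative sketch of a linear contrastive latent variable model
# (cLVM): background data loads only on shared factors S, while target
# data additionally loads on target-specific factors W. All sizes are
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

d = 50         # observed dimension
k_shared = 20  # shared latent dimension
k_target = 2   # target-specific latent dimension
n, m = 200, 200

S = rng.normal(size=(d, k_shared))  # shared factor loading
W = rng.normal(size=(d, k_target))  # target-specific factor loading

# Background samples: shared structure plus observation noise only.
z_bg = rng.normal(size=(m, k_shared))
background = z_bg @ S.T + 0.1 * rng.normal(size=(m, d))

# Target samples: shared structure plus the target-specific signal t_i
# that contrastive methods aim to recover.
z_tg = rng.normal(size=(n, k_shared))
t = rng.normal(size=(n, k_target))
target = z_tg @ S.T + t @ W.T + 0.1 * rng.normal(size=(n, d))
```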
Researcher Affiliation | Collaboration | Kristen A. Severson, Soumya Ghosh, Kenney Ng; Center for Computational Health and MIT-IBM Watson AI Lab, IBM Research, 75 Binney St., Cambridge, Massachusetts 02142
Pseudocode | Yes | Algorithm 1:
1: Input: model p(D; Θ), variational approximations q({z_i, t_i}_{i=1}^n, {z_j}_{j=1}^m | λ)
2: Output: optimized Θ and variational parameters λ
3: Initialize λ and Θ
4: repeat
5:   Use the reparameterization trick to compute unbiased estimates of the gradients of the objective in Eqn. 10, ∇_{λ,Θ} L(λ, Θ)
6:   Update λ^(l+1) ← ADAM(λ^(l), ∇_λ L(λ, Θ)), Θ^(l+1) ← ADAM(Θ^(l), ∇_Θ L(λ, Θ))
7: until convergence
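A runnable translation of lines 5-6 may clarify the loop. The paper implements inference in Edward; the sketch below instead uses PyTorch, with a toy model x ~ N(Θ + z, 1), z ~ N(0, 1) and a Gaussian q(z | λ), all of which are assumptions made purely for illustration.

```python
# A minimal sketch of Algorithm 1: joint ADAM updates of model parameters
# Θ and variational parameters λ, using reparameterized gradients of the
# ELBO. The toy model and dimensions are assumptions, not the paper's.
import torch

torch.manual_seed(0)
x = torch.randn(100) * 2.0 + 1.0  # toy observed data

theta = torch.zeros(1, requires_grad=True)      # model parameters Θ
mu = torch.zeros(1, requires_grad=True)         # variational params λ
log_sigma = torch.zeros(1, requires_grad=True)

opt = torch.optim.Adam([theta, mu, log_sigma], lr=0.05)

for step in range(2000):
    opt.zero_grad()
    # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1),
    # giving unbiased gradient estimates of the objective (line 5).
    eps = torch.randn(1)
    z = mu + log_sigma.exp() * eps
    log_lik = torch.distributions.Normal(theta + z, 1.0).log_prob(x).sum()
    log_prior = torch.distributions.Normal(0.0, 1.0).log_prob(z).sum()
    log_q = torch.distributions.Normal(mu, log_sigma.exp()).log_prob(z).sum()
    elbo = log_lik + log_prior - log_q
    (-elbo).backward()  # ascend the ELBO by descending its negative
    opt.step()          # joint ADAM updates of λ and Θ (line 6)
```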
Open Source Code | No | Supplemental information is available at https://arxiv.org/. (The paper does not explicitly state that the source code for their methodology is available at this link or elsewhere.)
Open Datasets | Yes | To demonstrate the use of cLVM for subgroup discovery, we use a dataset of mice protein expression levels (Higuera, Gardiner, and Cios 2015). To highlight the use of cLVM for subgroup discovery in high-dimensional data, we use a dataset of single cell RNA-Seq measurements (Zheng et al. 2017). The third example uses a dataset, referred to as mHealth, that contains 23 measurements of body motion and vital signs from four types of signals (Banos et al. 2014; 2015). Finally, to demonstrate the utility of cVAE, we consider a dataset of corrupted images (see Fig. 3a). This dataset was created by overlaying a randomly selected set of 30,000 MNIST (LeCun et al. 1998) digits on randomly selected images of the grass category from ImageNet (Russakovsky et al. 2015).
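The corrupted-image construction lends itself to a short sketch. The blending rule, crop size, and inputs below are assumptions (the excerpt does not specify them), with stand-in arrays replacing the actual MNIST and ImageNet files.

```python
# A hedged sketch of the corrupted-image construction: a 28x28 MNIST digit
# overlaid on a random crop of an ImageNet grass image. The additive
# blending rule and all inputs here are illustrative assumptions.
import numpy as np
from PIL import Image

def overlay_digit(digit28, grass_img, rng):
    """Overlay a 28x28 uint8 digit array on a random grayscale grass crop."""
    gw, gh = grass_img.size
    left = rng.integers(0, gw - 28)
    top = rng.integers(0, gh - 28)
    box = (left, top, left + 28, top + 28)
    crop = np.asarray(grass_img.convert("L").crop(box), dtype=np.float32)
    # Additive blend, clipped to the valid pixel range (assumed rule).
    return np.clip(crop + digit28.astype(np.float32), 0, 255).astype(np.uint8)

rng = np.random.default_rng(0)
digit = (rng.random((28, 28)) > 0.8).astype(np.uint8) * 255        # stand-in digit
grass = Image.fromarray((rng.random((64, 64)) * 255).astype(np.uint8))  # stand-in grass
corrupted = overlay_digit(digit, grass, rng)
```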
Dataset Splits | No | The paper uses various datasets (e.g., mice protein expression, RNA-Seq, mHealth, MNIST/ImageNet) and defines "target" and "background" datasets, but it does not specify explicit train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the experiment's data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | In our experiments we use Edward (Tran et al. 2016), a library for probabilistic modeling, to implement these inference strategies for the proposed models. Optimization can proceed using a stochastic gradient ascent variant, e.g., ADAM (Kingma and Ba 2014). (No version numbers are provided for Edward or any other software dependency.)
Experiment Setup | Yes | The target dimension is two, the shared dimension is twenty, and ρ is 400. We use fully connected encoder and decoder networks with two hidden layers with 128 and 256 hidden units employing rectified-linear nonlinearities. The target latent space is set to two and an IG(10^-3, 10^-3) prior is used for the columns of the shared factor loading.
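For concreteness, the quoted encoder/decoder sizes translate to the following PyTorch sketch; the 784-dimensional input (flattened 28x28 images), the hidden-layer ordering, and the Gaussian-output encoder head are assumptions not stated in the excerpt.

```python
# A minimal sketch of the stated architecture: fully connected encoder and
# decoder with 128- and 256-unit hidden layers, ReLU nonlinearities, and a
# two-dimensional target latent space. Input dimension and layer ordering
# are assumptions.
import torch.nn as nn

input_dim, latent_dim = 784, 2

encoder = nn.Sequential(
    nn.Linear(input_dim, 128), nn.ReLU(),
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 2 * latent_dim),  # mean and log-variance of q(t | x)
)

decoder = nn.Sequential(
    nn.Linear(latent_dim, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, input_dim),
)
```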