Unsupervised Learning with Contrastive Latent Variable Models
Authors: Kristen A. Severson, Soumya Ghosh, Kenney Ng
AAAI 2019, pp. 4862-4869 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Contrastive latent variable models have applications in subgroup discovery, feature selection, and de-noising, each of which is demonstrated here leveraging different modeling choices. We use examples from Abid et al. to highlight the similarities and differences between the two approaches. The results of cLVM as applied to synthetic datasets can be found in the supplemental information. Fig. 1 shows the latent representations using cPCA and robust cLVM for the naturally occurring missing level and for 25%, 50%, and 75% missing data. Fig. 3c presents the results of this experiment: the latent projections for the cVAE cluster according to the digit labels, whereas the VAE confounds the digits with the background and fails to recover meaningful latent projections. (A generative sketch of the cLVM structure appears after the table.) |
| Researcher Affiliation | Collaboration | Kristen A. Severson, Soumya Ghosh, Kenney Ng Center for Computational Health and MIT-IBM Watson AI Lab, IBM Research, 75 Binney St. Cambridge, Massachusetts, 02142 |
| Pseudocode | Yes | Algorithm 1. 1: Input: model p(D; Θ) and variational approximations q({z_i, t_i}_{i=1}^n, {z_j}_{j=1}^m \| λ). 2: Output: optimized Θ and variational parameters λ. 3: Initialize λ and Θ. 4: repeat 5: Use the reparameterization trick to compute unbiased estimates of the gradients of the objective in Eqn. 10, ∇_{λ,Θ} L(λ, Θ). 6: Update λ^{(l+1)} ← ADAM(λ^{(l)}, ∇_λ L(λ, Θ)) and Θ^{(l+1)} ← ADAM(Θ^{(l)}, ∇_Θ L(λ, Θ)). 7: until convergence. (A runnable sketch of this loop appears after the table.) |
| Open Source Code | No | Supplemental information is available at https://arxiv.org/. (The paper does not explicitly state that the source code for their methodology is available at this link or elsewhere.) |
| Open Datasets | Yes | To demonstrate the use of cLVM for subgroup discovery, we use a dataset of mice protein expression levels (Higuera, Gardiner, and Cios 2015). To highlight the use of cLVM for subgroup discovery in high-dimensional data, we use a dataset of single-cell RNA-Seq measurements (Zheng et al. 2017). The third example uses a dataset, referred to as mHealth, that contains 23 measurements of body motion and vital signs from four types of signals (Banos et al. 2014; 2015). Finally, to demonstrate the utility of cVAE, we consider a dataset of corrupted images (see Fig. 3a). This dataset was created by overlaying a randomly selected set of 30,000 MNIST (LeCun et al. 1998) digits on randomly selected images of the grass category from ImageNet (Russakovsky et al. 2015). (A sketch of this overlay construction appears after the table.) |
| Dataset Splits | No | The paper uses various datasets (e.g., mice protein expression, RNA-Seq, mHealth, MNIST/ImageNet) and defines "target" and "background" datasets, but it does not specify explicit train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits) needed to reproduce the experiment's data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | In our experiments we use Edward (Tran et al. 2016), a library for probabilistic modeling, to implement these inference strategies for the proposed models. Optimization can proceed using a stochastic gradient ascent variant, e.g., ADAM (Kingma and Ba 2014). (No version numbers are provided for Edward or any other software dependency.) |
| Experiment Setup | Yes | The target dimension is two, the shared dimension is twenty, and ρ is 400. We use fully connected encoder and decoder networks with two hidden layers with 128 and 256 hidden units employing rectified-linear nonlinearities. The target latent space is set to two and an IG(10^-3, 10^-3) prior is used for the columns of the shared factor loading. (A sketch of this architecture appears after the table.) |
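
To make the cLVM structure concrete, the sketch below simulates the linear generative model implied by the pseudocode: target samples combine shared factors with target-specific factors, while background samples use the shared factors only. The shared dimension (20) and target dimension (2) follow the experiment-setup row; the data dimension, sample counts, and noise scale are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of the cLVM generative structure: target data load both the
# shared factors S and the target-specific factors W; background data load S
# only. Dimensions d, n, m and the noise scale are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d, k_shared, k_target = 50, 20, 2   # observed dim, shared dim, target dim
n, m = 200, 200                     # target / background sample counts

S = rng.normal(size=(d, k_shared))  # shared factor loading (part of Theta)
W = rng.normal(size=(d, k_target))  # target-specific loading (part of Theta)

# Target data: shared latents z_i plus target-specific latents t_i.
z_t = rng.normal(size=(n, k_shared))
t = rng.normal(size=(n, k_target))
X_target = z_t @ S.T + t @ W.T + 0.1 * rng.normal(size=(n, d))

# Background data: shared latents z_j only, no target-specific structure.
z_b = rng.normal(size=(m, k_shared))
X_background = z_b @ S.T + 0.1 * rng.normal(size=(m, d))
```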
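The training loop in Algorithm 1 is standard black-box variational inference. The paper implements it in Edward, but since no Edward version is given, the minimal sketch below uses PyTorch as a stand-in: a Gaussian variational posterior is reparameterized as z = μ + σ·ε, an unbiased negative-ELBO gradient is computed, and ADAM updates both the variational parameters λ (μ, log σ) and the model parameters Θ (here a single loading matrix W). The Gaussian likelihood and prior, learning rate, and step count are assumptions for illustration.

```python
# Sketch of Algorithm 1: black-box VI with the reparameterization trick and
# ADAM jointly updating variational parameters (lambda) and model
# parameters (Theta). Data, likelihood, and hyperparameters are placeholders.
import torch

torch.manual_seed(0)
d, k = 50, 20
X = torch.randn(100, d)                       # placeholder dataset D

W = torch.randn(d, k, requires_grad=True)     # model parameters Theta
mu = torch.zeros(100, k, requires_grad=True)  # variational params lambda...
log_sigma = torch.zeros(100, k, requires_grad=True)  # ...(mean, log std)

opt = torch.optim.Adam([W, mu, log_sigma], lr=1e-2)
for step in range(2000):                      # "repeat ... until convergence"
    opt.zero_grad()
    eps = torch.randn_like(mu)                # reparameterization trick:
    z = mu + torch.exp(log_sigma) * eps       # z = mu + sigma * eps
    recon = z @ W.T
    # Gaussian log-likelihood (unit variance, up to additive constants).
    log_lik = -0.5 * ((X - recon) ** 2).sum()
    # KL(q(z) || N(0, I)) for a diagonal Gaussian posterior.
    kl = 0.5 * (mu ** 2 + torch.exp(2 * log_sigma) - 2 * log_sigma - 1).sum()
    loss = -(log_lik - kl)                    # negative ELBO (Eqn. 10 analogue)
    loss.backward()                           # unbiased gradient estimate
    opt.step()                                # ADAM update of lambda and Theta
```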
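The corrupted-image dataset from the open-datasets row can be reproduced in spirit with a simple overlay. The paper states only that 30,000 MNIST digits were overlaid on grass images from ImageNet; the pixel-wise maximum rule, 28x28 crop size, and [0, 1] scaling below are assumptions, and the inputs here are random stand-ins rather than the actual datasets.

```python
# Sketch of the corrupted-image construction: a bright MNIST digit
# superimposed on a grayscale grass crop via a pixel-wise maximum.
import numpy as np

def overlay_digit(digit: np.ndarray, grass: np.ndarray) -> np.ndarray:
    """Overlay a 28x28 digit on a 28x28 grass crop, both scaled to [0, 1].

    The digit strokes dominate wherever they are brighter than the
    background texture. The overlay rule itself is an assumption.
    """
    assert digit.shape == grass.shape == (28, 28)
    return np.maximum(digit, grass)

# Usage with random stand-ins for real MNIST / grass data:
rng = np.random.default_rng(0)
digit = (rng.random((28, 28)) > 0.8).astype(float)  # fake sparse "digit"
grass = 0.5 * rng.random((28, 28))                  # fake grass texture
corrupted = overlay_digit(digit, grass)
```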
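Finally, the experiment-setup row pins down the cVAE networks: fully connected encoders and decoders with two hidden layers of 128 and 256 units and ReLU nonlinearities, a 2-dimensional target latent space, and a 20-dimensional shared space. The sketch below follows that description; the 28x28 input size, the layer ordering (128 then 256 in the encoder, mirrored in the decoder), and the use of separate encoders for shared and target latents are assumptions.

```python
# Sketch of the cVAE encoder/decoder architecture from the experiment-setup
# row. Each encoder outputs mean and log-variance for a Gaussian posterior;
# the decoder reconstructs from the concatenated shared + target latents.
import torch.nn as nn

d_in, k_shared, k_target = 28 * 28, 20, 2

def make_encoder(latent_dim: int) -> nn.Module:
    # Two hidden layers (128, 256 units) with ReLU, per the paper.
    return nn.Sequential(
        nn.Linear(d_in, 128), nn.ReLU(),
        nn.Linear(128, 256), nn.ReLU(),
        nn.Linear(256, 2 * latent_dim),  # mean and log-variance
    )

shared_encoder = make_encoder(k_shared)   # q(z | x)
target_encoder = make_encoder(k_target)   # q(t | x)
decoder = nn.Sequential(
    nn.Linear(k_shared + k_target, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, d_in),
)
```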