Gaussian Mixture Variational Autoencoder with Contrastive Learning for Multi-Label Classification
Authors: Junwen Bai, Shufeng Kong, Carla P. Gomes
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We have various setups to validate the performance of C-GMVAE. First, we compare the example-F1, micro-F1 and macro-F1 scores, Hamming accuracies and precision@1 of different methods (a minimal sketch of these metrics appears after the table). Second, we compare their performance when fewer training data are available. Third, an ablation study shows the importance of the proposed modules. Finally, we demonstrate the interpretability of the label embeddings on the eBird dataset. Our code is publicly available. |
| Researcher Affiliation | Academia | Junwen Bai, Shufeng Kong, Carla Gomes (Department of Computer Science, Cornell University, Ithaca, USA). Correspondence to: Shufeng Kong <sk2299@cornell.edu>. |
| Pseudocode | No | The paper describes the C-GMVAE model and its components in text and with diagrams, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available: https://github.com/JunwenBai/c-gmvae |
| Open Datasets | Yes | The feature pre-processing is standard following previous works (Lanchantin et al., 2019; Bai et al., 2020) and the datasets are public: http://mulan.sourceforge.net/datasets-mlc.html |
| Dataset Splits | Yes | Each dataset is separated into training (80%), validation (10%) and testing (10%) splits (a minimal split sketch appears after the table). |
| Hardware Specification | Yes | We use one Nvidia V100 GPU for all experiments. |
| Software Dependencies | No | The paper mentions using Adam (Kingma & Ba, 2014) as an optimizer but does not specify version numbers for other key software components, libraries, or frameworks used in the implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The latent dimensionality is 64. The feature encoder is an MLP with 3 hidden layers of sizes [256, 512, 256]. The label encoder has 2 hidden layers of sizes [512, 256]. The decoder contains 2 hidden layers of sizes [512, 512]. On reuters and bookmarks, we add one more hidden layer with 512 units to the decoder. The embedding size E is 2048 (tuned within the range [512, 1024, 2048, 3072]). We set α = 1 (tuned within [0.1, 0.5, 1, 1.5, 2]), β = 0.5 (tuned within [0.1, 0.5, 1, 1.5, 2.0]) for most runs. We tune learning rates from 0.0001 to 0.004 with interval 0.0002, dropout ratio from [0.3, 0.5, 0.7], and weight decay from [0, 0.01, 0.0001]. Grid search is adopted for tuning. Every batch in our experiments requires less than 16GB memory. The number of epochs is 100 by default. |
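
To make the reported setup concrete, the PyTorch sketch below mirrors the quoted layer sizes (feature encoder [256, 512, 256], label encoder [512, 256], decoder [512, 512], latent dimension 64, embedding size 2048). The module names, the ReLU/dropout placement, the input dimensions, and the omission of the decoder's output head are illustrative assumptions, not the authors' released implementation.

```python
import torch.nn as nn

def mlp(sizes, dropout=0.5):
    """Stack Linear -> ReLU -> Dropout blocks for the given layer sizes."""
    layers = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU(), nn.Dropout(dropout)]
    return nn.Sequential(*layers)

LATENT_DIM = 64    # latent dimensionality reported in the paper
EMBED_DIM = 2048   # embedding size E (tuned within [512, 1024, 2048, 3072])

# Assumed feature/label dimensions, for illustration only.
feature_dim, num_labels = 500, 100

# Feature encoder: 3 hidden layers [256, 512, 256], followed by assumed
# mean and log-variance heads projecting to the latent space.
feat_encoder = mlp([feature_dim, 256, 512, 256])
feat_mu = nn.Linear(256, LATENT_DIM)
feat_logvar = nn.Linear(256, LATENT_DIM)

# Label encoder: 2 hidden layers [512, 256], with the same kind of heads.
label_encoder = mlp([num_labels, 512, 256])
label_mu = nn.Linear(256, LATENT_DIM)
label_logvar = nn.Linear(256, LATENT_DIM)

# Decoder: 2 hidden layers [512, 512] (one extra 512-unit layer is added on
# reuters and bookmarks); the output head is omitted here because its
# dimensionality is not quoted in the table.
decoder = mlp([LATENT_DIM, 512, 512])
```

Training would then grid-search Adam over the quoted ranges: learning rate from 0.0001 to 0.004 in steps of 0.0002, dropout in {0.3, 0.5, 0.7}, and weight decay in {0, 0.01, 0.0001}.

For the evaluation protocol quoted in the Research Type row, the snippet below sketches how example-F1, micro-F1, macro-F1, Hamming accuracy, and precision@1 can be computed from binary label matrices with scikit-learn and NumPy. The 0.5 decision threshold and the function name are assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.metrics import f1_score, hamming_loss

def multilabel_metrics(y_true, y_scores, threshold=0.5):
    """Sketch of the reported multi-label metrics.

    y_true:   (n_samples, n_labels) binary ground-truth matrix
    y_scores: (n_samples, n_labels) predicted scores in [0, 1]
    """
    y_pred = (y_scores >= threshold).astype(int)

    # example-F1: F1 computed per sample, then averaged over samples
    example_f1 = f1_score(y_true, y_pred, average="samples", zero_division=0)
    # micro-F1: counts pooled over all (sample, label) pairs
    micro_f1 = f1_score(y_true, y_pred, average="micro", zero_division=0)
    # macro-F1: per-label F1 averaged over labels
    macro_f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)
    # Hamming accuracy: fraction of correctly predicted label entries
    ham_acc = 1.0 - hamming_loss(y_true, y_pred)
    # precision@1: fraction of samples whose top-scored label is relevant
    top1 = np.argmax(y_scores, axis=1)
    p_at_1 = np.mean(y_true[np.arange(len(y_true)), top1])

    return {"example-F1": example_f1, "micro-F1": micro_f1,
            "macro-F1": macro_f1, "HA": ham_acc, "p@1": p_at_1}
```

The 80%/10%/10% splits quoted in the Dataset Splits row can be reproduced with a standard two-stage random split, as in the minimal sketch below; the fixed seed is an assumption and the authors' exact partitioning may differ.

```python
from sklearn.model_selection import train_test_split

def split_80_10_10(features, labels, seed=0):
    """Minimal 80/10/10 train/validation/test split (seed is an assumption)."""
    X_train, X_rest, y_train, y_rest = train_test_split(
        features, labels, test_size=0.2, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(
        X_rest, y_rest, test_size=0.5, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)
```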
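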
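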