Improving Variational Autoencoders with Density Gap-based Regularization
Authors: Jianfei Zhang, Jun Bai, Chenghua Lin, Yanmeng Wang, Wenge Rong
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments on language modeling, latent space visualization, and interpolation, we show that our proposed method can solve both problems effectively and thus outperforms the existing methods in latent-directed generation. |
| Researcher Affiliation | Collaboration | Jianfei Zhang (1,2), Jun Bai (1,2), Chenghua Lin (3), Yanmeng Wang (4), Wenge Rong (1,2). Affiliations: (1) State Key Laboratory of Software Development Environment, Beihang University, China; (2) School of Computer Science and Engineering, Beihang University, China; (3) Department of Computer Science, University of Sheffield, United Kingdom; (4) Ping An Technology, China |
| Pseudocode | No | The paper describes the methodology in text but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data are available at https://github.com/zhangjf-nlp/DG-VAEs. |
| Open Datasets | Yes | We consider four publicly available datasets commonly used for VAE-based language modeling tasks in our experiments: Yelp [42], Yahoo [42, 44], a downsampled version of Yelp [35] (denoted Short-Yelp), and a downsampled version of SNLI [5, 23]. The statistics of these datasets are reported in Table 1. |
| Dataset Splits | Yes | Table 1 (statistics of sentences in the datasets): every dataset uses 100,000 train / 10,000 valid / 10,000 test sentences. Vocab size and length (avg ± std): Yelp 19,997 and 98.01 ± 48.86; Yahoo 20,001 and 80.76 ± 46.21; Short-Yelp 8,411 and 10.96 ± 3.60; SNLI 9,990 and 11.73 ± 4.33. |
| Hardware Specification | Yes | All models are trained with 4 NVIDIA Tesla V100 32GB GPUs (stated in Appendix B, which the paper's main body references via the ethics checklist). |
| Software Dependencies | No | The paper states in Appendix B that 'Our implementation is based on PyTorch' but does not provide version numbers for PyTorch, Python, or any other libraries. |
| Experiment Setup | Yes | Configurations: We completely follow Zhu et al. [46] in the models' backbone structures, data pre-processing, and training procedure, which we describe in detail in Appendix B. (From Appendix B: "Adam optimizer with an initial learning rate of 0.001", "batch size of 32", "training for 100 epochs", "latent variable dimension is set to 32".) A hedged configuration sketch based on these values follows the table. |
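
The quoted configuration maps directly onto a standard PyTorch training setup. The sketch below is a minimal, hedged illustration of how the reported hyperparameters (Adam with an initial learning rate of 0.001, batch size 32, 100 epochs, 32-dimensional latent variable) would be wired together. The `PlaceholderVAE` class, its layer sizes, and the dummy batch are illustrative assumptions rather than the authors' backbone, and the density gap-based regularizer itself is not shown; the real implementation is at https://github.com/zhangjf-nlp/DG-VAEs.

```python
# Minimal sketch of the training configuration reported in Appendix B.
# PlaceholderVAE is a hypothetical stand-in, not the paper's actual backbone;
# the density gap-based regularization term is intentionally omitted.
import torch
import torch.nn as nn

# Hyperparameters quoted in the paper (Appendix B).
LATENT_DIM = 32       # "latent variable dimension is set to 32"
BATCH_SIZE = 32       # "batch size of 32"
LEARNING_RATE = 1e-3  # "Adam optimizer with an initial learning rate of 0.001"
NUM_EPOCHS = 100      # "training for 100 epochs"


class PlaceholderVAE(nn.Module):
    """Illustrative LSTM-based sentence VAE (an assumption, not the paper's code)."""

    def __init__(self, vocab_size: int, emb_dim: int = 512,
                 hidden_dim: int = 1024, latent_dim: int = LATENT_DIM):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.LSTM(emb_dim + latent_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor):
        emb = self.embedding(tokens)                      # (batch, seq, emb_dim)
        enc_out, _ = self.encoder(emb)
        summary = enc_out[:, -1]                          # last hidden state per sentence
        mu, logvar = self.to_mu(summary), self.to_logvar(summary)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        z_seq = z.unsqueeze(1).expand(-1, emb.size(1), -1)
        dec_out, _ = self.decoder(torch.cat([emb, z_seq], dim=-1))
        logits = self.out(dec_out)                        # (batch, seq, vocab_size)
        kl = 0.5 * torch.sum(torch.exp(logvar) + mu ** 2 - 1.0 - logvar, dim=-1)
        return logits, kl


model = PlaceholderVAE(vocab_size=19997)   # e.g. the Yelp vocabulary size from Table 1
optimizer = torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

# Shape check on a dummy batch of BATCH_SIZE sentences of length 12.
dummy_tokens = torch.randint(0, 19997, (BATCH_SIZE, 12))
logits, kl = model(dummy_tokens)
print(logits.shape, kl.shape)              # torch.Size([32, 12, 19997]) torch.Size([32])
```

The batch size and 100-epoch schedule would be enforced by the data loader and training loop, which are omitted here together with the paper-specific density gap regularization.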