Grammar Variational Autoencoder
Authors: Matt J. Kusner, Brooks Paige, José Miguel Hernández-Lobato
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the GVAE on two tasks for generating discrete data: 1) generating simple arithmetic expressions and 2) generating valid molecules. We show that not only does our model produce a higher proportion of valid outputs than a character-based autoencoder, it also produces smoother latent representations. We also show that this learned latent space is effective for searching for arithmetic expressions that fit data, for finding better drug-like molecules, and for making accurate predictions about target properties. |
| Researcher Affiliation | Academia | ¹Alan Turing Institute, ²University of Warwick, ³University of Cambridge. |
| Pseudocode | Yes | Algorithm 1: Sampling from the decoder (a hedged sketch of this grammar-masked sampling procedure appears after the table). |
| Open Source Code | Yes | Code available at: https://github.com/mkusner/grammarVAE |
| Open Datasets | Yes | The training data for the CVAE and GVAE models are 250,000 SMILES strings (Weininger, 1988) extracted at random from the ZINC database by Gómez-Bombarelli et al. (2016b). |
| Dataset Splits | No | The paper mentions a 'left-out test set with 10% of the data' for evaluating the GP model, but does not explicitly specify a validation set for training the VAE or for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions the types of neural networks used (e.g., LSTMs, GRUs, and deep convolutional neural networks) but does not provide specific software dependencies with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | No | The paper describes the probabilistic setup for the VAE (e.g., 'q(z|X) is a Gaussian distribution whose mean and variance parameters are the output of the encoder network, with an isotropic Gaussian prior p(z) = N(0, I)') and the optimization method ('gradient descent'), but it does not provide concrete hyperparameter values such as the learning rate, batch size, or number of epochs for VAE training. It refers to supplementary material for network architecture details; a minimal sketch of the stated setup appears after the table. |
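
For readers unfamiliar with Algorithm 1, the following is a minimal sketch of grammar-masked sampling in the spirit of the paper's decoder: a stack tracks pending non-terminals, and at each timestep the decoder's logits are masked so that only production rules whose left-hand side matches the top of the stack can be sampled. The toy grammar, function name, and array shapes below are illustrative assumptions, not the paper's actual rule set or network.

```python
import numpy as np

# Toy context-free grammar for arithmetic expressions (an assumption for
# illustration; the paper's actual grammars are larger).
GRAMMAR = [
    ("S", ["S", "+", "T"]),   # rule 0
    ("S", ["T"]),             # rule 1
    ("T", ["(", "S", ")"]),   # rule 2
    ("T", ["x"]),             # rule 3
]
NONTERMINALS = {"S", "T"}

def sample_from_decoder(logits, start="S", max_steps=50):
    """Grammar-masked sampling in the spirit of the paper's Algorithm 1.

    `logits` is a (max_steps, num_rules) array of raw decoder outputs.
    At each step we pop a non-terminal, mask out rules whose left-hand
    side does not match it, renormalize, and sample a valid rule.
    """
    stack = [start]        # last-in, first-out stack of non-terminals
    rules_used = []
    for t in range(max_steps):
        if not stack:      # derivation complete
            break
        lhs = stack.pop()
        mask = np.array([1.0 if rule[0] == lhs else 0.0 for rule in GRAMMAR])
        probs = np.exp(logits[t]) * mask
        probs /= probs.sum()
        choice = np.random.choice(len(GRAMMAR), p=probs)
        rules_used.append(choice)
        # Push the chosen rule's non-terminals rightmost-first so the
        # derivation expands left to right.
        for sym in reversed(GRAMMAR[choice][1]):
            if sym in NONTERMINALS:
                stack.append(sym)
    return rules_used

# Example: uniform logits stand in for a trained decoder's output.
print(sample_from_decoder(np.zeros((50, len(GRAMMAR)))))
```

Because every sampled rule is guaranteed to be applicable, any completed derivation is syntactically valid, which is the property the paper credits for the higher proportion of valid outputs.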
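
Likewise, since the Experiment Setup row quotes only the probabilistic formulation, here is a minimal sketch of that formulation: an encoder that outputs the mean and log-variance of q(z|X), reparameterized sampling, and the closed-form KL divergence against the isotropic Gaussian prior p(z) = N(0, I). The PyTorch framing, layer sizes, and class name are assumptions; the paper leaves architecture details to its supplementary material.

```python
import torch
import torch.nn as nn

class SketchVAE(nn.Module):
    """Minimal VAE skeleton matching the setup the paper states:
    q(z|X) is Gaussian with encoder-produced mean/variance, and the
    prior is p(z) = N(0, I). Linear layers and sizes are placeholders."""

    def __init__(self, input_dim=128, latent_dim=32):
        super().__init__()
        self.encoder = nn.Linear(input_dim, 2 * latent_dim)  # mean and log-variance
        self.decoder = nn.Linear(latent_dim, input_dim)

    def forward(self, x):
        mu, logvar = self.encoder(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
        # Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian.
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=-1)
        return self.decoder(z), kl
```

Training would minimize a reconstruction loss plus `kl` via gradient descent, as the paper states; the missing details flagged in the table (learning rate, batch size, number of epochs) are exactly what this sketch cannot fill in.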