NeVAE: A Deep Generative Model for Molecular Graphs

Authors: Bidisha Samanta, Abir De, Gourhari Jana, Pratim Kumar Chattaraj, Niloy Ganguly, Manuel Gomez Rodriguez
Pages: 1110-1117

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 'Experiments reveal that our model can discover plausible, diverse and novel molecules more effectively than several state of the art methods.' and 'We evaluate our model using molecules from two publicly available datasets, ZINC (Irwin et al. 2012) and QM9 (Ramakrishnan et al. 2014), and show that our model beats the state of the art in terms of several relevant quality metrics, i.e., validity, novelty and uniqueness.'
Researcher Affiliation | Academia | Bidisha Samanta (IIT Kharagpur, bidisha@iitkgp.ac.in); Abir De (MPI-SWS, ade@mpi-sws.org); Gourhari Jana (IIT Kharagpur, gour2015hari@iitkgp.ac.in); Pratim Kumar Chattaraj (IIT Kharagpur, pkc@chem.iitkgp.ernet.in); Niloy Ganguly (IIT Kharagpur, niloy@cse.iitkgp.ac.in); Manuel Gomez Rodriguez (MPI-SWS, manuelgr@mpi-sws.org)
Pseudocode | No | The paper includes architectural diagrams (Figure 1 and Figure 2) but no formal pseudocode or algorithm blocks.
Open Source Code | Yes | 'We are releasing an open source implementation of our model in Tensorflow.' Footnote 1: https://github.com/Networks-Learning/nevae
Open Datasets | Yes | 'We sample 10,000 drug-like commercially available molecules from the ZINC dataset (Irwin et al. 2012) with E[n] = 44 atoms and 10,000 molecules from the QM9 dataset (Ramakrishnan et al. 2014; Ruddigkeit et al. 2012) with E[n] = 21 atoms.'
Dataset Splits | No | The paper mentions training the VAE on 10,000 molecules each from the ZINC and QM9 datasets but does not give explicit training/validation/test splits for the VAE itself. For the Bayesian optimization sub-experiment, it mentions splitting 3,000 molecules from ZINC into training (90%) and test (10%) sets for the sparse Gaussian process.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models or types) used for running its experiments.
Software Dependencies | No | The paper mentions 'Tensorflow' as the implementation framework but does not specify its version number or any other versioned software dependencies.
Experiment Setup | No | The paper describes the batching strategy ('batches comprised of molecules with the same number of nodes') and how masking is used during training and sampling, but it does not give explicit hyperparameter values such as the learning rate, the optimizer, or the number of epochs.
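
The two data-handling details the summary does report (batches made of molecules with the same number of nodes, and a 90%/10% split of 3,000 ZINC molecules for the sparse Gaussian process) can be illustrated with a minimal Python sketch. This is not the authors' code: the dictionary representation with a `num_nodes` key, the function names, the batch size, and the random seed are all hypothetical choices made for illustration, since the paper does not state them.

```python
import random
from collections import defaultdict


def batches_by_node_count(molecules, batch_size, seed=0):
    """Build batches in which every molecule has the same number of nodes.

    `molecules` is assumed (hypothetically) to be a list of dicts that each
    carry a "num_nodes" entry; the paper does not prescribe a data format.
    """
    rng = random.Random(seed)

    # Bucket molecules by their node count.
    buckets = defaultdict(list)
    for mol in molecules:
        buckets[mol["num_nodes"]].append(mol)

    # Cut each bucket into chunks of at most `batch_size` molecules.
    batches = []
    for n in sorted(buckets):
        group = buckets[n]
        rng.shuffle(group)
        for i in range(0, len(group), batch_size):
            batches.append(group[i:i + batch_size])

    # Shuffle batch order so training does not sweep sizes monotonically.
    rng.shuffle(batches)
    return batches


def train_test_split(items, train_fraction=0.9, seed=0):
    """Shuffle `items` and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]
```

With 3,000 items and the default `train_fraction=0.9`, `train_test_split` returns 2,700 training items and 300 test items, matching the proportions reported for the Bayesian optimization sub-experiment. The last batch of each node-count bucket may be smaller than `batch_size`; whether the original implementation pads, drops, or keeps such batches is not stated.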