Goal-directed Generation of Discrete Structures with Conditional Generative Models

Authors: Amina Mollaysa, Brooks Paige, Alexandros Kalousis

NeurIPS 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our methodology on two tasks: generating molecules with user-defined properties, and identifying short Python expressions which evaluate to a given target value. In both cases we find improvements over maximum likelihood estimation and other baselines. The paper includes sections such as '3 Experiments', '3.1 Conditional generation of mathematical expressions', and '3.2 Conditional generation of molecules', along with tables presenting quantitative results and comparisons. |
| Researcher Affiliation | Academia | Amina Mollaysa (University of Geneva; University of Applied Sciences Western Switzerland; maolaaisha.aminanmu@hesge.ch), Brooks Paige (University College London; Alan Turing Institute; b.paige@ucl.ac.uk), Alexandros Kalousis (University of Geneva; University of Applied Sciences Western Switzerland; alexandros.kalousis@hesge.ch) |
| Pseudocode | Yes | Listing 1: CFG for inverse calculator |
| Open Source Code | No | The paper contains no explicit statement or link indicating that the source code for its methodology is open-source or publicly available. It mentions RDKit (open-source chemoinformatics software) but does not release its own code. |
| Open Datasets | Yes | We experiment with two datasets: QM9 [19], which contains 133k organic compounds that have up to nine heavy atoms, and ChEMBL [17], of molecules that have been synthesized and tested against biological targets. |
| Dataset Splits | Yes | We split them into training, test, and validation subsets. |
| Hardware Specification | No | The paper does not specify the hardware (e.g., GPU models, CPU models, or memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions RDKit and a 3-layer stacked LSTM sequence model, but provides no version numbers for any software dependencies or frameworks. |
| Experiment Setup | Yes | In all experiments, we model the conditional distributions pθ(x\|y) using a 3-layer stacked LSTM sequence model; for architecture details see Appendix D and Fig. D.3. We determine the values of the hyper-parameters based on the statistics of the ℓ1(f(xj), yi) distance on the training set; details are explained in section D. For training purposes we normalize the property values to zero mean and a standard deviation of one. |
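The experiment-setup row quotes two preprocessing steps: normalizing property values to zero mean and unit standard deviation, and measuring the ℓ1 distance between a generated sample's properties f(x) and a target y on the training set. A minimal sketch of both steps (the array values and variable names below are illustrative, not taken from the paper):

```python
import numpy as np

# Illustrative property values for a handful of training molecules
# (two properties per molecule; not real QM9/ChEMBL data).
y_train = np.array([[1.2, 0.4],
                    [0.8, 1.6],
                    [2.0, 0.0]])

# Normalize each property to zero mean and unit standard deviation,
# as described in the paper's experiment setup.
mu = y_train.mean(axis=0)
sigma = y_train.std(axis=0)
y_norm = (y_train - mu) / sigma

def l1_distance(f_x, y):
    """l1 distance between computed properties f(x) and a target y,
    the statistic the paper uses to pick hyper-parameter values."""
    return np.abs(np.asarray(f_x) - np.asarray(y)).sum()
```

After normalization, each column of `y_norm` has mean 0 and standard deviation 1, so the ℓ1 statistics computed on the training set are comparable across properties with different scales.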
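The pseudocode row refers to the paper's Listing 1, a context-free grammar (CFG) for the inverse-calculator task: generating short Python expressions that evaluate to a given target value. The paper's exact production rules are not reproduced here; the sketch below shows a hypothetical grammar of the same flavor, with nonterminals expanded by uniform random choice:

```python
import random

# Hypothetical CFG in the spirit of the paper's Listing 1 (the actual
# production rules in the paper may differ): expressions are a digit,
# or two digits joined by a binary operator.
GRAMMAR = {
    "expr": [["term", "op", "term"], ["term"]],
    "term": [["digit"]],
    "op": [["+"], ["-"], ["*"]],
    "digit": [[d] for d in "123456789"],
}

def sample(symbol="expr", rng=random):
    """Expand a nonterminal by picking a random production rule;
    symbols absent from GRAMMAR are terminals and returned as-is."""
    if symbol not in GRAMMAR:
        return symbol
    rule = rng.choice(GRAMMAR[symbol])
    return "".join(sample(s, rng) for s in rule)

expr = sample()     # e.g. a string like "3*7"
value = eval(expr)  # the target value the conditional model is given
```

Sampling expression/value pairs this way yields a training set for the conditional model pθ(x|y): the expression is the discrete structure x and its evaluated result is the conditioning target y.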