Discrete Object Generation with Reversible Inductive Construction

Authors: Ari Seff, Wenda Zhou, Farhan Damani, Abigail Doyle, Ryan P. Adams

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate the proposed approach on two highly structured discrete domains, molecules and Laman graphs, and find that it compares favorably to alternative methods at capturing distributional statistics for a host of semantically relevant metrics. Quantitative evaluation indicates that the proposed method can effectively model highly structured discrete distributions while adhering to strict validity constraints. |
| Researcher Affiliation | Academia | Ari Seff, Princeton University, Princeton, NJ (aseff@princeton.edu); Wenda Zhou, Columbia University, New York, NY (wz2335@columbia.edu); Farhan Damani, Princeton University, Princeton, NJ (fdamani@princeton.edu); Abigail Doyle, Princeton University, Princeton, NJ (agdoyle@princeton.edu); Ryan P. Adams, Princeton University, Princeton, NJ (rpa@princeton.edu) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks for its own method. |
| Open Source Code | Yes | We formulate our approach, generative reversible inductive construction (GenRIC), as the equilibrium distribution of a Markov chain that only visits valid objects, without a need for inefficient rejection sampling. Footnote 1: https://github.com/PrincetonLIPS/reversible-inductive-construction |
| Open Datasets | Yes | For molecules we test the proposed approach on the ZINC dataset, which contains about 250K drug-like molecules from the ZINC database [35]. For Laman graphs, we generate synthetic graphs randomly via Algorithm 7 from Moussaoui [29], originally proposed for evaluating geometric constraint solvers embedded within CAD programs. |
| Dataset Splits | Yes | The model is trained on 220K molecules according to the same train/test split as in Jin et al. [19], Kusner et al. [21]. |
| Hardware Specification | No | We acknowledge computing resources from Columbia University's Shared Research Computing Facility project, which is supported by NIH Research Facility Improvement Grant 1G20RR030893-01, and associated funds from the New York State Empire State Development, Division of Science Technology and Innovation (NYSTAR) Contract C090171, both awarded April 15, 2010. This statement describes funding and a facility but lacks specific hardware details (e.g., GPU/CPU models). |
| Software Dependencies | No | The paper mentions software like 'RDKit [24]' but does not provide specific version numbers for the software dependencies used in its own experiments. |
| Experiment Setup | No | Unless otherwise stated, the results reported in Sections 3 and 4, use a geometric distribution with five expected steps for the corruption sequence length. For each method, we obtain 20K samples either by running pre-trained models [19, 14, 21], by accessing pre-sampled sets [26, 34, 25], or by training models from scratch [33]. While some details are given, comprehensive hyperparameter values (e.g., learning rate, batch size, specific optimizer settings) are not provided in the main text. |
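
The Experiment Setup entry mentions a geometric distribution with five expected steps for the corruption sequence length. As a rough illustration only, the sketch below draws such lengths using the Python standard library. It assumes the geometric distribution is supported on {1, 2, ...} with success probability p = 1/5 (so the mean is 5); the paper's exact parameterization is not stated in this summary, and the function name `sample_corruption_length` is hypothetical, not taken from the authors' code.

```python
import math
import random


def sample_corruption_length(rng, expected_steps=5.0):
    """Draw one length from a geometric distribution on {1, 2, ...}.

    Assumption: success probability p = 1 / expected_steps, so the mean
    length equals expected_steps. This is an illustrative choice, not a
    parameterization confirmed by the paper.
    """
    p = 1.0 / expected_steps
    # Inverse-transform sampling: v lies in (0, 1], which keeps log(v) finite.
    v = 1.0 - rng.random()
    return int(math.log(v) / math.log(1.0 - p)) + 1


rng = random.Random(0)  # fixed seed for repeatability
lengths = [sample_corruption_length(rng) for _ in range(100_000)]
mean_length = sum(lengths) / len(lengths)
print(f"empirical mean length: {mean_length:.2f}")
```

With a large number of draws the empirical mean should sit close to 5, and every sampled length is at least 1, so each corruption sequence takes at least one step under this assumption.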