Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation

Authors: Emmanuel Bengio, Moksh Jain, Maksym Korablyov, Doina Precup, Yoshua Bengio

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We prove that any global minimum of the proposed objectives yields a policy which samples from the desired distribution, and demonstrate the improved performance and diversity of GFlowNet on a simple domain where there are many modes to the reward function, and on a molecule synthesis task." |
| Researcher Affiliation | Collaboration | ¹Mila, ²McGill University, ³Université de Montréal, ⁴DeepMind, ⁵Microsoft |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | "All implementations are available at https://github.com/bengioe/gflownet." |
| Open Datasets | Yes | "We pretrain the proxy with a semi-curated semi-random dataset of 300k molecules (see A.4) down to a test MSE of 0.6; molecules are scored according to the docking score (Trott and Olson, 2010), renormalized so that most scores fall between 0 and 10 (to have R(x) > 0). ... The dataset itself comes from the ZINC database (Sterling and Irwin, 2015), prefiltered for molecules below 20 heavy atoms, randomly subsampled." |
| Dataset Splits | No | The paper mentions training, testing, and initial dataset sizes, but it does not specify the explicit percentages or absolute counts for the training, validation, and test splits needed to reproduce the experiment. |
| Hardware Specification | Yes | "All experiments were run on a server with an Intel Xeon Gold 6248 CPU @ 2.50GHz, and NVIDIA Quadro RTX 6000 or A6000 GPUs." |
| Software Dependencies | No | The paper mentions several software tools and libraries, such as RDKit, PyTorch, and AutoDock Vina, but it does not specify version numbers for these components. |
| Experiment Setup | Yes | "We run the above experiment for R₀ ∈ {10⁻¹, 10⁻², 10⁻³} with n = 4, H = 8. ... During training, sampling follows exploratory policy P(a\|s) which is a mixture between π(a\|s) (Eq. 5), used with probability 0.95, and a uniform distribution over allowed actions with probability 0.05. ... For the molecule discovery task, we initialize an MPNN proxy to predict docking scores from AutoDock (Trott and Olson, 2010), with \|D₀\| = 2000 molecules. At the end of each round we generate 200 molecules which are evaluated with AutoDock and used to update the proxy." |
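The exploratory sampling described in the experiment setup, a mixture that follows the learned policy π(a|s) with probability 0.95 and a uniform distribution over allowed actions with probability 0.05, can be sketched as below. This is a minimal illustration, not the authors' implementation: the function name, the `epsilon` parameter, and the representation of the policy as a NumPy probability vector over allowed actions are all assumptions made for this sketch.

```python
import numpy as np

def exploratory_policy_sample(pi_probs, epsilon=0.05, rng=None):
    """Sample one action from the mixture
    (1 - epsilon) * pi(a|s) + epsilon * Uniform(allowed actions).

    pi_probs : probability vector over the allowed actions in state s
               (illustrative representation; assumed to sum to 1).
    epsilon  : uniform-exploration weight (0.05 matches the paper's
               0.95 / 0.05 mixture).
    """
    rng = rng or np.random.default_rng()
    pi_probs = np.asarray(pi_probs, dtype=float)
    n_actions = len(pi_probs)
    # Mix the policy with a uniform distribution over allowed actions.
    mixed = (1.0 - epsilon) * pi_probs + epsilon / n_actions
    return rng.choice(n_actions, p=mixed)
```

Mixing the distributions (rather than branching on a coin flip) gives the same marginal sampling probabilities and keeps the mixture density available in closed form if it is ever needed.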