Genetic-guided GFlowNets for Sample Efficient Molecular Optimization
Authors: Hyeonah Kim, Minsu Kim, Sanghyeok Choi, Jinkyoo Park
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments demonstrate the effectiveness and practical applicability of the proposed method. First, our method achieves the highest total score of 16.213 across 23 oracles in the Practical Molecular Optimization benchmark [8], outperforming all other baselines. |
| Researcher Affiliation | Collaboration | ¹Korea Advanced Institute of Science and Technology (KAIST), ²OMELET |
| Pseudocode | Yes | Algorithm 1: Genetic GFN training with limited reward calls |
| Open Source Code | Yes | The codes are available at https://github.com/hyeonahkimm/genetic_gfn. |
| Open Datasets | Yes | According to the PMO benchmark guidelines [8], the pre-training is conducted on ZINC 250K. |
| Dataset Splits | No | The paper describes the PMO benchmark's evaluation protocol (e.g., AUC under a limited oracle-call budget) but does not specify train/validation/test splits — percentages, sample counts, or references to predefined splits — for the data used to train the model. |
| Hardware Specification | Yes | Throughout the experiments, we utilize a 48-core CPU, Intel(R) Xeon(R) Gold 5317 CPU @ 3.00GHz, and a single GPU. |
| Software Dependencies | No | The paper mentions implementing Genetic GFN on top of the PMO benchmark source code and adopting the REINVENT implementation, but it does not specify version numbers for its software dependencies (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | For instance, the batch size and learning rate are set as 64 and 0.0005 according to REINVENT in the PMO benchmark. On the other hand, the mutation rate and the number of training loops are set to 0.01 and 8 following GEGL. We use 64 samples for the replay training and population size, the same as the batch size without tuning. Lastly, the learning rate of Z, the partition function, is set to 0.1, also without tuning. |
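To make the Pseudocode and Experiment Setup rows concrete, below is a minimal, self-contained sketch of the outer loop that Algorithm 1 describes: sample from the GFlowNet policy, refine the samples with genetic search, and train the policy on a rank-based replay buffer with a trajectory-balance-style objective (the quoted learning rate for Z, the partition function, points to such an objective). This is not the authors' implementation. The toy alphabet, `toy_oracle`, `TinyPolicy`, `genetic_search`, the fixed sequence length, and the replay-buffer size of 1000 are illustrative assumptions; the paper works on SMILES strings with the 23 PMO oracles and a REINVENT-style RNN. Only the quoted hyperparameters (batch/population size 64, learning rate 0.0005, Z learning rate 0.1, mutation rate 0.01, 8 training loops, 10K oracle calls per the PMO benchmark) come from the paper.

```python
"""Illustrative sketch of 'Algorithm 1: Genetic GFN training with limited
reward calls'. All helper names and the toy reward are assumptions."""
import random

import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB = list("CNO=()#")   # toy alphabet standing in for SMILES tokens
SEQ_LEN = 12              # fixed toy sequence length (assumption)

# Hyperparameters quoted in the Experiment Setup row above.
BATCH_SIZE = 64           # batch size, replay batch, and population size
LR_POLICY = 0.0005        # policy learning rate (REINVENT setting in PMO)
LR_LOG_Z = 0.1            # learning rate of Z, the partition function
MUTATION_RATE = 0.01      # per-token mutation probability (following GEGL)
TRAIN_LOOPS = 8           # inner training loops per round (following GEGL)
ORACLE_BUDGET = 10_000    # PMO limits the number of oracle calls

def toy_oracle(s: str) -> float:
    """Hypothetical stand-in reward; PMO uses 23 molecular oracles."""
    return s.count("C") / len(s)

def decode(tokens) -> str:
    return "".join(VOCAB[t] for t in tokens)

class TinyPolicy(nn.Module):
    """Small autoregressive policy, a stand-in for REINVENT's RNN prior."""
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.emb = nn.Embedding(len(VOCAB) + 1, hidden)  # last index = BOS
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.head = nn.Linear(hidden, len(VOCAB))

    def log_probs(self, tokens: torch.Tensor) -> torch.Tensor:
        """Total forward log-probability log P_F(x) of each sequence."""
        bos = torch.full((tokens.size(0), 1), len(VOCAB), dtype=torch.long)
        hidden, _ = self.gru(self.emb(torch.cat([bos, tokens[:, :-1]], 1)))
        logp = F.log_softmax(self.head(hidden), dim=-1)
        return logp.gather(-1, tokens.unsqueeze(-1)).squeeze(-1).sum(-1)

    @torch.no_grad()
    def sample(self, n: int) -> list:
        tok = torch.full((n, 1), len(VOCAB), dtype=torch.long)  # BOS
        state, out = None, []
        for _ in range(SEQ_LEN):
            step, state = self.gru(self.emb(tok), state)
            tok = torch.multinomial(F.softmax(self.head(step[:, -1]), -1), 1)
            out.append(tok)
        return torch.cat(out, 1).tolist()

def genetic_search(population, scores):
    """Rank-based selection, one-point crossover, and token mutation."""
    ranked = [p for _, p in sorted(zip(scores, population), reverse=True)]
    offspring = []
    while len(offspring) < BATCH_SIZE:
        p1, p2 = random.choices(ranked[: BATCH_SIZE // 2], k=2)
        cut = random.randrange(1, SEQ_LEN)
        child = [random.randrange(len(VOCAB))
                 if random.random() < MUTATION_RATE else t
                 for t in p1[:cut] + p2[cut:]]
        offspring.append(child)
    return offspring

policy = TinyPolicy()
log_z = torch.zeros(1, requires_grad=True)  # learnable log-partition function
opt = torch.optim.Adam([
    {"params": policy.parameters(), "lr": LR_POLICY},
    {"params": [log_z], "lr": LR_LOG_Z},
])

replay, oracle_calls = [], 0
while oracle_calls < ORACLE_BUDGET:
    # 1) Sample a population from the policy and score it with the oracle.
    pop = policy.sample(BATCH_SIZE)
    rewards = [toy_oracle(decode(x)) for x in pop]
    # 2) Refine the population with genetic search and score the offspring.
    children = genetic_search(pop, rewards)
    child_rewards = [toy_oracle(decode(x)) for x in children]
    oracle_calls += len(pop) + len(children)
    # 3) Keep the highest-reward samples in a rank-based replay buffer.
    replay = sorted(replay + list(zip(pop + children, rewards + child_rewards)),
                    key=lambda x: x[1], reverse=True)[:1000]
    # 4) Train for TRAIN_LOOPS steps with a trajectory-balance-style loss
    #    (any reward shaping, e.g., a temperature exponent, is omitted here).
    for _ in range(TRAIN_LOOPS):
        batch = random.sample(replay, min(BATCH_SIZE, len(replay)))
        toks = torch.tensor([x for x, _ in batch])
        log_r = torch.tensor([max(r, 1e-6) for _, r in batch]).log()
        loss = (log_z + policy.log_probs(toks) - log_r).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

One detail worth noting from the quoted setup: the partition-function parameter gets a far larger learning rate (0.1) than the policy (0.0005). Giving log Z its own, much higher learning rate is a common choice in trajectory-balance training, since the scalar partition estimate must keep pace with a changing policy.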