Learning to Scale Logits for Temperature-Conditional GFlowNets
Authors: Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woo Chang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our code is available at https://github.com/dbsxodud-11/logit-gfn. Our online learning with the Logit-GFN stands out with superior performance compared to GFN and alternative benchmarks, including well-established techniques in Reinforcement Learning (RL) (Schulman et al., 2017; Haarnoja et al., 2017) and Markov Chain Monte Carlo (MCMC) methods (Xie et al., 2020). We present experimental results on 4 biochemical tasks: QM9, sEH, TFBind8, and RNA-binding. |
| Researcher Affiliation | Collaboration | 1Work performed while the author was at the Mila Québec AI Institute 2Korea Advanced Institute of Science and Technology 3Mila Québec AI Institute 4Université de Montréal 5Hong Kong University of Science and Technology 6Recursion 7CIFAR. |
| Pseudocode | Yes | Algorithm 1 Scientific Discovery with Temperature-Conditional GFlowNets |
| Open Source Code | Yes | Our code is available at https://github.com/dbsxodud-11/logit-gfn |
| Open Datasets | Yes | QM9: In the QM9 task, we build an offline dataset D using under-50th-percentile data, which consists of 29,382 samples. TFBind8: In the TFBind8 task, we follow the method suggested in Design-Bench (Trabucco et al., 2022). We build an offline dataset D using under-50th-percentile data, which consists of 32,898 samples. RNA-Binding: In the RNA-binding task, we follow the method suggested in BootGen (Kim et al., 2023). We prepare an offline dataset consisting of 5,000 randomly generated RNA sequences. |
| Dataset Splits | No | The paper mentions using an 'offline dataset' and querying with different β values, but does not explicitly state the training, validation, and test dataset splits (e.g., 80/10/10 percentages or specific sample counts for each split) or any cross-validation strategy. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and refers to prior work for GFlowNet implementations but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | For QM9 and sEH tasks, we employ a two-layer architecture with 1024 hidden units, while for the other tasks, we choose to use a two-layer architecture with 128 hidden units. ... we employ the Adam optimizer ... with the following learning rates: 1 × 10⁻² for Zθ and 1 × 10⁻⁴ for both the forward and backward policy. ... Table 1 summarizes the reward exponent and normalization constants for different task settings. For each active round, we generate 32 samples for evaluating loss. ... We perform 1 gradient step per active round and use 32 samples from PRT to compute loss. For temperature-conditional GFlowNets, we introduce a two-layer MLP with a 32-dimensional hidden layer and a LeakyReLU activation function for embedding inverse temperature β. Table 2. Temperature Distributions of Temperature-conditioned GFlowNets for various tasks. (A minimal configuration sketch of this setup is given below the table.) |
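
The sketch below is not the authors' released code; it is a hypothetical PyTorch reconstruction of the reported configuration (two-layer policy MLP with 1024 hidden units for QM9/sEH or 128 for the other tasks, a two-layer 32-dimensional LeakyReLU MLP embedding the inverse temperature β, and Adam with learning rate 1e-2 for Zθ and 1e-4 for the policies). The state/action dimensions and module names are placeholders.

```python
# Hypothetical sketch of the reported Logit-GFN training setup (not the released code).
import torch
import torch.nn as nn

STATE_DIM, NUM_ACTIONS = 64, 12   # placeholder task dimensions
HIDDEN = 1024                     # 1024 for QM9 / sEH, 128 for the other tasks

# Two-layer MLP with a 32-dimensional hidden layer and LeakyReLU that embeds
# the scalar inverse temperature beta.
beta_embedding = nn.Sequential(
    nn.Linear(1, 32),
    nn.LeakyReLU(),
    nn.Linear(32, 32),
)

# Two-layer forward policy conditioned on the state and the beta embedding.
forward_policy = nn.Sequential(
    nn.Linear(STATE_DIM + 32, HIDDEN),
    nn.ReLU(),
    nn.Linear(HIDDEN, NUM_ACTIONS),
)

# Z_theta, the learned (log) partition function used in GFlowNet training.
log_Z = nn.Parameter(torch.zeros(1))

# Adam with the reported per-parameter-group learning rates:
# 1e-2 for Z_theta and 1e-4 for the policy and embedding networks.
optimizer = torch.optim.Adam([
    {"params": [log_Z], "lr": 1e-2},
    {"params": list(forward_policy.parameters()) + list(beta_embedding.parameters()), "lr": 1e-4},
])
```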