Learning to Scale Logits for Temperature-Conditional GFlowNets

Authors: Minsu Kim, Joohwan Ko, Taeyoung Yun, Dinghuai Zhang, Ling Pan, Woo Chang Kim, Jinkyoo Park, Emmanuel Bengio, Yoshua Bengio

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our code is available at https://github.com/dbsxodud-11/logit-gfn. Our online learning with the Logit-GFN stands out with superior performance compared to GFN and alternative benchmarks, including well-established techniques in Reinforcement Learning (RL) (Schulman et al., 2017; Haarnoja et al., 2017) and Markov Chain Monte Carlo (MCMC) methods (Xie et al., 2020). We present experimental results on 4 biochemical tasks: QM9, sEH, TFBind8, and RNA-binding.
Researcher Affiliation | Collaboration | 1) Work performed while the author was at the Mila Québec AI Institute; 2) Korea Advanced Institute of Science and Technology; 3) Mila Québec AI Institute; 4) Université de Montréal; 5) Hong Kong University of Science and Technology; 6) Recursion; 7) CIFAR.
Pseudocode | Yes | Algorithm 1: Scientific Discovery with Temperature-Conditional GFlowNets (a hedged sketch of such a loop is given below the table).
Open Source Code | Yes | Our code is available at https://github.com/dbsxodud-11/logit-gfn
Open Datasets | Yes | QM9: In the QM9 task, we build an offline dataset D using under-50th-percentile data, which consists of 29,382 samples. TFBind8: In the TFBind8 task, we follow the method suggested in Design-Bench (Trabucco et al., 2022). We build an offline dataset D using under-50th-percentile data, which consists of 32,898 samples. RNA-Binding: In the RNA-binding task, we follow the method suggested in BootGen (Kim et al., 2023). We prepare an offline dataset consisting of 5,000 randomly generated RNA sequences. (A minimal sketch of the percentile filtering step is given below the table.)
Dataset Splits | No | The paper mentions using an 'offline dataset' and querying with different β values, but does not explicitly state training, validation, and test splits (e.g., 80/10/10 percentages or per-split sample counts) or any cross-validation strategy.
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer and refers to prior work for GFlowNet implementations, but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup | Yes | For the QM9 and sEH tasks, we employ a two-layer architecture with 1024 hidden units, while for the other tasks we use a two-layer architecture with 128 hidden units. ... we employ the Adam optimizer ... with the following learning rates: 1×10⁻² for Zθ and 1×10⁻⁴ for both the forward and backward policy. ... Table 1 summarizes the reward exponent and normalization constants for different task settings. For each active round, we generate 32 samples for evaluating loss. ... We perform 1 gradient step per active round and use 32 samples from PRT to compute the loss. For temperature-conditional GFlowNets, we introduce a two-layer MLP with a 32-dimensional hidden layer and a LeakyReLU activation function for embedding the inverse temperature β. Table 2: Temperature distributions of temperature-conditioned GFlowNets for various tasks. (A hedged configuration sketch is given below the table.)
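
The paper's Algorithm 1 is referenced above only by its title. The following is a minimal sketch, assuming a standard active-learning loop for temperature-conditional GFlowNets consistent with the quoted setup (β sampled from a task-specific temperature distribution, 32 samples per active round, one gradient step per round). Every name here (gflownet, reward_fn, temperature_dist, replay_buffer) is a placeholder, not the authors' API; the actual implementation is in the linked repository.

```python
def active_learning_loop(gflownet, reward_fn, temperature_dist,
                         replay_buffer, optimizer, num_rounds,
                         samples_per_round=32):
    """Hedged sketch of a temperature-conditional GFlowNet training loop.

    All arguments are hypothetical stand-ins for the paper's components;
    see https://github.com/dbsxodud-11/logit-gfn for the real code.
    """
    for _ in range(num_rounds):
        # Draw an inverse temperature from the task-specific distribution
        # (the paper's Table 2 lists these distributions per task).
        beta = temperature_dist.sample()

        # Generate candidates with the beta-conditioned forward policy.
        trajectories = gflownet.sample(beta, n=samples_per_round)

        # Score the terminal objects and store them for replay training.
        rewards = reward_fn(trajectories)
        replay_buffer.add(trajectories, rewards, beta)

        # One gradient step per active round on a replayed minibatch,
        # matching the quoted "1 gradient step per active round" setup.
        batch = replay_buffer.sample(samples_per_round)
        loss = gflownet.loss(batch)  # e.g., a trajectory-balance-style loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```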
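
The offline datasets quoted above are built by keeping only samples that score below the task's 50th percentile. A minimal sketch of that filtering step, assuming the raw designs and scores are available as NumPy arrays (the names raw_designs and raw_scores are illustrative, not from the released code):

```python
import numpy as np

def build_offline_dataset(raw_designs, raw_scores):
    """Keep samples scoring below the 50th percentile, as described for
    QM9 (29,382 samples) and TFBind8 (32,898 samples).

    `raw_designs` and `raw_scores` are hypothetical inputs; Design-Bench
    (Trabucco et al., 2022) provides the actual TFBind8 data.
    """
    threshold = np.percentile(raw_scores, 50)
    mask = raw_scores < threshold
    return raw_designs[mask], raw_scores[mask]
```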
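
The quoted experiment setup (two-layer policy networks with 1024 or 128 hidden units, a two-layer temperature-embedding MLP with a 32-dimensional hidden layer and LeakyReLU, and Adam with learning rate 1×10⁻² for Zθ and 1×10⁻⁴ for both policies) could be wired up roughly as below. Module names, the embedding output width, the policy activation, and the state/action sizes are assumptions, not the released code:

```python
import torch
import torch.nn as nn

class TemperatureEmbedding(nn.Module):
    """Two-layer MLP with a 32-dim hidden layer and LeakyReLU that embeds
    the scalar inverse temperature beta (output width is an assumption)."""
    def __init__(self, out_dim=32, hidden_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden_dim),
            nn.LeakyReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, beta):
        return self.net(beta.view(-1, 1))

def make_policy(in_dim, num_actions, hidden_dim):
    """Two-layer policy head; hidden_dim is 1024 for QM9/sEH and 128 for
    the other tasks. The ReLU activation here is an assumption."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim),
        nn.ReLU(),
        nn.Linear(hidden_dim, num_actions),
    )

# Illustrative sizes only; the state/action dimensions are task-dependent.
in_dim, num_actions, hidden_dim = 64, 10, 128
forward_policy = make_policy(in_dim, num_actions, hidden_dim)
backward_policy = make_policy(in_dim, num_actions, hidden_dim)
log_Z = nn.Parameter(torch.zeros(1))  # log-partition estimate (Z_theta)

# Adam with the quoted per-group learning rates.
optimizer = torch.optim.Adam([
    {"params": [log_Z], "lr": 1e-2},
    {"params": forward_policy.parameters(), "lr": 1e-4},
    {"params": backward_policy.parameters(), "lr": 1e-4},
])
```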