Learning GFlowNets From Partial Episodes For Improved Convergence And Stability
Authors: Kanika Madan, Jarrid Rector-Brooks, Maksym Korablyov, Emmanuel Bengio, Moksh Jain, Andrei Cristian Nica, Tom Bosc, Yoshua Bengio, Nikolay Malkin
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two synthetic and four real-world domains support three empirical claims: (1) SubTB(λ) improves convergence of GFlowNets in previously studied environments: models trained with SubTB(λ) approach the target distribution in fewer training steps and are less sensitive to hyperparameter choices. (2) SubTB(λ) enables training of GFlowNets in environments where past approaches perform poorly due to sparsity of the reward function or length of action sequences. (3) The benefits of SubTB(λ) are explained by lower variance of the stochastic gradient, with the parameter λ interpolating between the high-bias, low-variance DB objective and the low-bias, high-variance TB objective (see the objective sketch below the table). |
| Researcher Affiliation | Collaboration | ¹Mila – Québec AI Institute, ²Université de Montréal, ³Recursion, ⁴Politehnica University of Bucharest, ⁵CIFAR Fellow. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to its own open-source code or explicitly state that its code for the methodology is released. It mentions using published code from other papers: "All experiments with SubTB(λ) are based upon the published code of Malkin et al. (2022), which extends that of Bengio et al. (2021a)." |
| Open Datasets | Yes | We take 6438 known AMP sequences and 9522 non-AMP sequences from the DBAASP database Pirtskhalava et al. (2021). |
| Dataset Splits | Yes | The classifier that serves as the proxy reward function is trained on this dataset, using 20% of the data as the validation set. |
| Hardware Specification | No | The paper mentions "computational resources provided by the Digital Research Alliance of Canada" but does not provide specific hardware details such as GPU/CPU models, memory amounts, or cloud instance types. |
| Software Dependencies | No | The paper mentions software like "Adam optimizer", "Transformer", "PyRosetta", and "AutoDock Vina" but does not specify their version numbers or other key library versions necessary for reproducibility (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The optimal learning rate for each experiment is chosen from {0.0005, 0.00075, 0.001, 0.003, 0.005, 0.0075, 0.01}, and λ = 0.9 is chosen as the optimal value from the set {0.8, 0.9, 0.99}. All models are trained with the Adam optimizer and a batch size of 16 for a total of 10⁶ trajectories (62,500 batches). (See the setup sketch below the table.) |
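
For context on claim (3) above: the SubTB(λ) objective scores every sub-trajectory of a sampled trajectory τ = (s₀ → s₁ → … → sₙ) and weights the squared flow mismatch of each sub-trajectory by λ raised to its length. The form below is a sketch reconstructed from the paper's description, where F is the learned state-flow function, P_F and P_B are the forward and backward policies, and F(sₙ) is tied to the reward R(sₙ) at terminal states.

```latex
\mathcal{L}_{\mathrm{SubTB}(\lambda)}(\tau)
  = \frac{\sum_{0 \le i < j \le n} \lambda^{\,j-i}
      \left( \log \frac{F(s_i)\,\prod_{k=i}^{j-1} P_F(s_{k+1}\mid s_k)}
                       {F(s_j)\,\prod_{k=i}^{j-1} P_B(s_k\mid s_{k+1})} \right)^{2}}
         {\sum_{0 \le i < j \le n} \lambda^{\,j-i}}
```

As λ → 0, the single-transition terms dominate and the loss approaches the detailed-balance (DB) objective; as λ → ∞, the full-trajectory term dominates and the loss approaches the trajectory-balance (TB) objective, which is the bias/variance interpolation referred to in claim (3).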
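For the Experiment Setup row, the reported hyperparameters translate into a small training configuration. The sketch below assumes a PyTorch-style training loop; the model, trajectory sampler, and loss callables are hypothetical placeholders, and only the numeric values are taken from the paper.

```python
# Minimal sketch of the reported training setup. Only the numeric values below
# come from the paper; the model/sampler/loss callables are hypothetical stand-ins.
import torch

LEARNING_RATES = [0.0005, 0.00075, 0.001, 0.003, 0.005, 0.0075, 0.01]  # sweep grid
LAMBDA_GRID = [0.8, 0.9, 0.99]  # lambda candidates; 0.9 is reported as optimal
BATCH_SIZE = 16
N_TRAJECTORIES = 10**6
N_BATCHES = N_TRAJECTORIES // BATCH_SIZE  # 62,500 batches, matching the paper


def train_one_config(model, sample_trajectories, subtb_loss, lr, lam):
    """Train a GFlowNet with Adam on a lambda-weighted sub-trajectory loss."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(N_BATCHES):
        batch = sample_trajectories(model, BATCH_SIZE)  # roll out 16 trajectories
        loss = subtb_loss(model, batch, lam)            # SubTB(lambda)-style loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model


# Example sweep over the grids above (evaluation metric left abstract):
#   for lr in LEARNING_RATES:
#       for lam in LAMBDA_GRID:
#           train_one_config(make_model(), sample_trajectories, subtb_loss, lr, lam)
```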