Expected flow networks in stochastic environments and two-player zero-sum games

Authors: Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments to investigate whether EFlowNets can effectively learn in stochastic environments compared to related methods (§4.1) and whether AFlowNets are effective learners of adversarial gameplay, as measured by their performance against contemporary approaches (§4.2).
Researcher Affiliation | Academia | Mila Québec AI Institute, Université de Montréal; {marco.jiralerspong, bilun.sun, danilo.vucetic, tianyu.zhang, yoshua.bengio, gidelgau, nikolay.malkin}@mila.quebec
Pseudocode | Yes | Algorithm 1: Branch-adjusted AFlowNet Training
Open Source Code | Yes | Code: https://github.com/GFNOrg/AdversarialFlowNetworks.
Open Datasets | Yes | We evaluate EFlowNets in a protein design task from Jain et al. (2022).
Dataset Splits | No | The paper describes sampling methods and training policies, but does not give specific training, validation, and test splits (e.g., percentages or counts) or refer to standard splits for the datasets used.
Hardware Specification | Yes | GPU: 1x RTX3090Ti (Tic-tac-toe), 1x RTX8000 (Connect-4)
Software Dependencies | No | The paper mentions the models it implements (e.g., an AlphaZero implementation and a SAC reimplementation) but does not give version numbers for software dependencies such as Python, PyTorch, or other libraries.
Experiment Setup | Yes | num trajectories epoch: 10240, batch size: 512 (Tic-tac-toe) / 1024 (Connect-4), num steps: 500 (Tic-tac-toe) / 250 (Connect-4), replay buffer capacity: 10240 (Tic-tac-toe) / 250000 (Connect-4), learning rate: 1e-3, learning rate Z: 5e-2, num residual blocks: 10 (Tic-tac-toe) / 15 (Connect-4)
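The reported experiment setup can be read as two per-game configurations that share the learning rates and the number of trajectories per epoch. The sketch below is one possible way to record those values in Python. The class name, field names, and constant names are hypothetical and are not taken from the authors' repository; only the numeric values come from the Experiment Setup entry above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AFlowNetSetup:
    """Hypothetical container for the reported hyperparameters (names are assumptions)."""

    batch_size: int
    num_steps: int
    replay_buffer_capacity: int
    num_residual_blocks: int
    # Values reported without a per-game split, so treated here as shared defaults.
    num_trajectories_epoch: int = 10240
    learning_rate: float = 1e-3
    learning_rate_z: float = 5e-2


# Values listed for Tic-tac-toe.
TIC_TAC_TOE = AFlowNetSetup(
    batch_size=512,
    num_steps=500,
    replay_buffer_capacity=10240,
    num_residual_blocks=10,
)

# Values listed for Connect-4.
CONNECT_4 = AFlowNetSetup(
    batch_size=1024,
    num_steps=250,
    replay_buffer_capacity=250000,
    num_residual_blocks=15,
)
```

Keeping the shared values as defaults means the per-game differences (batch size, step count, buffer capacity, network depth) are the only arguments at each call site.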