Expected flow networks in stochastic environments and two-player zero-sum games
Authors: Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to investigate whether EFlowNets can effectively learn in stochastic environments compared to related methods (§4.1) and whether AFlowNets are effective learners of adversarial gameplay, as measured by their performance against contemporary approaches (§4.2). |
| Researcher Affiliation | Academia | Mila Québec AI Institute, Université de Montréal; {marco.jiralerspong, bilun.sun, danilo.vucetic, tianyu.zhang, yoshua.bengio, gidelgau, nikolay.malkin}@mila.quebec |
| Pseudocode | Yes | Algorithm 1: Branch-adjusted AFlowNet Training |
| Open Source Code | Yes | Code: https://github.com/GFNOrg/AdversarialFlowNetworks. |
| Open Datasets | Yes | We evaluate EFlowNets in a protein design task from Jain et al. (2022). |
| Dataset Splits | No | The paper describes sampling methods and training policies, but does not provide specific details on training, validation, and test dataset splits (e.g., percentages or counts) or refer to standard splits for the datasets used. |
| Hardware Specification | Yes | GPU: 1x RTX 3090 Ti (Tic-tac-toe), 1x RTX 8000 (Connect-4) |
| Software Dependencies | No | The paper mentions implementing models (e.g., AlphaZero implementation, SAC reimplementation) but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | num trajectories epoch: 10240, batch size: 512 (Tic-tac-toe) / 1024 (Connect-4), num steps: 500 (Tic-tac-toe) / 250 (Connect-4), replay buffer capacity: 10240 (Tic-tac-toe) / 250000 (Connect-4), learning rate: 1e-3, learning rate Z: 5e-2, num residual blocks: 10 (Tic-tac-toe) / 15 (Connect-4) |
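
The experiment setup in the last row can be collected into a small configuration object, which makes the per-game differences easier to compare. The sketch below is illustrative only: the class name `AFlowNetConfig`, the constant names, and the underscored field names are assumptions derived from the labels in the table, not identifiers confirmed from the authors' code. It also assumes that values listed without a game in parentheses apply to both games.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AFlowNetConfig:
    """Hypothetical container; field names mirror the labels in the table."""

    num_trajectories_epoch: int
    batch_size: int
    num_steps: int
    replay_buffer_capacity: int
    num_residual_blocks: int
    # Listed in the table without a per-game split, so assumed shared.
    learning_rate: float = 1e-3
    learning_rate_Z: float = 5e-2


# Values copied from the "Experiment Setup" row.
TIC_TAC_TOE = AFlowNetConfig(
    num_trajectories_epoch=10240,
    batch_size=512,
    num_steps=500,
    replay_buffer_capacity=10240,
    num_residual_blocks=10,
)

CONNECT_4 = AFlowNetConfig(
    num_trajectories_epoch=10240,
    batch_size=1024,
    num_steps=250,
    replay_buffer_capacity=250000,
    num_residual_blocks=15,
)
```

Freezing the dataclass keeps the two reported settings from being mutated by accident when they are passed around a training script.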