A Theory of Non-acyclic Generative Flow Networks
Authors: Leo Brunswic, Yinchuan Li, Yushun Xu, Yijun Feng, Shangling Jui, Lizhuang Ma
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on graphs and continuous tasks validate those principles. |
| Researcher Affiliation | Collaboration | 1 Huawei Shanghai Research Center; 2 Shanghai Jiaotong University; 3 Huawei Noah's Ark Lab, Beijing, China |
| Pseudocode | No | The paper presents theoretical definitions and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the availability of its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We conduct experiments on Point-Robot-Sparse continuous control tasks with sparse rewards. ... All other experimental settings and algorithm hyperparameters in Point-Robot-Sparse are the same as in Li et al. (2023d). We refer to Li et al. (2023d) for more details. For Hypergrids, S = [1,W]^D together with transitions of the form s -> s + (0,...,0,1,0,...,0). For Cayley Graphs, S = S_p, the group of permutations of 0,...,p-1. The edges are given by a set of generators (sigma_1,...,sigma_q). |
| Dataset Splits | No | The paper does not provide specific details on dataset splits (e.g., percentages, sample counts, or cross-validation setup) for training, validation, or testing. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments, such as specific CPU or GPU models. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers (e.g., programming languages or libraries). |
| Experiment Setup | Yes | The agent starts at (5,5) and the maximum episode length is 12. We changed the angle range of the agent's movement from (0°, 90°) to (0°, 360°). The policy pi_f(s_0) = U(S) is non-trainable to emulate a random initial instance. Taking S_1 = {sigma | sigma(i) = i, i <= k} leads to emulating a partial sorting algorithm, see figure 4 for p = 20 and R_1 with c = 20 and k = 1. |
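
The two graph environments quoted in the table are described only by their state spaces and edge rules, so a small sketch may help make them concrete. The Python below is an illustration under stated assumptions, not the authors' code: the function names `hypergrid_neighbors`, `compose`, and `cayley_neighbors` are hypothetical, the hypergrid drops moves that would leave [1,W]^D, and Cayley edges are taken to be left multiplication by a generator (the quoted text does not say which side is used).

```python
def hypergrid_neighbors(s, W):
    """Successors of state s in [1, W]^D: add 1 to exactly one coordinate.

    Assumption: moves that would exceed W are dropped.
    """
    return [
        s[:i] + (x + 1,) + s[i + 1:]
        for i, x in enumerate(s)
        if x < W
    ]


def compose(sigma, tau):
    """Composition of permutations stored as tuples: (sigma . tau)(i) = sigma(tau(i))."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))


def cayley_neighbors(s, generators):
    """Successors of permutation s in the Cayley graph of S_p.

    Assumption: an edge s -> g . s exists for each generator g.
    """
    return [compose(g, s) for g in generators]


if __name__ == "__main__":
    # Hypergrid with D = 2, W = 2.
    print(hypergrid_neighbors((1, 1), 2))

    # Cayley graph of S_3 generated by the transposition of 0 and 1.
    identity = (0, 1, 2)
    swap01 = (1, 0, 2)
    step1 = cayley_neighbors(identity, [swap01])[0]
    step2 = cayley_neighbors(step1, [swap01])[0]
    print(step1, step2)
```

In this sketch, applying the same transposition twice returns to the starting permutation, so the Cayley graph contains cycles, which is the non-acyclic setting the paper's title refers to.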