Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Revisiting Non-Acyclic GFlowNets in Discrete Environments
Authors: Nikita Morozov, Ian Maksimov, Daniil Tiapkin, Sergey Samsonov
ICML 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition, we experimentally re-examine the concept of loss stability in non-acyclic GFlowNet training, as well as validate our own theoretical findings. |
| Researcher Affiliation | Academia | ¹HSE University, Moscow, Russia; ²CMAP, CNRS, École polytechnique, Institut Polytechnique de Paris, 91128, Palaiseau, France; ³Université Paris-Saclay, CNRS, LMO, 91405, Orsay, France. Correspondence to: Nikita Morozov <EMAIL>. |
| Pseudocode | No | The paper describes methods and algorithms in paragraph form and through mathematical equations, but it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code: github.com/GreatDrake/non-acyclic-gfn. |
| Open Datasets | Yes | We consider two discrete environments for experimental evaluation: 1) a non-acyclic version of the hypergrid environment (Bengio et al., 2021) that was introduced in (Brunswic et al., 2024); 2) non-acyclic permutation generation environment from (Brunswic et al., 2024) with a harder variant of the reward function. |
| Dataset Splits | No | The paper focuses on generative models that sample objects from a distribution. It evaluates the models based on empirical distributions of generated samples (e.g., 'last 2 * 10^5 samples seen in training' or 'last 10^5 samples seen in training'), rather than using predefined training, validation, and test splits for an input dataset in a discriminative task. |
| Hardware Specification | No | This research was supported in part through computational resources of HPC facilities at HSE University (Kostenetskiy et al., 2021). |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify its version or the versions of any other software libraries or programming languages used. |
| Experiment Setup | Yes | We use Adam optimizer with a learning rate of 10^-3 and a batch size of 16 (number of trajectories sampled at each training step). For log Zθ we use a larger learning rate of 10^-2... All models are trained until 2 * 10^6 trajectories are sampled... For SDB we set ε = 1.0 and η = 10^-3. |
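
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which also makes the implied training length explicit. The following is a minimal sketch, not the authors' code: the class name `TrainingConfig`, its field names, and the `num_training_steps` helper are hypothetical, and it assumes every training step samples exactly one batch of 16 trajectories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the values quoted in the table above."""

    learning_rate: float = 1e-3         # Adam learning rate for model parameters
    log_z_learning_rate: float = 1e-2   # larger learning rate used for log Z_theta
    batch_size: int = 16                # trajectories sampled at each training step
    total_trajectories: int = 2 * 10**6  # training stops after this many trajectories
    sdb_epsilon: float = 1.0            # epsilon reported for SDB
    sdb_eta: float = 1e-3               # eta reported for SDB

    def num_training_steps(self) -> int:
        # Assumption: each step samples exactly `batch_size` trajectories,
        # so the step count is the trajectory budget divided by the batch size.
        return self.total_trajectories // self.batch_size


config = TrainingConfig()
print(config.num_training_steps())  # 2_000_000 // 16 = 125000
```

Under that assumption, the reported budget of 2 * 10^6 trajectories corresponds to 125,000 optimizer steps per run.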