Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Adaptive Inference-Time Scaling via Cyclic Diffusion Search

Authors: Gyubin Lee, Bao Truong, Jaesik Yoon, Dongwoo Lee, Minsu Kim, Yoshua Bengio, Sungjin Ahn

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our proposed method on a diverse suite of challenging tasks that require generating samples in sparse or previously unexplored regions at training phase: (1) toy Mixture of Gaussian (MoG), (2) point-mass maze navigation [19], (3) Sudoku puzzle completion [18, 29], (4) path generation on unseen pixel maze image [12], (5) molecular structure prediction [11], and (6) text-to-image generation. These tasks present distinct challenges: MoG shows the importance of multiple go-back temperature pool as the proof-of-concept; point-mass maze requires long-horizon planning over 1000 steps; Sudoku demands logical consistency across row, column, and block constraints; path generation on unseen maze image tests generalization to novel environmental structures; molecular structure prediction requires generating valid 3D conformations under chemical and physical constraints; and image generation illustrates that our approach also operates effectively in the high-dimensional setting, demonstrating its scalability beyond structured low-dimensional tasks.
Researcher Affiliation	Collaboration	Gyubin Lee (KAIST); Truong Nhat Nguyen Bao* (KAIST); Jaesik Yoon (KAIST & SAP); Dongwoo Lee (KAIST); Minsu Kim (Mila Quebec AI Institute, KAIST); Yoshua Bengio (Mila Quebec AI Institute, Université de Montréal); Sungjin Ahn (KAIST & NYU)
Pseudocode	Yes	Algorithm 1 ABCD: Adaptive Bi-directional Cyclic Diffusion. 1: procedure ABCD(verifier, T, 𝒯, max_iter, κ, N, K, J) 2: Initialize N particles {x_T^(i)} from N(0, I) 3: Denoise to obtain {x_0^(i)} via fast denoising 4: while fewer than max_iter iterations do 5: Selection: select top-K particles using verifier 6: if the selected top-K particles come from the smallest t_g for κ consecutive times 7: break 8: Copy: replicate each of the K particles J times 9: Noising: send each of the K particles to each t_g ∈ 𝒯 10: Denoising: denoise from each go-back temperature t_g 11: end while 12: return best particle selected from the top-K particles 13: end procedure
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We will upload the data and code for this paper.
Open Datasets	Yes	We evaluate our proposed method on a diverse suite of challenging tasks that require generating samples in sparse or previously unexplored regions at training phase: (1) toy Mixture of Gaussian (MoG), (2) point-mass maze navigation [19], (3) Sudoku puzzle completion [18, 29], (4) path generation on unseen pixel maze image [12], (5) molecular structure prediction [11], and (6) text-to-image generation.
Dataset Splits	Yes	Sudoku puzzle completion [18, 29]... We trained all models on puzzles with 31–42 given digits [29] and tested them on more challenging instances containing only 17–28 givens [18]... For Molecule Generation, the basic setting is adopted from EDM [11]. The diffusion model is trained to generate 3D molecular geometries using the QM9 dataset [20], which contains approximately 130k small molecules, each with up to 9 heavy atoms. We utilized the standard splits of this dataset for training, validation, and testing.
Hardware Specification	Yes	Our implementation is built using the PyTorch framework. The experiments are conducted on 2 machines: Ubuntu 20.04 machine equipped with an Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz with 112 cores, 384 GB RAM, and NVIDIA GeForce RTX 4090 GPUs; Ubuntu 22.04 machine equipped with an Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz with 104 cores, 256 GB RAM, and NVIDIA GeForce RTX 4090 GPUs. Each experiment is run individually on an NVIDIA GeForce RTX 4090 GPU.
Software Dependencies	No	Our implementation is built using the PyTorch framework. The experiments are conducted on 2 machines: Ubuntu 20.04 machine equipped with an Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz with 112 cores, 384 GB RAM, and NVIDIA GeForce RTX 4090 GPUs; Ubuntu 22.04 machine equipped with an Intel(R) Xeon(R) Gold 6230R CPU @ 2.10GHz with 104 cores, 256 GB RAM, and NVIDIA GeForce RTX 4090 GPUs. Each experiment is run individually on an NVIDIA GeForce RTX 4090 GPU.
Experiment Setup	Yes	For OGBench Maze, we employed a soft verifier that scores each generated plan based on the proportion of collision-free states throughout the trajectory and the proximity to the goal at the end of the trajectory... For SoP, we used M = 4 and K = 8 in Maze Giant, and M = 1 and K = 32 in Maze Large. BS used M = 8 and K = 4 in both mazes, with a lookahead estimator [17] to have a better predicted x_0(x_t) with value guidance. SMC [24] was implemented with POTENTIAL TYPE = sum, λ = 0.1 and used N = 32 particles in both mazes. ABCD was configured with N = 32, K = 2 and J = 16 in Maze Giant and N = 32, K = 1 and J = 32 in Maze Large. The adaptive terminal condition was set once more than 90% of top-K particles consistently originated from the zero noise level over κ consecutive steps. We used κ = 30 in Maze Giant and κ = 5 in Maze Large. All methods used a base diffusion model trained with 256 denoising steps. ABCD employed jumpy denoising with a jump length of 10.
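
The loop in Algorithm 1, together with the adaptive terminal condition quoted in the Experiment Setup row, can be sketched in Python as follows. This is a minimal toy sketch, not the authors' implementation. The 1-D particles, the identity `toy_denoise`, the Gaussian `toy_noise`, the temperature pool `[0.0, 0.1, 1.0]`, the `min_frac` parameter name, and the reading that each selected particle is replicated J times at every go-back temperature are all assumptions made for illustration.

```python
import random


def toy_noise(x, t, rng):
    # Hypothetical noising step: add Gaussian noise with std t (t = 0 is a no-op).
    return x + t * rng.gauss(0.0, 1.0)


def toy_denoise(x, t, rng):
    # Hypothetical stand-in for a trained diffusion denoiser: identity map.
    return x


def abcd_toy(verifier, denoise, noise, temps, max_iter, kappa, N, K, J,
             min_frac=0.9, seed=0):
    """Toy sketch of the ABCD search loop under the assumptions stated above."""
    rng = random.Random(seed)
    temps = sorted(temps)
    smallest, largest = temps[0], temps[-1]

    # Steps 2-3: draw N particles from N(0, 1) and denoise them once.
    # Each entry is (sample, go-back temperature it was produced from).
    particles = [(denoise(rng.gauss(0.0, 1.0), largest, rng), largest)
                 for _ in range(N)]

    streak = 0       # consecutive iterations dominated by the smallest temperature
    iterations = 0
    for _ in range(max_iter):
        iterations += 1

        # Step 5: keep the K particles with the highest verifier score.
        top = sorted(particles, key=lambda p: verifier(p[0]), reverse=True)[:K]

        # Steps 6-7: stop once more than min_frac of the top-K came from the
        # smallest go-back temperature for kappa iterations in a row.
        from_smallest = sum(1 for _x, t in top if t == smallest)
        streak = streak + 1 if from_smallest > min_frac * K else 0
        if streak >= kappa:
            break

        # Steps 8-10: copy each selected particle J times per temperature,
        # re-noise it to that temperature, then denoise it again.
        particles = [(denoise(noise(x, t, rng), t, rng), t)
                     for x, _t in top
                     for _copy in range(J)
                     for t in temps]

    # Step 12: return the best particle currently held.
    best = max(particles, key=lambda p: verifier(p[0]))[0]
    return best, iterations


def score(x):
    # Toy verifier: higher is better, maximal at x = 3.
    return -abs(x - 3.0)


best, iters = abcd_toy(score, toy_denoise, toy_noise,
                       temps=[0.0, 0.1, 1.0], max_iter=200, kappa=5,
                       N=32, K=2, J=16)
print(best, iters)
```

Because the pool contains a zero noise level, the selected particles are carried over unchanged, so the best score never gets worse between iterations. The loop stops early once larger go-back temperatures stop producing better candidates, which is the behavior the adaptive terminal condition describes.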