Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Learning Iterative Reasoning through Energy Diffusion

Authors: Yilun Du, Jiayuan Mao, Joshua B. Tenenbaum

ICML 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks, particularly in more challenging scenarios. We show the effectiveness of IRED on three groups of tasks: continuous-space reasoning (e.g., matrix completion, inversion), discrete-space reasoning (e.g., Sudoku solving, graph connectivity prediction), and planning (e.g., finding paths on graphs).
Researcher Affiliation | Academia | Yilun Du¹*, Jiayuan Mao¹*, Joshua Tenenbaum¹. ¹MIT. Correspondence to: Yilun Du <EMAIL>, Jiayuan Mao <EMAIL>.
Pseudocode | Yes | We provide full pseudocode for training our approach in Section 3.4 with training following Algorithm 1 and inference following Algorithm 2.
Open Source Code | Yes | Code and visualizations are at https://energy-based-model.github.io/ired.
Open Datasets | Yes | We use the dataset from SAT-Net (Wang et al., 2019) as the training and standard test dataset. Our harder dataset is from RRN (Palm et al., 2018) which is a different Sudoku dataset where the number of given numbers is within [17, 34]. For Connectivity tasks, we generate random graphs using algorithms from Graves et al. (2016).
Dataset Splits | No | We aim to learn a neural network-based prediction model $\mathrm{NN}_\theta(\cdot)$ which can generalize execution $\mathrm{NN}_\theta(x')$ to a test distribution $x' \in \mathbb{R}^{O'}$, where $x'$ can be significantly larger and more challenging than the training data $x \in X$ (e.g., of higher dimensions, or with larger number magnitudes), by leveraging a possibly increased computational budget.
Hardware Specification | Yes | Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4.
Software Dependencies | No | Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4.
Experiment Setup | Yes | Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4. For Sudoku, we train models for 50000 iterations using a single Nvidia RTX 2080 using a training batch size of 64 with the Adam optimizer with learning rate 1e-4.
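
The two training configurations quoted in the Experiment Setup row can be written down as a small structured record, which makes it easier to see what is reported and what is not. The following is a minimal sketch, not the authors' code: the class name `TrainingSetup`, its field names, and the task label "default" are hypothetical, and only the numeric values, optimizer, and GPU name are taken from the quoted text. The first sentence of the row gives no iteration count, so that field is left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingSetup:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""

    task: str
    gpu: str
    batch_size: int
    optimizer: str
    learning_rate: float
    iterations: Optional[int] = None  # None when the quoted text gives no count

    def samples_seen(self) -> Optional[int]:
        """Total training samples processed, if the iteration count is known."""
        if self.iterations is None:
            return None
        return self.iterations * self.batch_size


# Values copied from the quoted text. The label "default" is an assumption,
# because the first sentence does not name the task it applies to.
DEFAULT_SETUP = TrainingSetup(
    task="default",
    gpu="Nvidia RTX 2080",
    batch_size=2048,
    optimizer="Adam",
    learning_rate=1e-4,
)

SUDOKU_SETUP = TrainingSetup(
    task="sudoku",
    gpu="Nvidia RTX 2080",
    batch_size=64,
    optimizer="Adam",
    learning_rate=1e-4,
    iterations=50_000,
)

if __name__ == "__main__":
    for setup in (DEFAULT_SETUP, SUDOKU_SETUP):
        print(setup.task, setup.samples_seen())
```

For the Sudoku configuration, `samples_seen()` is simply 50000 iterations times a batch size of 64; for the other configuration it returns `None`, reflecting that the row reports a wall-clock time (approximately 2 hours) but no iteration count.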