reproducibilityindex.ai

Learning Iterative Reasoning through Energy Diffusion

Authors: Yilun Du, Jiayuan Mao, Joshua B. Tenenbaum

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments show that IRED outperforms existing methods in continuous-space reasoning, discrete-space reasoning, and planning tasks, particularly in more challenging scenarios. We show the effectiveness of IRED on three groups of tasks: continuous-space reasoning (e.g., matrix completion, inversion), discrete-space reasoning (e.g., Sudoku solving, graph connectivity prediction), and planning (e.g., finding paths on graphs).
Researcher Affiliation	Academia	Yilun Du¹*, Jiayuan Mao¹*, Joshua Tenenbaum¹. ¹MIT. Correspondence to: Yilun Du <yilundu@mit.edu>, Jiayuan Mao <jiayuanm@mit.edu>.
Pseudocode	Yes	We provide full pseudocode for training our approach in Section 3.4 with training following Algorithm 1 and inference following Algorithm 2.
Open Source Code	Yes	Code and visualizations are at https://energy-based-model.github.io/ired.
Open Datasets	Yes	We use the dataset from SAT-Net (Wang et al., 2019) as the training and standard test dataset. Our harder dataset is from RRN (Palm et al., 2018) which is a different Sudoku dataset where the number of given numbers is within [17, 34]. For Connectivity tasks, we generate random graphs using algorithms from Graves et al. (2016).
Dataset Splits	No	We aim to learn a neural network-based prediction model $\mathrm{NN}_\theta(\cdot)$ which can generalize execution $\mathrm{NN}_\theta(x')$ to a test distribution $x' \in \mathbb{R}^{O'}$, where $x'$ can be significantly larger and more challenging than the training data $x \in X$ (e.g., of higher dimensions, or with larger number magnitudes), by leveraging a possibly increased computational budget.
Hardware Specification	Yes	Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4.
Software Dependencies	No	Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4.
Experiment Setup	Yes	Models were trained in approximately 2 hours on a single Nvidia RTX 2080 using a training batch size of 2048 and the Adam optimizer with learning rate 1e-4. For Sudoku, we train models for 50000 iterations using a single Nvidia RTX 2080 using a training batch size of 64 with the Adam optimizer with learning rate 1e-4.
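
The hyperparameters quoted in the Experiment Setup row can be collected in one place, which may help when checking a reproduction against the reported settings. The following is a minimal sketch, not the authors' code: the class name `TrainConfig`, its field names, and the `samples_seen` helper are hypothetical and chosen for illustration. Only the numeric values and the optimizer name come from the table. The table names no training framework, so the sketch uses only the Python standard library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the table; all names here are illustrative."""

    task: str
    batch_size: int
    learning_rate: float
    optimizer: str
    iterations: Optional[int] = None  # None where the quoted text gives no count

    def samples_seen(self) -> Optional[int]:
        """Training examples drawn over the run: iterations * batch size."""
        if self.iterations is None:
            return None
        return self.iterations * self.batch_size


# General setting quoted in the Hardware Specification and Experiment Setup rows.
DEFAULT = TrainConfig(
    task="default", batch_size=2048, learning_rate=1e-4, optimizer="Adam"
)

# Sudoku setting quoted in the Experiment Setup row.
SUDOKU = TrainConfig(
    task="sudoku",
    batch_size=64,
    learning_rate=1e-4,
    optimizer="Adam",
    iterations=50_000,
)

if __name__ == "__main__":
    for cfg in (DEFAULT, SUDOKU):
        print(cfg.task, cfg.batch_size, cfg.learning_rate, cfg.samples_seen())
```

With the quoted Sudoku values, 50000 iterations at batch size 64 correspond to 3,200,000 training examples drawn. The general setting reports a wall-clock time (approximately 2 hours) rather than an iteration count, so the sketch leaves that field unset.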