Learning Iterative Reasoning through Energy Minimization

Authors: Yilun Du, Shuang Li, Joshua Tenenbaum, Igor Mordatch

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically illustrate that our iterative reasoning approach can solve more accurate and generalizable algorithmic reasoning tasks in both graph and continuous domains." and "4. Experiments"
Researcher Affiliation | Collaboration | "Yilun Du¹ Shuang Li¹ Joshua Tenenbaum¹ Igor Mordatch² — ¹MIT CSAIL, ²Google Brain"
Pseudocode | Yes | "We provide pseudocode for training IREM in Algorithm 1 and executing algorithmic reasoning with IREM, our approach, in Algorithm 2..." and "Algorithm 1 IREM training algorithm", "Algorithm 2 IREM prediction algorithm", "Algorithm 3 IREM Training with External Memory"
Open Source Code | Yes | "Code and additional information is available at https://energy-based-model.github.io/iterativereasoning-as-energy-minimization/."
Open Datasets | No | The paper describes generating its own datasets (e.g., "We randomly sample a value for each edge...", "We randomly construct two separate vectors...") and details their construction in Appendix C, but it does not provide access information (link, DOI, or formal citation) that would make these datasets publicly available.
Dataset Splits | No | The paper specifies training on certain graph sizes (e.g., "size 2 to 10") and testing on larger or harder problems (e.g., "size 15", "larger magnitudes"), but it does not explicitly state the use of a validation set or specific train/validation/test splits (e.g., an 80/10/10 split).
Hardware Specification | Yes | "Models were trained in approximately 2 hours on a single Nvidia Titan X GPU using a training batch size of 64 and the Adam optimizer with learning rate 1e-4."
Software Dependencies | No | The paper mentions software components such as the Adam optimizer and the GINEConv layer, but it does not specify version numbers (e.g., PyTorch 1.9, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | "Models were trained in approximately 2 hours on a single Nvidia Titan X GPU using a training batch size of 64 and the Adam optimizer with learning rate 1e-4. Each model was trained for 10,000 iterations and evaluated on 1000 test problems." and "Each model was trained with five steps of iterative computation, with PonderNet trained with a halting geometric distribution of 0.8."
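For reference, the sketch below illustrates how the procedures cited in the Pseudocode and Experiment Setup rows fit together: an energy function E_theta(x, y) is trained so that a few steps of gradient descent on the candidate answer y drive it toward the ground-truth answer. This is a minimal, hypothetical PyTorch sketch, not the authors' released code: the MLP energy network (the paper uses a GINEConv-based graph network), the inner step size, the random answer initialization, and the truncated-backprop handling of the inner loop are assumptions; the five inner optimization steps, batch size 64, Adam with learning rate 1e-4, and 10,000 training iterations follow the quoted setup.

```python
# Hypothetical sketch of IREM-style training and prediction (in the spirit of
# Algorithms 1 and 2); NOT the authors' released code. A plain MLP stands in for
# the paper's GINEConv-based graph network; the step size and answer
# initialization are assumptions.
import torch
import torch.nn as nn


class EnergyModel(nn.Module):
    """Scalar energy E_theta(x, y) over a problem x and a candidate answer y."""

    def __init__(self, x_dim: int, y_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + y_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, y], dim=-1)).squeeze(-1)


def optimize_answer(model: EnergyModel, x: torch.Tensor, y_init: torch.Tensor,
                    num_steps: int = 5, step_size: float = 10.0) -> torch.Tensor:
    """Refine a candidate answer by gradient descent on the learned energy.

    Each step detaches the previous answer, so gradients reach the model
    parameters only through the final step (a truncated-backprop simplification).
    """
    y = y_init
    for _ in range(num_steps):
        y = y.detach().requires_grad_(True)
        energy = model(x, y).sum()
        (grad_y,) = torch.autograd.grad(energy, y, create_graph=True)
        y = y - step_size * grad_y
    return y


def train_irem(model: EnergyModel, data_loader, num_iterations: int = 10_000) -> None:
    """Supervise the answer obtained by energy minimization with an MSE loss,
    using the reported hyperparameters (batch size 64 comes from data_loader)."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    batches = iter(data_loader)
    for _ in range(num_iterations):
        try:
            x, y_true = next(batches)
        except StopIteration:
            batches = iter(data_loader)
            x, y_true = next(batches)
        y_init = torch.rand_like(y_true)            # random initial answer (assumption)
        y_pred = optimize_answer(model, x, y_init)  # five inner optimization steps
        loss = ((y_pred - y_true) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
```

At test time, the paper's key property is that additional steps of energy minimization can be run on harder problems than those seen during training; in this sketch that corresponds to calling optimize_answer with a larger num_steps than the five used in training.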