On the Markov Property of Neural Algorithmic Reasoning: Analyses and Methods
Authors: Montgomery Bohde, Meng Liu, Alexandra Saxton, Shuiwang Ji
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive experiments, based on the CLRS-30 algorithmic reasoning benchmark, demonstrate that both ForgetNet and G-ForgetNet achieve better generalization capability than existing methods. |
| Researcher Affiliation | Academia | Montgomery Bohde, Meng Liu, Alexandra Saxton, Shuiwang Ji; Department of Computer Science & Engineering, Texas A&M University, College Station, TX 77843, USA; {mbohde,mengliu,allie.saxton,sji}@tamu.edu |
| Pseudocode | No | The paper provides mathematical formulations and architectural diagrams (Figure 1) but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is publicly available at https://github.com/divelab/ForgetNet. |
| Open Datasets | Yes | Our extensive experiments, based on the CLRS-30 algorithmic reasoning benchmark, demonstrate that both ForgetNet and G-ForgetNet achieve better generalization capability than existing methods. |
| Dataset Splits | Yes | We perform experiments on the standard out-of-distribution (OOD) splits present in the CLRS-30 algorithmic reasoning benchmark (Veličković et al., 2022a). To be specific, we train on inputs with 16 or fewer nodes, and use inputs with 16 nodes for validation. |
| Hardware Specification | No | The paper mentions training models but does not specify any particular hardware (e.g., GPU models, CPU types) used for the experiments. |
| Software Dependencies | No | The paper mentions using "Adam optimizer (Kingma & Ba, 2015)" and a "cosine learning rate scheduler," but it does not provide specific version numbers for these or any other software libraries/dependencies. |
| Experiment Setup | Yes | Specifically, we employ the Adam optimizer (Kingma & Ba, 2015) with a cosine learning rate scheduler and an initial learning rate of 0.0015. The models are trained for 10,000 steps with a batch size of 32. |
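
The experiment setup row above states the full optimization recipe: Adam with a cosine learning rate scheduler, an initial learning rate of 0.0015, 10,000 training steps, and a batch size of 32. The sketch below shows one way to express that configuration; since the paper does not name its software stack, the choice of JAX/optax and all function names here are assumptions, not the authors' implementation.

```python
# Hedged sketch of the reported training configuration (Adam + cosine schedule,
# initial LR 0.0015, 10,000 steps, batch size 32). The use of optax is an
# assumption; the paper does not specify library versions or dependencies.
import optax

TOTAL_STEPS = 10_000   # "trained for 10,000 steps"
BATCH_SIZE = 32        # "batch size of 32"
INIT_LR = 0.0015       # "initial learning rate of 0.0015"

# Cosine decay of the learning rate from INIT_LR over the full training run.
schedule = optax.cosine_decay_schedule(init_value=INIT_LR, decay_steps=TOTAL_STEPS)
optimizer = optax.adam(learning_rate=schedule)

# Typical optax usage (params/grads come from the model and loss, not shown):
#   opt_state = optimizer.init(params)
#   updates, opt_state = optimizer.update(grads, opt_state, params)
#   params = optax.apply_updates(params, updates)
```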