Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Neuro-Symbolic Hierarchical Rule Induction
Authors: Claire Glanois, Zhaohui Jiang, Xuening Feng, Paul Weng, Matthieu Zimmer, Dong Li, Wulong Liu, Jianye Hao
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically validate our model on various tasks (ILP, visual genome, reinforcement learning) against relevant state-of-the-art methods, including traditional ILP methods and neurosymbolic models. Our contributions can be summarized as follows: (4) Empirical validation on various domains (see Section 6). |
| Researcher Affiliation | Collaboration | 1 IT University of Copenhagen, Denmark; 2 UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China; 3 Huawei Noah's Ark Lab, China; 4 School of Computing and Intelligence, Tianjin University. |
| Pseudocode | No | The paper describes the inference steps using mathematical equations but does not provide any explicit pseudocode blocks or algorithm figures. |
| Open Source Code | Yes | The source code of HRI and the scripts to reproduce the experimental results can be found at <https://github.com/claireaoi/hierarchical-rule-induction>. |
| Open Datasets | Yes | For (3), we consider a large domain from Visual Genome (Krishna et al., 2017). Our model outperforms other methods such as NLIL (Yang & Song, 2020) on those tasks. We also empirically validate all our design choices. GQA (Hudson & Manning, 2019b) which is a preprocessed version of the Visual Genome dataset (Krishna et al., 2017). |
| Dataset Splits | Yes | We use randomly generated training data for this task given the range of integers. Hyperparameter details are given in Appendix E. In Table 14, the hyperparameters train-num-constants and eval-num-constants represent the number of constants during training and evaluation, respectively. |
| Hardware Specification | No | The paper mentions that its implementation uses a GPU and is 'GPU-based', but does not specify any particular GPU model, CPU, or other hardware components used for the experiments. |
| Software Dependencies | No | The paper does not explicitly state specific software dependencies with version numbers (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | Hyperparameter details are given in Appendix E. We list relevant generic and task-specific hyperparameters used for our training method in Tables 13 and 14, respectively. For instance, Table 13 lists 'temperature', 'Gumbel-Scale', 'lr', and Table 14 lists 'max-depth', 'train-steps', 'eval-steps', 'train-num-constants'. |