Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
NuTrea: Neural Tree Search for Context-guided Multi-hop KGQA
Authors: Hyeong Kyu Choi, Seunghun Lee, Jaewon Chu, Hyunwoo J. Kim
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The general effectiveness of our approach is demonstrated through experiments on three major multi-hop KGQA benchmark datasets, and our extensive analyses further validate its expressiveness and robustness. and Section 4 (Experiments) |
| Researcher Affiliation | Academia | Hyeong Kyu Choi Computer Sciences University of Wisconsin-Madison, Seunghun Lee Computer Science & Engineering Korea University, Jaewon Chu Computer Science & Engineering Korea University, Hyunwoo J. Kim Computer Science & Engineering Korea University. |
| Pseudocode | Yes | Figure 1 provides a holistic view of our method, and pseudocode is in the supplement. |
| Open Source Code | Yes | Code is available at https://github.com/mlvlab/NuTrea. |
| Open Datasets | Yes | We experiment on three large-scale multi-hop KGQA datasets: MetaQA [35], WebQuestionsSP (WebQSP) [16], and ComplexWebQuestions (CWQ) [17]. |
| Dataset Splits | Yes | Following the common evaluation practice of previous works, we test the model that achieved the best performance on the validation set. and We use the same EF values throughout training, validation, and testing. |
| Hardware Specification | No | The paper mentions 'Training GPU Hours' and 'inference time' but does not specify any concrete hardware details such as specific GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | For WebQSP, 2 NuTrea layers with subtree depth K = 1 are used, while CWQ, with more complex questions, uses 3 layers with depth K = 2. |