Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning
Authors: Jinjie Ni, Vlad Pandelea, Tom Young, Haicang Zhou, Erik Cambria (pp. 11112-11120)
AAAI 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that HiTKG achieves a significant improvement in the performance of turn-level goal learning compared with state-of-the-art baselines. Additionally, both automatic and human evaluation prove the effectiveness of the two-hierarchy learning framework for both short-term and long-term goal planning. |
| Researcher Affiliation | Academia | Jinjie Ni, Vlad Pandelea, Tom Young, Haicang Zhou, Erik Cambria Nanyang Technological University, Singapore EMAIL, EMAIL |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code (no repository link, explicit code release statement, or code in supplementary materials). |
| Open Datasets | Yes | We conduct our evaluation on OpenDialKG (Moon et al. 2019). It is a dialogue KG dataset where each utterance of a dialogue is annotated with a KG path, which enables learning graph walkers to reason over the KG based on the conversations. |
| Dataset Splits | Yes | We follow the baselines and split it into train (70%), dev (15%), and test set (15%). |
| Hardware Specification | Yes | We use Pytorch (Paszke et al. 2019) to implement our model, which is trained on two RTX 8000 GPUs. |
| Software Dependencies | No | The paper mentions 'Pytorch' but does not specify a version number. Other software like 'ALBERT' and 'A2C' are also mentioned without version details. Thus, specific ancillary software details with version numbers are not provided. |
| Experiment Setup | Yes | We tune the hyperparameters by grid searching the hyperparameter space and choose the following settings that perform best: number of encoder/decoder layers: 2/6; dimension of the KG walker: 768; dimension of the KG embedding: 384 (stage 1), 256 (stage 2); loss coefficients γ/λ: 0.1/0.9; number of attention heads: 12; learning rate: 10⁻³; dropout rate: 0.1; L2 regularization parameter ϵ: 10⁻⁵; batch size: 10. We use learning rate scheduler to tune the learning rate manually and patient & early stopping to avoid overfitting. In addition, we use gradient clipping to avoid gradient explosions. |
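
For readers attempting a reproduction, the reported settings and the 70/15/15 split can be gathered into one place. The following is a minimal sketch, not the authors' code (none was released): the names `ReportedSetup` and `split_dialogues`, the shuffling, and the seed are hypothetical, and the values `1e-3` and `1e-5` assume negative exponents for the learning rate and L2 parameter. The paper follows the baselines' split, which this sketch does not reproduce exactly.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Hyperparameters as quoted in the Experiment Setup row (field names are hypothetical)."""
    encoder_layers: int = 2
    decoder_layers: int = 6
    walker_dim: int = 768
    kg_embedding_dim_stage1: int = 384
    kg_embedding_dim_stage2: int = 256
    loss_gamma: float = 0.1
    loss_lambda: float = 0.9
    attention_heads: int = 12
    learning_rate: float = 1e-3  # assumption: negative exponent
    dropout: float = 0.1
    l2_epsilon: float = 1e-5     # assumption: negative exponent
    batch_size: int = 10


def split_dialogues(items, seed=0):
    """Shuffle and split into train/dev/test at 70%/15%/15%.

    The shuffle and seed are assumptions; the paper only states the ratios.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 70 // 100  # integer arithmetic avoids float rounding
    n_dev = len(items) * 15 // 100
    train = items[:n_train]
    dev = items[n_train:n_train + n_dev]
    test = items[n_train + n_dev:]    # remainder goes to the test set
    return train, dev, test


if __name__ == "__main__":
    setup = ReportedSetup()
    train, dev, test = split_dialogues(range(100))
    print(setup.batch_size, len(train), len(dev), len(test))
```

Details the paper leaves unspecified, such as the gradient clipping threshold, early stopping patience, and learning rate schedule, are deliberately omitted here.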