HiTKG: Towards Goal-Oriented Conversations via Multi-Hierarchy Learning

Authors: Jinjie Ni, Vlad Pandelea, Tom Young, Haicang Zhou, Erik Cambria (pp. 11112-11120)

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that HiTKG achieves a significant improvement in turn-level goal learning over state-of-the-art baselines. Additionally, both automatic and human evaluation demonstrate the effectiveness of the two-hierarchy learning framework for both short-term and long-term goal planning.
Researcher Affiliation | Academia | Jinjie Ni, Vlad Pandelea, Tom Young, Haicang Zhou, Erik Cambria; Nanyang Technological University, Singapore; {jinjie001, yang0552, haicang001}@e.ntu.edu.sg, {vlad.pandelea, cambria}@ntu.edu.sg
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides no concrete access to source code (no repository link, no explicit code-release statement, and no code in supplementary materials).
Open Datasets | Yes | We conduct our evaluation on OpenDialKG (Moon et al. 2019). It is a dialogue KG dataset where each utterance of a dialogue is annotated with a KG path, which enables learning graph walkers to reason over the KG based on the conversations.
Dataset Splits | Yes | We follow the baselines and split the dataset into train (70%), dev (15%), and test (15%) sets.
Hardware Specification | Yes | We use PyTorch (Paszke et al. 2019) to implement our model, which is trained on two RTX 8000 GPUs.
Software Dependencies | No | The paper mentions PyTorch but does not specify a version number; other software such as ALBERT and A2C is likewise mentioned without version details. Specific ancillary software versions are therefore not provided.
Experiment Setup | Yes | We tune the hyperparameters by grid-searching the hyperparameter space and choose the following settings, which perform best: number of encoder/decoder layers: 2/6; dimension of the KG walker: 768; dimension of the KG embedding: 384 (stage 1), 256 (stage 2); loss coefficients γ/λ: 0.1/0.9; number of attention heads: 12; learning rate: 10^-3; dropout rate: 0.1; L2 regularization parameter ϵ: 10^-5; batch size: 10. We use a learning-rate scheduler to tune the learning rate manually, and patience-based early stopping to avoid overfitting. In addition, we use gradient clipping to avoid gradient explosions.
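The experiment-setup row mentions patience-based early stopping but reports no further detail. A minimal, framework-agnostic sketch of that logic (the patience value of 3 is an illustrative assumption, not a number from the paper) might look like:

```python
class EarlyStopper:
    """Patience-based early stopping on dev-set loss.

    The paper reports using patience and early stopping but not the
    patience value; patience=3 here is an assumption for illustration.
    """

    def __init__(self, patience: int = 3):
        self.patience = patience       # allowed epochs without improvement
        self.best = float("inf")       # best dev loss seen so far
        self.bad_epochs = 0            # consecutive epochs without improvement

    def step(self, dev_loss: float) -> bool:
        """Record this epoch's dev loss; return True when training should stop."""
        if dev_loss < self.best:
            self.best = dev_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In the paper's PyTorch setting, `step` would be called after each epoch's dev-set evaluation, while `torch.nn.utils.clip_grad_norm_` would handle the gradient-clipping step mentioned in the same row.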