Machine-Translated Knowledge Transfer for Commonsense Causal Reasoning

Authors: Jinyoung Yeo, Geungyu Wang, Hyunsouk Cho, Seungtaek Choi, Seung-won Hwang

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our evaluations with three languages, Korean, Chinese, and French, our proposed method consistently outperforms all baselines, achieving up to 69.0% reasoning accuracy, which is close to the state-of-the-art accuracy of 70.2% achieved on English."
Researcher Affiliation | Academia | Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea; Yonsei University, Seoul, Republic of Korea. {jinyeo,prory}@postech.edu; {posuer,hist0613,seungwonh}@yonsei.ac.kr
Pseudocode | Yes | "Algorithm 1: Offline module in PSG"
Open Source Code | No | The paper references CausalNet with a GitHub link in a footnote (https://cs-zyluo.github.io/CausalNet), but this appears to be a resource the authors used, not a release of their own method's source code. There is no explicit statement about releasing the code for the methodology described in this paper.
Open Datasets | Yes | "To validate the effectiveness and robustness of our proposed method, we select three target languages, Korean, Chinese, and French, to cover diverse cultural and linguistic characteristics. We manually translate the COPA dataset, i.e., 1,000 commonsense causal reasoning questions, into each language, dividing it into development and test question sets of 500 each. As additional development datasets for PSG, we leverage CausalNet [footnote 6] and about 1M English web pages for word alignment." (Footnote 6: CausalNet: https://cs-zyluo.github.io/CausalNet)
Dataset Splits | Yes | "We manually translate the COPA dataset, i.e., 1,000 commonsense causal reasoning questions, into each language, dividing it into development and test question sets of 500 each." (A minimal split sketch is given below the table.)
Hardware Specification | No | No specific hardware details such as CPU/GPU models, memory, or computing platforms used for the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions using Naver Papago for all translations and a Word2Vec model, but provides no version numbers for these or any other software dependencies.
Experiment Setup | No | While the paper studies the effect of varying parameters such as the alignment confidence threshold (θ) and the score weight parameter (λ) in its component study (illustrated below the table), it does not report the hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, optimizers, epochs) used for the main experimental runs.
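
As a concrete illustration of the Dataset Splits row, the sketch below divides the 1,000 translated COPA questions into development and test sets of 500 each. Only the 1,000-question total and the 500/500 split come from the paper; the file name, the JSON layout, and the use of COPA's original question order (rather than shuffling) are assumptions for illustration.

```python
# Minimal sketch of the 500/500 development/test split described in the
# paper. The file name and JSON structure are hypothetical.
import json

def split_copa(path: str):
    """Split 1,000 translated COPA questions into dev/test sets of 500 each."""
    with open(path, encoding="utf-8") as f:
        questions = json.load(f)  # assumed: a list of 1,000 question dicts
    assert len(questions) == 1000, "COPA contains 1,000 questions"
    return questions[:500], questions[500:]  # (development, test)

dev_set, test_set = split_copa("copa_korean.json")
print(len(dev_set), len(test_set))  # expected: 500 500
```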
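
The Experiment Setup row notes that the paper varies an alignment confidence threshold θ and a score weight parameter λ without reporting the values used. The sketch below shows one plausible reading of those two parameters, where θ filters low-confidence word alignments and λ linearly interpolates two candidate scores; the function names, data shapes, and interpolation form are assumptions, not the paper's published formulation.

```python
# Hypothetical illustration of the two tuned parameters named in the
# component study. Neither function is taken from the paper's method;
# they only show what a "confidence threshold" and a "score weight"
# typically mean in this kind of pipeline.

def filter_alignments(alignments, theta):
    """Keep (source, target, confidence) triples with confidence >= theta."""
    return [(s, t, c) for (s, t, c) in alignments if c >= theta]

def combined_score(score_a: float, score_b: float, lam: float) -> float:
    """Linearly interpolate two scores with weight lam in [0, 1]."""
    return lam * score_a + (1.0 - lam) * score_b

# Example: sweep lambda on a development set, as a component study might.
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(lam, combined_score(0.8, 0.6, lam))
```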