Machine-Translated Knowledge Transfer for Commonsense Causal Reasoning

Authors: Jinyoung Yeo, Geungyu Wang, Hyunsouk Cho, Seungtaek Choi, Seung-won Hwang

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In our evaluations with three languages, Korean, Chinese, and French, our proposed method consistently outperforms all baselines, achieving up to 69.0% reasoning accuracy, which is close to the state-of-the-art accuracy of 70.2% achieved on English."
Researcher Affiliation | Academia | Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea; Yonsei University, Seoul, Republic of Korea. {jinyeo,prory}@postech.edu; {posuer,hist0613,seungwonh}@yonsei.ac.kr
Pseudocode | Yes | "Algorithm 1: Offline module in PSG"
Open Source Code | No | The paper references CausalNet with a GitHub link in a footnote (https://cs-zyluo.github.io/CausalNet), but this appears to be a resource the authors used, not a release of their own method's source code. There is no explicit statement about releasing the code for the methodology described in this paper.
Open Datasets | Yes | "To validate the effectiveness and robustness of our proposed method, we select three target languages, Korean, Chinese, and French, to cover diverse cultural and linguistic characteristics. We manually translate the COPA dataset, i.e., 1,000 commonsense causal reasoning questions, into each language, dividing it into development and test question sets of 500 each. As additional development datasets for PSG, we leverage CausalNet [footnote 6] and about 1M English web pages for word alignment." (Footnote 6: CausalNet: https://cs-zyluo.github.io/CausalNet)
Dataset Splits | Yes | "We manually translate the COPA dataset, i.e., 1,000 commonsense causal reasoning questions, into each language, dividing it into development and test question sets of 500 each." (A minimal split sketch is given below the table.)
Hardware Specification | No | No specific hardware details such as CPU/GPU models, memory, or computing platforms used for the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions using Naver Papago for all translations and a Word2Vec model, but provides no version numbers for these or any other software dependencies.
Experiment Setup | No | While the paper studies the effect of varying parameters such as the alignment confidence threshold (θ) and the score weight parameter (λ) in its component study (illustrated below the table), it does not report the hyperparameter values or detailed training configurations (e.g., learning rates, batch sizes, optimizers, epochs) used for the main experimental runs.
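
As a concrete illustration of the Dataset Splits row, the sketch below divides the 1,000 translated COPA questions into development and test sets of 500 each. Only the 1,000-question total and the 500/500 split come from the paper; the file name, the JSON layout, and the use of COPA's original question order (rather than shuffling) are assumptions for illustration.

```python
# Minimal sketch of the 500/500 development/test split described in the
# paper. The file name and JSON structure are hypothetical.
import json

def split_copa(path: str):
    """Split 1,000 translated COPA questions into dev/test sets of 500 each."""
    with open(path, encoding="utf-8") as f:
        questions = json.load(f)  # assumed: a list of 1,000 question dicts
    assert len(questions) == 1000, "COPA contains 1,000 questions"
    return questions[:500], questions[500:]  # (development, test)

dev_set, test_set = split_copa("copa_korean.json")
print(len(dev_set), len(test_set))  # expected: 500 500
```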
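
The Experiment Setup row notes that the paper varies an alignment confidence threshold θ and a score weight parameter λ without reporting the values used. The sketch below shows one plausible reading of those two parameters, where θ filters low-confidence word alignments and λ linearly interpolates two candidate scores; the function names, data shapes, and interpolation form are assumptions, not the paper's published formulation.

```python
# Hypothetical illustration of the two tuned parameters named in the
# component study. Neither function is taken from the paper's method;
# they only show what a "confidence threshold" and a "score weight"
# typically mean in this kind of pipeline.

def filter_alignments(alignments, theta):
    """Keep (source, target, confidence) triples with confidence >= theta."""
    return [(s, t, c) for (s, t, c) in alignments if c >= theta]

def combined_score(score_a: float, score_b: float, lam: float) -> float:
    """Linearly interpolate two scores with weight lam in [0, 1]."""
    return lam * score_a + (1.0 - lam) * score_b

# Example: sweep lambda on a development set, as a component study might.
for lam in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(lam, combined_score(0.8, 0.6, lam))
```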