LogiGAN: Learning Logical Reasoning via Adversarial Pre-training

Authors: Xinyu Pi, Wanjun Zhong, Yan Gao, Nan Duan, Jian-Guang Lou

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Both base- and large-size language models pre-trained with LogiGAN demonstrate clear performance improvements on 12 datasets requiring general reasoning abilities, revealing the fundamental role of logic in broad reasoning as well as the effectiveness of LogiGAN. Ablation studies on LogiGAN components reveal the relative orthogonality between linguistic and logical abilities and suggest that the facilitation effect of reflective thinking might also generalize to machine learning.
Researcher Affiliation | Collaboration | 1. University of Illinois Urbana-Champaign, Urbana, USA; 2. Sun Yat-sen University; 3. Microsoft Research Asia
Pseudocode | Yes | Algorithm 1: Adversarial Training Process (a hedged sketch of a generator/verifier loop of this kind is given after this table)
Open Source Code | Yes | The code is released at https://github.com/microsoft/ContextualSP/tree/master/logigan
Open Datasets | Yes | To test the effectiveness of LogiGAN, we extensively experiment on 12 datasets requiring general reasoning via natural language. Specifically, ReClor (Yu et al., 2020), LogiQA (Liu et al., 2021a), and Adversarial NLI (ANLI; Nie et al., 2019) focus especially on logical reasoning; TellMeWhy (Lal et al., 2021) on abductive reasoning; HotpotQA (Yang et al., 2018a) on multi-hop reasoning; QuoRef (Dasigi et al., 2019) on reasoning with co-reference resolution; MuTual (Cui et al., 2020), DREAM (Sun et al., 2019), and SAMSum (Gliwa et al., 2019) on reasoning in conversational scenarios; and NarrativeQA (Kočiský et al., 2018), RACE (Lai et al., 2017), and XSum (Narayan et al., 2018) on general verbal reasoning.
Dataset Splits | No | The paper mentions evaluating on 'development sets' in Table 1, but it does not provide specific training/validation/test split percentages or sample counts for any of the datasets used.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments; it only defers to Appendix D for implementation details.
Software Dependencies | No | The paper mentions using T5 and ALBERT-large models, but it does not specify version numbers for these models or for any underlying software frameworks (e.g., PyTorch, TensorFlow) or libraries used to implement the method.
Experiment Setup | No | The paper states, 'We leave discussions of the rest implementation details and hyper-parameter settings of pre-training and downstream fine-tuning in Appendix D.' This indicates that the specific experimental setup details are not present in the main text provided.
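
The pseudocode row above refers to the paper's Algorithm 1 (Adversarial Training Process). As a rough illustration only, the sketch below alternates a supervised update of a T5 generator that recovers masked logical conclusions with an update of a binary verifier that separates gold from generated conclusions. The model choices, the (premise, conclusion) data format, and the helper `adversarial_step` are assumptions made here for illustration; they are not taken from the paper or its released code.

```python
# Schematic generator/verifier adversarial pre-training step (illustration only).
# Model names, the data format, and all helpers below are assumptions, not the
# paper's actual Algorithm 1.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          T5ForConditionalGeneration, T5TokenizerFast)

gen_tok = T5TokenizerFast.from_pretrained("t5-base")
generator = T5ForConditionalGeneration.from_pretrained("t5-base")
ver_tok = AutoTokenizer.from_pretrained("albert-large-v2")
verifier = AutoModelForSequenceClassification.from_pretrained(
    "albert-large-v2", num_labels=2)

gen_opt = torch.optim.AdamW(generator.parameters(), lr=1e-4)
ver_opt = torch.optim.AdamW(verifier.parameters(), lr=1e-5)


def adversarial_step(premises, gold_conclusions):
    """One alternating update: the generator learns to recover the masked
    conclusion, then the verifier learns to tell gold from generated ones."""
    # 1) Generator update: ordinary seq2seq loss on the gold conclusion.
    enc = gen_tok(premises, return_tensors="pt", padding=True, truncation=True)
    labels = gen_tok(gold_conclusions, return_tensors="pt", padding=True,
                     truncation=True).input_ids
    labels[labels == gen_tok.pad_token_id] = -100  # ignore padding in the loss
    gen_loss = generator(**enc, labels=labels).loss
    gen_opt.zero_grad()
    gen_loss.backward()
    gen_opt.step()

    # 2) Sample conclusions from the current generator to use as negatives.
    with torch.no_grad():
        fake_ids = generator.generate(**enc, max_new_tokens=64)
    fakes = gen_tok.batch_decode(fake_ids, skip_special_tokens=True)

    # 3) Verifier update: binary classification of (premise, conclusion) pairs,
    #    with gold conclusions labeled 1 and generated ones labeled 0.
    pairs = [p + " " + c for p, c in
             zip(list(premises) + list(premises), list(gold_conclusions) + fakes)]
    ver_labels = torch.tensor([1] * len(gold_conclusions) + [0] * len(fakes))
    ver_in = ver_tok(pairs, return_tensors="pt", padding=True, truncation=True)
    ver_loss = verifier(**ver_in, labels=ver_labels).loss
    ver_opt.zero_grad()
    ver_loss.backward()
    ver_opt.step()
    return gen_loss.item(), ver_loss.item()
```

Note that gradients cannot flow from the verifier back to the generator through discrete generated text, so the paper's Algorithm 1 handles verifier-to-generator feedback differently than a plain alternating loop; the released code linked above is the authoritative reference for the exact procedure.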