Unsupervised Explanation Generation via Correct Instantiations
Authors: Sijie Cheng, Zhiyong Wu, Jiangjie Chen, Zhixing Li, Yang Liu, Lingpeng Kong
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on two standard explanation benchmarks, i.e., ComVE and e-SNLI. According to both automatic and human evaluations, NEON outperforms baselines, even for those with human-annotated instantiations. |
| Researcher Affiliation | Collaboration | Sijie Cheng (1,2)*, Zhiyong Wu (1), Jiangjie Chen (2), Zhixing Li (3), Yang Liu (5,6), Lingpeng Kong (1,4); (1) Shanghai Artificial Intelligence Laboratory; (2) Fudan University; (3) Full Truck Alliance; (4) The University of Hong Kong; (5) Institute for AI Industry Research, Tsinghua University; (6) Department of Computer Science and Technology, Tsinghua University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The resources of NEON are available at: https://github.com/Shark-NLP/Neon. |
| Open Datasets | Yes | Our experiments are conducted on the two important explanation benchmarks, ComVE (Wang et al. 2020) and e-SNLI (Camburu et al. 2018). |
| Dataset Splits | Yes | Then they divide all these annotated instances into train/dev/test datasets with 10,000/997/1,000 instances. As for the e-SNLI task, the cn and sn can be seen as entailment and contradiction statements, respectively. After filtering the odd instances with only an entailment or a contradiction statement, the obtained train/dev/test splits are 5,189/3,280/2,640. |
| Hardware Specification | Yes | Our experiments are conducted with 8 A100 GPUs. |
| Software Dependencies | No | The paper mentions using "OPT-175B", "GPT2-large", and "RoBERTa-large" models, but does not provide specific version numbers for underlying software libraries, frameworks, or programming languages (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | In the first phase, given the fixed max length of the context window (n_ctx = 2048), we set the number of examples as K = 16. Moreover, the max length of generated instantiations is 25 for ComVE and 40 for e-SNLI. In the second phase, the max length of generated explanations is 30 for both tasks. The hyper-parameter of Top-p is 0.9, and the temperature is 0 for all generation models. |
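For reference, the setup values quoted in the table can be collected into a small lookup. This is a minimal illustrative sketch, not code from the NEON repository; all names below are assumptions.

```python
# Hypothetical summary of the generation settings reported in the paper:
# K = 16 in-context examples, top-p = 0.9, temperature = 0 (greedy),
# and task-specific max generation lengths per phase.
GENERATION_CONFIG = {
    "phase1": {  # instantiation generation
        "num_in_context_examples": 16,
        "max_new_tokens": {"ComVE": 25, "e-SNLI": 40},
    },
    "phase2": {  # explanation generation
        "max_new_tokens": {"ComVE": 30, "e-SNLI": 30},
    },
    "sampling": {
        "top_p": 0.9,
        "temperature": 0,  # temperature 0 amounts to greedy decoding
    },
}


def max_new_tokens(phase: str, task: str) -> int:
    """Look up the reported max generation length for a phase/task pair."""
    return GENERATION_CONFIG[phase]["max_new_tokens"][task]
```

For example, `max_new_tokens("phase1", "e-SNLI")` returns 40, matching the instantiation length reported for e-SNLI.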