Layer-Wise Representation Fusion for Compositional Generalization

Authors: Yafang Zheng, Lei Lin, Shuangtao Li, Yuxuan Yuan, Zhaohong Lai, Shan Liu, Biao Fu, Yidong Chen, Xiaodong Shi

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | LRF achieves promising results on two realistic benchmarks, empirically demonstrating the effectiveness of our proposal.
Researcher Affiliation | Collaboration | 1 Department of Artificial Intelligence, School of Informatics, Xiamen University; 2 Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China; 3 Kuaishou Technology, Beijing, China
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes are available at https://github.com/thinkaboutzero/LRF.
Open Datasets | Yes | CoGnition is an English-Chinese (En-Zh) translation dataset...CFQ is automatically generated from a set of rules...Both are cited: "(Li et al. 2021)" and "(Keysers et al. 2020)".
Dataset Splits | Yes | CoGnition: "It consists of a training set of 196,246 sentence pairs, a validation set and a test set of 10,000 samples." CFQ: "Each split dataset consists of a training set of 95,743, a validation set and a test set of 11,968 examples." (See the split-size sketch after this table.)
Hardware Specification | Yes | We use one GeForce GTX 2080Ti for training with 100,000 steps and decoding.
Software Dependencies | No | The paper mentions software like Fairseq, Jieba, Moses tokenizer, GPT2BPE tokenizer, and RoBERTa but does not provide specific version numbers for any of them.
Experiment Setup | Yes | For CoGnition...training with 100,000 steps and decoding. For CFQ...base RoBERTa with 12 encoder layers, which is combined with a Transformer decoder that has 2 decoder layers with hidden size 256 and feed-forward dimension 512...training with 45,000 steps and decoding. (An illustrative model sketch follows this table.)
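To make the reported split sizes easy to check at a glance, here is a minimal sketch that restates them as a small data structure; the dictionary layout and names are illustrative only and are not taken from the authors' code.

```python
# Split sizes as reported in the paper (number of examples per split).
# Layout and names are illustrative; this is not the authors' code.
SPLITS = {
    "CoGnition": {"train": 196_246, "valid": 10_000, "test": 10_000},
    "CFQ (each split)": {"train": 95_743, "valid": 11_968, "test": 11_968},
}

for dataset, splits in SPLITS.items():
    total = sum(splits.values())
    print(f"{dataset}: {splits} (total {total:,} examples)")
```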
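The CFQ encoder-decoder shape described in the Experiment Setup row can be sketched as follows. This is only an illustration of the stated dimensions (a 12-layer base RoBERTa encoder feeding a 2-layer Transformer decoder with hidden size 256 and feed-forward dimension 512), not the authors' LRF implementation; the HuggingFace checkpoint name `roberta-base`, the number of attention heads, the 768-to-256 bridge projection, and the target vocabulary size are all assumptions.

```python
# Illustrative sketch of the CFQ model shape described above. NOT the
# authors' LRF model: checkpoint name, nhead, the 768->256 bridge, and
# the target vocabulary size are assumptions made for this sketch.
import torch.nn as nn
from transformers import RobertaModel

class RobertaEncoderDecoder(nn.Module):
    def __init__(self, tgt_vocab_size: int = 10_000):  # vocab size assumed
        super().__init__()
        self.encoder = RobertaModel.from_pretrained("roberta-base")  # 12 layers, hidden 768
        self.bridge = nn.Linear(768, 256)  # project encoder states to the decoder width
        self.tgt_embed = nn.Embedding(tgt_vocab_size, 256)
        layer = nn.TransformerDecoderLayer(d_model=256, nhead=4,  # nhead assumed
                                           dim_feedforward=512, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.out_proj = nn.Linear(256, tgt_vocab_size)

    def forward(self, src_ids, src_mask, tgt_ids):
        # Encode the source with RoBERTa, then cross-attend from the decoder.
        memory = self.bridge(self.encoder(src_ids, attention_mask=src_mask).last_hidden_state)
        tgt = self.tgt_embed(tgt_ids)
        causal = nn.Transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.decoder(tgt, memory, tgt_mask=causal)
        return self.out_proj(hidden)  # logits over the target vocabulary
```

Per the rows above, such a model would be trained for 45,000 steps on CFQ (100,000 steps for the CoGnition setup) on a single GPU.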