Layer-Wise Representation Fusion for Compositional Generalization
Authors: Yafang Zheng, Lei Lin, Shuangtao Li, Yuxuan Yuan, Zhaohong Lai, Shan Liu, Biao Fu, Yidong Chen, Xiaodong Shi
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | LRF achieves promising results on two realistic benchmarks, empirically demonstrating the effectiveness of our proposal. |
| Researcher Affiliation | Collaboration | (1) Department of Artificial Intelligence, School of Informatics, Xiamen University; (2) Key Laboratory of Digital Protection and Intelligent Processing of Intangible Cultural Heritage of Fujian and Taiwan (Xiamen University), Ministry of Culture and Tourism, China; (3) Kuaishou Technology, Beijing, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Codes are available at https://github.com/thinkaboutzero/LRF. |
| Open Datasets | Yes | CoGnition is an English-Chinese (En-Zh) translation dataset...CFQ is automatically generated from a set of rules...Both are cited: "(Li et al. 2021)" and "(Keysers et al. 2020)". |
| Dataset Splits | Yes | "CoGnition...It consists of a training set of 196,246 sentence pairs, a validation set and a test set of 10,000 samples." and "CFQ...Each split dataset consists of a training set of 95,743, a validation set and a test set of 11,968 examples." |
| Hardware Specification | Yes | We use one GeForce GTX 2080Ti for training with 100,000 steps and decoding. |
| Software Dependencies | No | The paper mentions software like Fairseq, Jieba, Moses tokenizer, GPT2BPE tokenizer, and RoBERTa but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For CoGnition...training with 100,000 steps and decoding. For CFQ...base RoBERTa with 12 encoder layers, which is combined with a Transformer decoder that has 2 decoder layers with hidden size 256 and feed-forward dimension 512...training with 45,000 steps and decoding. |
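The CFQ setup row describes a base RoBERTa encoder (12 layers) paired with a 2-layer Transformer decoder (hidden size 256, feed-forward dimension 512). A minimal PyTorch sketch of that decoder configuration, assuming 4 attention heads (the head count is not stated in the paper) and a projection from RoBERTa's 768-dim states down to the decoder's 256-dim hidden size:

```python
import torch
import torch.nn as nn

# Sketch of the CFQ decoder config from the paper: 2 decoder layers,
# hidden size 256, feed-forward dimension 512. nhead=4 is an assumption.
decoder_layer = nn.TransformerDecoderLayer(
    d_model=256, nhead=4, dim_feedforward=512, batch_first=True
)
decoder = nn.TransformerDecoder(decoder_layer, num_layers=2)

# RoBERTa-base outputs 768-dim states; project them to the decoder width.
enc_proj = nn.Linear(768, 256)

# Dummy forward pass with placeholder encoder states and target embeddings.
memory = enc_proj(torch.randn(1, 10, 768))  # (batch, src_len, 256)
tgt = torch.randn(1, 5, 256)                # (batch, tgt_len, 256)
out = decoder(tgt, memory)
print(out.shape)  # torch.Size([1, 5, 256])
```

This is only an illustration of the stated hyperparameters, not the authors' implementation; their released code (https://github.com/thinkaboutzero/LRF) builds the model in Fairseq.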