MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning
Authors: Weiguo Pian, Hanyu Peng, Xunzhu Tang, Tiezhu Sun, Haoye Tian, Andrew Habib, Jacques Klein, Tegawendé F. Bissyandé
Venue: AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on the code summarization and code completion tasks to verify the effectiveness of our approach. The results demonstrate the superiority of our approach with significant improvements over state-of-the-art baselines. |
| Researcher Affiliation | Collaboration | (1) SnT, University of Luxembourg, Luxembourg; (2) Baidu Inc., Beijing, China; (3) CITADEL, Université Virtuelle du Burkina Faso |
| Pseudocode | No | The paper contains mathematical formulations and architecture diagrams, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at: https://github.com/weiguoPian/MetaTPTrans. |
| Open Datasets | Yes | We conduct our experiments on the CodeSearchNet (Husain et al. 2019) dataset. |
| Dataset Splits | No | The paper mentions using a 'validation set' for visualization, but it does not provide specific details on the dataset splits (e.g., percentages or absolute counts for training, validation, and testing) within the main body of the paper. |
| Hardware Specification | Yes | We train our models for 10 and 40 epochs for the code summarization and code completion tasks, respectively, on 4 Tesla V100 GPUs with a batch size of 128 and dropout of 0.2. |
| Software Dependencies | No | The paper mentions using bidirectional-GRU and Adam optimizer, but does not provide specific version numbers for software libraries or frameworks used in the implementation. |
| Experiment Setup | Yes | For both tasks, following Peng et al. (2021), we set the embedding sizes of the word, path node, and hidden size of the Transformer to 512, 64, and 1024, respectively. A linear layer projects the word embedding into the size of the hidden layer of the Transformer. We use one bidirectional-GRU (Cho et al. 2014) layer of size 64 to encode the paths, and concatenate the final states of both directions as output. We use the Adam (Kingma and Ba 2015) optimizer with a learning rate of 1e-4. We train our models for 10 and 40 epochs for the code summarization and code completion tasks, respectively, on 4 Tesla V100 GPUs with a batch size of 128 and dropout of 0.2. For the Base Learner in the code summarization task, we use the same hyperparameter settings as TPTrans (Peng et al. 2021) for a fair comparison. Specifically, we set the number of encoder and decoder layers to 3 and 8, the number of attention heads to 3, and the dimension of the feed-forward layer to 4096. In the Meta Learner, the dimension of the language type embedding (d_T) and its projection (d_P) are set to 1024 and 2048, respectively. Following Zügner et al. (2021) and Peng et al. (2021), we add the pointer network (Vinyals, Fortunato, and Jaitly 2015) to the decoder. For the code completion task, we set the number of encoder layers, number of heads, and dimension of feed-forward layers to 5, 8, and 2048, respectively, for all the baselines and our approaches. In the Meta Learner, we set both the dimension of the language type embedding (d_T) and its projection (d_P) to 512. (See the configuration sketch below the table.) |
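
For readers who want to reproduce the reported setup, the hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object. The sketch below is a minimal, hypothetical summary in plain Python: the class and field names are ours for illustration and do not correspond to the released repository's actual configuration API; the numeric values are copied from the quoted setup, with the code-completion configuration overriding only the values the paper reports separately for that task.

```python
# Hedged sketch: a plain-Python summary of the hyperparameters quoted above.
# Names are illustrative, NOT the API of https://github.com/weiguoPian/MetaTPTrans;
# numeric values mirror the reported setup.
from dataclasses import dataclass


@dataclass
class MetaTPTransConfig:
    # Embedding and Transformer sizes shared by both tasks
    word_embedding_dim: int = 512        # projected to hidden_dim by a linear layer
    path_node_embedding_dim: int = 64
    hidden_dim: int = 1024
    path_gru_hidden_dim: int = 64        # one bidirectional GRU over paths; final
                                         # states of both directions are concatenated

    # Optimization (both tasks)
    learning_rate: float = 1e-4          # Adam optimizer
    batch_size: int = 128
    dropout: float = 0.2

    # Transformer depth/width (defaults follow the code summarization setup)
    epochs: int = 10
    encoder_layers: int = 3
    decoder_layers: int = 8              # decoder reported for summarization only
    attention_heads: int = 3             # value as quoted above
    feed_forward_dim: int = 4096
    use_pointer_network: bool = True     # pointer network added to the decoder

    # Meta Learner
    language_embedding_dim: int = 1024   # d_T
    language_projection_dim: int = 2048  # d_P


# Code summarization uses the defaults above (Base Learner matches TPTrans).
summarization_cfg = MetaTPTransConfig()

# Code completion overrides only the values the paper reports for that task.
completion_cfg = MetaTPTransConfig(
    epochs=40,
    encoder_layers=5,
    attention_heads=8,
    feed_forward_dim=2048,
    language_embedding_dim=512,   # d_T
    language_projection_dim=512,  # d_P
)

if __name__ == "__main__":
    print(summarization_cfg)
    print(completion_cfg)
```

Keeping the shared defaults separate from the task-specific overrides makes it easy to see at a glance which settings the paper varies between the code summarization and code completion experiments.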