MetaTPTrans: A Meta Learning Approach for Multilingual Code Representation Learning

Authors: Weiguo Pian, Hanyu Peng, Xunzhu Tang, Tiezhu Sun, Haoye Tian, Andrew Habib, Jacques Klein, Tegawendé F. Bissyandé

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on the code summarization and code completion tasks to verify the effectiveness of our approach. The results demonstrate the superiority of our approach with significant improvements on state-of-the-art baselines.
Researcher Affiliation | Collaboration | 1 SnT, University of Luxembourg, Luxembourg; 2 Baidu Inc., Beijing, China; 3 CITADEL, Université Virtuelle du Burkina Faso
Pseudocode | No | The paper contains mathematical formulations and architecture diagrams, but no structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is available at: https://github.com/weiguoPian/MetaTPTrans.
Open Datasets | Yes | We conduct our experiments on the CodeSearchNet (Husain et al. 2019) dataset.
Dataset Splits | No | The paper mentions using a 'validation set' for visualization, but it does not provide specific details on the dataset splits (e.g., percentages or absolute counts for training, validation, and testing) within the main body of the paper.
Hardware Specification | Yes | We train our models for 10 and 40 epochs for the code summarization and code completion tasks, respectively, on 4 Tesla V100 GPUs with a batch size of 128 and dropout of 0.2.
Software Dependencies | No | The paper mentions using a bidirectional GRU and the Adam optimizer, but does not provide specific version numbers for the software libraries or frameworks used in the implementation.
Experiment Setup | Yes | For both tasks, following Peng et al. (2021), we set the embedding sizes of the word, path node, and hidden size of the Transformer to 512, 64, and 1024, respectively. A linear layer projects the word embedding into the size of the hidden layer of the Transformer. We use one bidirectional-GRU (Cho et al. 2014) layer of size 64 to encode the paths, and concatenate the final states of both directions as output. We use the Adam (Kingma and Ba 2015) optimizer with a learning rate of 1e-4. We train our models for 10 and 40 epochs for the code summarization and code completion tasks, respectively, on 4 Tesla V100 GPUs with a batch size of 128 and dropout of 0.2. For the Base Learner in the code summarization task, we use the same hyperparameter settings as TPTrans (Peng et al. 2021) for a fair comparison. Specifically, we set the number of encoder and decoder layers to 3 and 8, the number of attention heads to 3, and the dimension of the feed-forward layer to 4096. In the Meta Learner, the dimension of the language type embedding (d_T) and its projection (d_P) are set to 1024 and 2048, respectively. Following Zügner et al. (2021) and Peng et al. (2021), we add the pointer network (Vinyals, Fortunato, and Jaitly 2015) to the decoder. For the code completion task, we set the number of encoder layers, number of heads, and the dimension of feed-forward layers to 5, 8, and 2048, respectively, for all the baselines and our approaches. In the Meta Learner, we set both the dimension of the language type embedding (d_T) and its projection (d_P) to 512.
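
For readers who want to reproduce the reported setup, the sketch below collects the hyperparameters quoted above into plain Python dictionaries, one per task. It is a minimal illustration only, assuming a PyTorch-style training pipeline: the names (COMMON, SUMMARIZATION, COMPLETION) and key spellings are ours and do not correspond to identifiers in the authors' repository.

# Hyperparameters reported in the paper, grouped as plain Python dicts.
# All names and keys are illustrative, not taken from the MetaTPTrans code.

COMMON = {
    "word_embedding_dim": 512,       # word (token) embedding size
    "path_node_embedding_dim": 64,   # AST path-node embedding size
    "transformer_hidden_dim": 1024,  # Transformer hidden size
    "path_encoder": {                # one bidirectional GRU layer over paths;
        "type": "bi-GRU",            # final states of both directions concatenated
        "hidden_size": 64,
        "num_layers": 1,
    },
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 128,
    "dropout": 0.2,
    "num_gpus": 4,                   # 4x Tesla V100 reported
}

SUMMARIZATION = {                    # code summarization task (Base Learner follows TPTrans)
    **COMMON,
    "epochs": 10,
    "encoder_layers": 3,
    "decoder_layers": 8,
    "attention_heads": 3,
    "feed_forward_dim": 4096,
    "meta_learner": {"d_T": 1024, "d_P": 2048},  # language-type embedding / projection sizes
    "pointer_network": True,         # pointer network added to the decoder
}

COMPLETION = {                       # code completion task
    **COMMON,
    "epochs": 40,
    "encoder_layers": 5,
    "attention_heads": 8,
    "feed_forward_dim": 2048,
    "meta_learner": {"d_T": 512, "d_P": 512},
}

if __name__ == "__main__":
    import json
    print(json.dumps(SUMMARIZATION, indent=2))

Printing the dictionary (as in the final lines) is just a quick way to check the collected values against the quoted Experiment Setup text before wiring them into a training script.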