Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Pyramid Attention For Source Code Summarization
Authors: Lei Chai, Ming Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated it on two source code summarization benchmarks where it surpasses the prior works and achieves new state-of-the-art results. And ablation studies are conducted to show the efficiency of the proposed method. |
| Researcher Affiliation | Academia | Lei Chai and Ming Li National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China EMAIL |
| Pseudocode | No | The paper describes its methods in text and uses diagrams, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | Our code and data are available at https://github.com/leichainju/pa-former. |
| Open Datasets | Yes | To demonstrate the effectiveness of the proposed method, we conduct experiments on two widely-used and well-developed java datasets: EMSE-DeepCom [11], which is collected from GitHub's Java repositories, and FunCom [14], which has 2 million java method-comment pairs. (Footnotes: EMSE-DeepCom, https://github.com/xing-hu/EMSE-DeepCom; FunCom, http://leclair.tech/data/funcom/) |
| Dataset Splits | No | Table 1 provides '#train' and '#test' statistics for the datasets, but there is no explicit mention of a 'validation' dataset split with specific numbers or percentages. |
| Hardware Specification | Yes | All models are trained using NVIDIA Tesla A100 GPUs with a batch size of 64. |
| Software Dependencies | No | The paper mentions using the 'Tree-sitter' tool and a 'PyTorch-based' framework, but it does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For fair comparisons, all the Transformer-based models use the default Transformer configurations with embedding dimension as 512, feedforward dimension as 2048, head number as 8, and layer number for encoder/decoder as 6 and all RNN-based models use the hidden dimension with 512... All models are trained using NVIDIA Tesla A100 GPUs with a batch size of 64. We train all baselines including our models using AdamW optimizer with a multi_step learning rate scheduler, and set the initial learning rate to 0.0002 and 0.003 for Transformer-based and RNN-based models, respectively. |
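
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is an illustration only, not the authors' code: the class and function names are hypothetical, and the scheduler's milestones and decay factor are not given in the quoted text, so the values used in the usage lines are placeholders. The decay rule assumes the usual multi-step behavior, where the learning rate is multiplied by a fixed factor each time a milestone epoch is reached.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the settings quoted in the table."""
    hidden_dim: int          # embedding dim (Transformer) or hidden dim (RNN)
    init_lr: float           # initial learning rate
    batch_size: int = 64     # quoted batch size
    optimizer: str = "AdamW"
    ffn_dim: int = 0         # feedforward dim; 0 means "not applicable"
    num_heads: int = 0       # attention heads; 0 means "not applicable"
    num_layers: int = 0      # encoder/decoder layers; 0 means "not quoted"


# Values taken from the quoted Experiment Setup text.
TRANSFORMER = TrainConfig(hidden_dim=512, init_lr=2e-4,
                          ffn_dim=2048, num_heads=8, num_layers=6)
RNN = TrainConfig(hidden_dim=512, init_lr=3e-3)


def multi_step_lr(init_lr, epoch, milestones, gamma):
    """Learning rate at `epoch` under a multi-step schedule.

    `milestones` and `gamma` are assumptions here; the quoted text
    names the scheduler type but not its parameters.
    """
    # Count how many milestones have been reached by this epoch.
    steps = bisect_right(sorted(milestones), epoch)
    return init_lr * gamma ** steps


# Usage with placeholder milestones and decay factor (not from the paper).
PLACEHOLDER_MILESTONES = [10, 20]
PLACEHOLDER_GAMMA = 0.1
lr_at_start = multi_step_lr(TRANSFORMER.init_lr, 0,
                            PLACEHOLDER_MILESTONES, PLACEHOLDER_GAMMA)
lr_after_first = multi_step_lr(TRANSFORMER.init_lr, 10,
                               PLACEHOLDER_MILESTONES, PLACEHOLDER_GAMMA)
```

Keeping the Transformer and RNN settings in separate frozen instances makes the one quoted difference, the initial learning rate, explicit when comparing baselines.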