How could Neural Networks understand Programs?
Authors: Dinglan Peng, Shuxin Zheng, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the performance of OSCAR on several semantics understanding tasks for programs in this section. We first apply our model to a practical and important software engineering task, i.e., binary diffing. After that, we evaluate the performance of OSCAR for high-level PL understanding on the algorithm classification task. Furthermore, as a pre-training method, we investigate the performance of OSCAR in zero-shot learning, where the parameters of OSCAR are fixed. Finally, we analyze the components of our model in the ablation study. |
| Researcher Affiliation | Collaboration | ¹University of Science and Technology of China; ²Microsoft Research Asia. |
| Pseudocode | No | The paper describes methods and steps but does not contain a clearly labeled pseudocode block or algorithm figure. |
| Open Source Code | Yes | Code and models are released at: https://github.com/pdlan/OSCAR. |
| Open Datasets | Yes | We conduct the experiments on the POJ-104 dataset (Mou et al., 2016), which contains 104 algorithm problems that were submitted to an online judge system. We conduct the pre-training of OSCAR on a large corpus of real-world programs from publicly available open-source GitHub repositories, which covers a broad range of disciplines from operating systems and compilers, to machine learning systems and linear algebra subprograms (Details in Appendix F.1). |
| Dataset Splits | No | The paper mentions using POJ-104 dataset and following the experimental setting of a previous paper (Cummins et al., 2020a;b), but it does not explicitly provide the specific training/validation/test dataset splits (percentages or sample counts) for its own experiments, nor for the large pre-training corpus. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using LLVM IR and that programs are compiled with GCC 7.5.0, but it does not provide specific version numbers for other key software components, libraries, or frameworks used for implementing and running the models (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | Unless otherwise specified, all experiments are conducted on a 12-layer OSCAR model, composed sequentially of three token-level encoder layers, six instruction-level encoder layers, and three token-level encoder layers (see the configuration sketch below the table). We follow RoBERTa-base (Liu et al., 2019) to set other model configurations (Details in Appendix B), e.g., the dimensionality of the hidden representation d is set to 768. The total sequence length of the Inst. encoder is set to 512, where the IR and Env. encoders each account for 256 instructions. We set K = 4 in our experiments. |
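
To make the reported setup concrete, here is a minimal PyTorch sketch of the 3 + 6 + 3 encoder stack and the sequence-length budget described above. All names (`OscarConfig`, `build_encoder_stack`, field names) are illustrative assumptions, not the identifiers used in the released code at https://github.com/pdlan/OSCAR, and plain `nn.TransformerEncoderLayer` blocks stand in for the paper's token-level and instruction-level layers.

```python
from dataclasses import dataclass

import torch.nn as nn

# Hypothetical sketch of the 12-layer OSCAR configuration reported in the
# paper: three token-level layers, six instruction-level layers, then three
# token-level layers, with RoBERTa-base-style hidden size.

@dataclass
class OscarConfig:
    d_model: int = 768          # hidden representation dimensionality d
    n_heads: int = 12           # assumed RoBERTa-base default
    inst_seq_len: int = 512     # total Inst. encoder sequence length
    ir_len: int = 256           # instructions allotted to the IR encoder
    env_len: int = 256          # instructions allotted to the Env. encoder
    token_layers_pre: int = 3   # token-level layers before the Inst. encoder
    inst_layers: int = 6        # instruction-level encoder layers
    token_layers_post: int = 3  # token-level layers after the Inst. encoder

def build_encoder_stack(cfg: OscarConfig) -> nn.ModuleList:
    """Assemble the sequential 3 + 6 + 3 Transformer layer stack."""
    def layer() -> nn.TransformerEncoderLayer:
        return nn.TransformerEncoderLayer(
            d_model=cfg.d_model, nhead=cfg.n_heads, batch_first=True)
    n_layers = cfg.token_layers_pre + cfg.inst_layers + cfg.token_layers_post
    return nn.ModuleList([layer() for _ in range(n_layers)])

cfg = OscarConfig()
stack = build_encoder_stack(cfg)
# Sanity checks against the numbers quoted from the paper.
assert len(stack) == 12
assert cfg.ir_len + cfg.env_len == cfg.inst_seq_len
```

The sketch only pins down the layer counts and sequence budget; how the token-level and instruction-level layers exchange representations is specific to OSCAR and should be taken from the released code.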