NS3: Neuro-symbolic Semantic Code Search

Authors: Shushan Arakelyan, Anna Hakhverdyan, Miltiadis Allamanis, Luis Garcia, Christophe Hauser, Xiang Ren

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our model NS3 (Neuro-Symbolic Semantic Search) to a number of baselines, including state-of-the-art semantic code retrieval methods, and evaluate on two datasets, CodeSearchNet and CoSQA (Code Search and Question Answering). We demonstrate that our approach results in more precise code retrieval, and we study the effectiveness of our modular design when handling compositional queries. We evaluate our proposed NS3 model on two SCS datasets, CodeSearchNet (CSN) [24] and CoSQA/WebQueryTest [23]. Additionally, we experiment with limited CSN training set sizes of 10K and 5K examples. We find that NS3 provides large improvements upon baselines in all cases. Our experiments demonstrate that the resulting model is more sensitive to small but semantically significant changes in the query, and is more likely to correctly recognize that a modified query no longer matches its code pair.
Researcher Affiliation | Collaboration | Shushan Arakelyan (University of Southern California, Department of Computer Science), Anna Hakhverdyan (National Polytechnic University of Armenia), Miltiadis Allamanis (Microsoft Research Cambridge), Luis Garcia (USC Information Sciences Institute), Christophe Hauser (USC Information Sciences Institute), Xiang Ren (University of Southern California, Department of Computer Science)
Pseudocode | No | The paper does not contain any sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm'.
Open Source Code | Yes | Code and data are available at https://github.com/ShushanArakelyan/modular_code_search
Open Datasets | Yes | We conduct experiments on two datasets: the Python portion of CodeSearchNet (CSN) [24] and CoSQA [23].
Dataset Splits | No | We use early stopping with evaluation on an unseen validation set for model selection during action module pretraining and end-to-end training. The paper mentions using a 'validation set' but does not specify the exact percentages or counts for this split.
Hardware Specification | No | The paper does not specify any hardware details such as exact GPU/CPU models, memory, or cloud instance types used for running experiments. While it references CodeBERT, it does not state the hardware used for their own experiments.
Software Dependencies | No | The paper mentions software like the 'RoBERTa model' and 'CodeBERT model' and the programming language 'Python', but it does not specify exact version numbers for these or any other libraries or frameworks used in their implementation.
Experiment Setup | Yes | The MLPs in the entity discovery and action modules have 2 layers with an input dimension of 768. We use dropout in these networks with rate 0.1. The learning rate for the pretraining and end-to-end training phases was chosen from the range of 1e-6 to 6e-5. We use early stopping with evaluation on an unseen validation set for model selection during action module pretraining and end-to-end training. For entity discovery model selection we performed manual inspection of produced scores on unseen examples. For fine-tuning the CuBERT, CodeBERT, and GraphCodeBERT baselines we use the hyperparameters reported in their original papers. For RoBERTa (code), we perform the search for the learning rate during the fine-tuning stage in the same interval as for our model. For model selection on baselines we also use early stopping.
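
To make the reported setup concrete, below is a minimal PyTorch sketch of a 2-layer MLP matching the stated configuration (input dimension 768, dropout rate 0.1), with a learning rate drawn from the reported search range of 1e-6 to 6e-5. The hidden width, ReLU activation, scalar output, and choice of Adam optimizer are assumptions not stated in the paper.

```python
import torch.nn as nn
import torch.optim as optim

class TwoLayerMLP(nn.Module):
    """Sketch of the 2-layer MLP described in the experiment setup:
    input dimension 768, dropout rate 0.1. The hidden width, ReLU
    activation, and scalar output are assumptions; the paper does
    not specify them."""

    def __init__(self, in_dim: int = 768, hidden_dim: int = 768,
                 out_dim: int = 1, dropout: float = 0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),   # layer 1
            nn.ReLU(),
            nn.Dropout(dropout),             # dropout rate 0.1 per the paper
            nn.Linear(hidden_dim, out_dim),  # layer 2
        )

    def forward(self, x):
        return self.net(x)

# Learning rate taken from within the reported search range (1e-6 to 6e-5);
# Adam is an assumption -- the paper does not name the optimizer.
model = TwoLayerMLP()
optimizer = optim.Adam(model.parameters(), lr=3e-5)
```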