Learning Graph Structure With A Finite-State Automaton Layer
Authors: Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that this layer can find shortcuts in grid-world graphs and reproduce simple static analyses on Python programs. Additionally, we combine the GFSA layer with a larger graph-based model trained end-to-end on the variable misuse program understanding task, and find that using the GFSA layer leads to better performance than using hand-engineered semantic edges or other baseline methods for adding learned edge types. |
| Researcher Affiliation | Industry | Daniel D. Johnson, Hugo Larochelle, Daniel Tarlow Google Research {ddjohnson, hugolarochelle, dtarlow}@google.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | An implementation is available at https://github.com/google-research/google-research/tree/master/gfsa. |
| Open Datasets | Yes | We first generate a synthetic dataset of Python programs by sampling from a probabilistic context-free grammar over a subset of Python. We then transform the corresponding ASTs into graphs, and compute the three edge types NEXTCONTROLFLOW, LASTREAD, and LASTWRITE... Following Hellendoorn et al. [19], we use a dataset of small code samples from a permissively-licensed subset of the ETH 150k Python dataset [33], where synthetic variable misuse bugs have been introduced in half of the examples by randomly replacing one of the identifiers with a different identifier in that program (dataset: https://github.com/google-research-datasets/great). A toy AST-to-graph sketch follows the table. |
| Dataset Splits | No | Table 1 shows results of each of these models on the three edge classification tasks. We present results after training on a dataset of 100,000 examples as well as on a smaller dataset of only 100 examples, and report F1 scores at the best classification threshold; we choose the model with the best validation performance from a 32-job random hyperparameter search. A sketch of the best-threshold F1 metric follows the table. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running experiments. |
| Software Dependencies | No | We implement the forward and backward passes using the automatic differentiation package JAX [8], which makes it straightforward to use implicit differentiation with an efficient matrix-vector product implementation that avoids materializing the full transition matrix Q_{n0} for each value of n0 (see appendix C for details). The paper mentions a software package (JAX) but does not provide version numbers for it or any other key software dependencies. A sketch of the implicit-differentiation pattern follows the table. |
| Experiment Setup | Yes | We use the focal-loss objective [29], a more stable variant of the cross-entropy loss for highly unbalanced classification problems, minimizing... We choose the model with the best validation performance from a 32-job random hyperparameter search... We consider two graph neural network architectures: either an eight-layer RAT model [42] or eight GGNN blocks [27] with two message passing iterations per block... A focal-loss sketch follows the table. |
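
The AST-to-graph conversion mentioned in the Open Datasets row can be illustrated with Python's built-in `ast` module. This is only a sketch of the general idea: the function name `ast_to_graph` and the plain parent-child edges are illustrative placeholders, and the paper's graphs additionally carry NEXTCONTROLFLOW, LASTREAD, and LASTWRITE edges computed by static analysis, which this code does not attempt.

```python
import ast

def ast_to_graph(source):
    """Toy AST-to-graph conversion: nodes are AST node types, edges are child links."""
    tree = ast.parse(source)
    walk_order = list(ast.walk(tree))
    index = {id(node): i for i, node in enumerate(walk_order)}
    nodes = [type(node).__name__ for node in walk_order]
    edges = [(index[id(parent)], index[id(child)], "child")
             for parent in walk_order
             for child in ast.iter_child_nodes(parent)]
    return nodes, edges

# Example: a tiny program with a write and a read of `x`.
print(ast_to_graph("x = 1\ny = x + 2"))
```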
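
The Dataset Splits row reports F1 scores at the best classification threshold. Below is a minimal NumPy sketch of that metric, assuming a simple grid sweep over thresholds; the paper does not specify how the threshold search is performed, so the grid is an assumption for illustration.

```python
import numpy as np

def best_threshold_f1(scores, labels, thresholds=np.linspace(0.0, 1.0, 101)):
    """F1 at the best classification threshold, found by a simple grid sweep."""
    best = 0.0
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & (labels == 1))
        fp = np.sum(pred & (labels == 0))
        fn = np.sum(~pred & (labels == 1))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            best = max(best, 2 * precision * recall / (precision + recall))
    return best
```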
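
The Software Dependencies row describes implementing the forward and backward passes in JAX with implicit differentiation and a matrix-vector product that avoids materializing the full transition matrix. The sketch below shows that general pattern with `jax.lax.custom_linear_solve`, assuming a substochastic transition matvec `q_matvec` (a hypothetical dense stand-in for the layer's real transition computation) and a Richardson-iteration solver; it is not the authors' implementation, which is in the linked gfsa repository.

```python
import jax
import jax.numpy as jnp

def q_matvec(params, v):
    # Dense stand-in for illustration only; the real layer would compute Q @ v
    # from per-node transition parameters without building the full matrix.
    return params @ v

def absorbing_visits(params, b, num_iters=200):
    """Solve (I - Q) x = b matrix-free, with implicitly defined gradients."""
    a_matvec = lambda v: v - q_matvec(params, v)

    def richardson_solve(mv, rhs):
        # x_{k+1} = x_k + (rhs - A x_k); converges when Q is substochastic.
        body = lambda _, x: x + (rhs - mv(x))
        return jax.lax.fori_loop(0, num_iters, body, jnp.zeros_like(rhs))

    # custom_linear_solve supplies gradients through an adjoint solve instead
    # of backpropagating through the solver iterations.
    return jax.lax.custom_linear_solve(
        a_matvec, b, solve=richardson_solve, transpose_solve=richardson_solve)

# Toy usage: gradient of a summary statistic with respect to the transition matrix.
Q = 0.1 * jax.random.uniform(jax.random.PRNGKey(0), (8, 8))
b = jnp.ones(8)
value, grad_Q = jax.value_and_grad(lambda p: absorbing_visits(p, b).sum())(Q)
```

Because the solve is registered via `custom_linear_solve`, reverse-mode gradients with respect to the parameters closed over by the matvec come from a single adjoint solve rather than from unrolling the iterations.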
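
The Experiment Setup row cites the focal-loss objective of Lin et al. [29]. A minimal JAX sketch of the binary form is given below, assuming per-edge probabilities `p` and binary targets `y`; the `gamma` and `alpha` values are the common defaults from the focal-loss paper, not necessarily the settings used here.

```python
import jax.numpy as jnp

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss: down-weights easy examples via the (1 - p_t)**gamma factor."""
    p = jnp.clip(p, eps, 1.0 - eps)
    p_t = jnp.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = jnp.where(y == 1, alpha, 1.0 - alpha)  # class-rebalancing weight
    return -jnp.mean(alpha_t * (1.0 - p_t) ** gamma * jnp.log(p_t))
```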