GENSYNTH: Synthesizing Datalog Programs without Language Bias
Authors: Jonathan Mendelson, Aaditya Naik, Mukund Raghothaman, Mayur Naik
AAAI 2021, pp. 6444-6453
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally evaluate GENSYNTH with respect to the following criteria: 1. Effectiveness: How does GENSYNTH compare to existing approaches that use different kinds of language bias? 2. Generality: How does GENSYNTH perform on diverse tasks compared to a state-of-the-art approach? 3. Robustness: How does GENSYNTH perform on noisy data compared to a state-of-the-art approach? 4. Scalability: How does GENSYNTH scale with the size of the data and the amount of available parallelism? All experiments were run on a Ubuntu 18.04 server with an 18 core Intel Xeon 3 GHz processor and 394 GB memory. |
| Researcher Affiliation | Academia | Jonathan Mendelson1, Aaditya Naik1*, Mukund Raghothaman2, Mayur Naik1. 1 University of Pennsylvania, 2 University of Southern California. {jonom,asnaik,mhnaik}@seas.upenn.edu, raghotha@usc.edu |
| Pseudocode | Yes | Algorithm 1 CreateClause(Rin, rhead). Algorithm 2 Accrete(R, I, O+, O−, fT). Algorithm 3 Reduce(R, I, O+, O−, fT, bp). (A hedged sketch of how such accrete/reduce mutations might drive the search appears after this table.) |
| Open Source Code | Yes | We introduce GENSYNTH1, a template-free end-to-end Datalog synthesis tool. 1Available at https://jonomendelson.github.io/gensynth/ |
| Open Datasets | Yes | We compare GENSYNTH and ProSynth on 42 tasks from three different domains: 17 knowledge discovery tasks frequently used in the artificial intelligence and database literature, 11 common program analysis tasks for statically reasoning about C or Java programs, and 15 relational query tasks from (Wang, Cheung, and Bodik 2017) based on Stack Overflow posts and textbook examples. We use the Countries benchmark as it is the most difficult of the NTP benchmarks (Rocktäschel and Riedel 2017). ... This data was obtained from NTP's GitHub repository, and we use the same sets of present and missing data for both tools. |
| Dataset Splits | Yes | The countries are split into 198 training, 24 validation and 24 testing countries. |
| Hardware Specification | Yes | All experiments were run on an Ubuntu 18.04 server with an 18-core Intel Xeon 3 GHz processor and 394 GB memory. We additionally use an Nvidia 2080 Ti GPU for NTP's scalable implementation (Minervini et al. 2018). |
| Software Dependencies | No | The paper mentions using 'Soufflé (Jordan, Scholz, and Subotić 2016)' as the Datalog interpreter, but does not provide a specific version number for it or any other key software dependencies. |
| Experiment Setup | Yes | We use the following hyperparameters in our experiments: 1. Number of populations: b = 32. 2. Population size: c = 50. 3. Selection ratio: s = 0.2. 4. Number of mutations in each step: n ∼ B, where B = Bin(n, p) is a binomial distribution with n = 15·c1·c2 and p = 0.3. Both c1 and c2 are sampled uniformly at random between 0 and 1. (See the sampling sketch after this table.) |
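The pseudocode row above only names the CreateClause, Accrete, and Reduce procedures. As a hedged illustration of how accrete/reduce mutations, b populations of c candidates, and a selection ratio s could fit together, here is a minimal Python sketch; every identifier in it (`evolve`, `fitness`, `accrete`, `reduce_`) is a placeholder of ours, not GENSYNTH's actual API or algorithm.

```python
import random

def evolve(seed_program, fitness, accrete, reduce_,
           b=32, c=50, s=0.2, generations=1000):
    """Hypothetical evolutionary loop: b populations of c candidate Datalog
    programs, mutated by accrete (grow) or reduce (shrink) steps and filtered
    by keeping the top s fraction each generation. Assumes fitness() maps a
    program to [0, 1], with 1.0 meaning all examples are satisfied."""
    populations = [[seed_program] * c for _ in range(b)]
    for _ in range(generations):
        for i, pop in enumerate(populations):
            # Mutate every candidate: grow it (accrete) or shrink it (reduce).
            mutated = [accrete(p) if random.random() < 0.5 else reduce_(p)
                       for p in pop]
            # Keep the fittest s-fraction and refill the population from it.
            keep = max(1, int(s * c))
            survivors = sorted(mutated, key=fitness, reverse=True)[:keep]
            populations[i] = [random.choice(survivors) for _ in range(c)]
            # Stop early once a candidate satisfies all examples.
            best = max(populations[i], key=fitness)
            if fitness(best) == 1.0:
                return best
    return max((p for pop in populations for p in pop), key=fitness)
```

In the paper, candidate programs are evaluated by running them with Soufflé on the input facts and comparing the output against the positive and negative examples; the `fitness` callable above merely stands in for that evaluation.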
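The mutation-count hyperparameter amounts to drawing c1 and c2 uniformly from [0, 1] and sampling the number of mutations from Bin(n, p) with n = 15·c1·c2 and p = 0.3. A minimal sketch of that sampling step, assuming the fractional n is rounded to an integer (the excerpt does not say how this is handled):

```python
import random

def sample_mutation_count(p=0.3):
    """Draw the per-step mutation count as described in the hyperparameter
    list: c1, c2 ~ Uniform(0, 1), n = 15*c1*c2, count ~ Bin(n, p)."""
    c1, c2 = random.random(), random.random()
    n = round(15 * c1 * c2)                            # assumed rounding of 15*c1*c2
    return sum(random.random() < p for _ in range(n))  # n Bernoulli(p) trials
```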