GENSYNTH: Synthesizing Datalog Programs without Language Bias
Authors: Jonathan Mendelson, Aaditya Naik, Mukund Raghothaman, Mayur Naik
AAAI 2021, pp. 6444-6453
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we experimentally evaluate GENSYNTH with respect to the following criteria: 1. Effectiveness: How does GENSYNTH compare to existing approaches that use different kinds of language bias? 2. Generality: How does GENSYNTH perform on diverse tasks compared to a state-of-the-art approach? 3. Robustness: How does GENSYNTH perform on noisy data compared to a state-of-the-art approach? 4. Scalability: How does GENSYNTH scale with the size of the data and the amount of available parallelism? All experiments were run on a Ubuntu 18.04 server with an 18 core Intel Xeon 3 GHz processor and 394 GB memory. |
| Researcher Affiliation | Academia | Jonathan Mendelson1, Aaditya Naik1*, Mukund Raghothaman2, Mayur Naik1. 1 University of Pennsylvania, 2 University of Southern California. {jonom,asnaik,mhnaik}@seas.upenn.edu, raghotha@usc.edu |
| Pseudocode | Yes | Algorithm 1 CreateClause(Rin, rhead). Algorithm 2 Accrete(R, I, O+, O−, fT). Algorithm 3 Reduce(R, I, O+, O−, fT, bp). (A hedged sketch of how such accrete/reduce mutations might drive the search appears after this table.) |
| Open Source Code | Yes | We introduce GENSYNTH1, a template-free end-to-end Datalog synthesis tool. 1Available at https://jonomendelson.github.io/gensynth/ |
| Open Datasets | Yes | We compare GENSYNTH and ProSynth on 42 tasks from three different domains: 17 knowledge discovery tasks frequently used in the artificial intelligence and database literature, 11 common program analysis tasks for statically reasoning about C or Java programs, and 15 relational query tasks from (Wang, Cheung, and Bodik 2017) based on Stack Overflow posts and textbook examples. We use the Countries benchmark as it is the most difficult of the NTP benchmarks (Rocktäschel and Riedel 2017). ... This data was obtained from NTP's GitHub repository, and we use the same sets of present and missing data for both tools. |
| Dataset Splits | Yes | The countries are split into 198 training, 24 validation and 24 testing countries. |
| Hardware Specification | Yes | All experiments were run on an Ubuntu 18.04 server with an 18-core Intel Xeon 3 GHz processor and 394 GB memory. We additionally use an Nvidia 2080 Ti GPU for NTP's scalable implementation (Minervini et al. 2018). |
| Software Dependencies | No | The paper mentions using 'Soufflé (Jordan, Scholz, and Subotić 2016)' as the Datalog interpreter, but does not provide a specific version number for it or any other key software dependencies. |
| Experiment Setup | Yes | We use the following hyperparameters in our experiments: 1. Number of populations: b = 32. 2. Population size: c = 50. 3. Selection ratio: s = 0.2. 4. Number of mutations in each step: n ∼ B, where B = Bin(n, p) is a binomial distribution with n = 15·c1·c2 and p = 0.3. Both c1 and c2 are sampled uniformly at random between 0 and 1. (See the sampling sketch after this table.) |
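The pseudocode row above only names the CreateClause, Accrete, and Reduce procedures. As a hedged illustration of how accrete/reduce mutations, b populations of c candidates, and a selection ratio s could fit together, here is a minimal Python sketch; every identifier in it (`evolve`, `fitness`, `accrete`, `reduce_`) is a placeholder of ours, not GENSYNTH's actual API or algorithm.

```python
import random

def evolve(seed_program, fitness, accrete, reduce_,
           b=32, c=50, s=0.2, generations=1000):
    """Hypothetical evolutionary loop: b populations of c candidate Datalog
    programs, mutated by accrete (grow) or reduce (shrink) steps and filtered
    by keeping the top s fraction each generation. Assumes fitness() maps a
    program to [0, 1], with 1.0 meaning all examples are satisfied."""
    populations = [[seed_program] * c for _ in range(b)]
    for _ in range(generations):
        for i, pop in enumerate(populations):
            # Mutate every candidate: grow it (accrete) or shrink it (reduce).
            mutated = [accrete(p) if random.random() < 0.5 else reduce_(p)
                       for p in pop]
            # Keep the fittest s-fraction and refill the population from it.
            keep = max(1, int(s * c))
            survivors = sorted(mutated, key=fitness, reverse=True)[:keep]
            populations[i] = [random.choice(survivors) for _ in range(c)]
            # Stop early once a candidate satisfies all examples.
            best = max(populations[i], key=fitness)
            if fitness(best) == 1.0:
                return best
    return max((p for pop in populations for p in pop), key=fitness)
```

In the paper, candidate programs are evaluated by running them with Soufflé on the input facts and comparing the output against the positive and negative examples; the `fitness` callable above merely stands in for that evaluation.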
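The mutation-count hyperparameter amounts to drawing c1 and c2 uniformly from [0, 1] and sampling the number of mutations from Bin(n, p) with n = 15·c1·c2 and p = 0.3. A minimal sketch of that sampling step, assuming the fractional n is rounded to an integer (the excerpt does not say how this is handled):

```python
import random

def sample_mutation_count(p=0.3):
    """Draw the per-step mutation count as described in the hyperparameter
    list: c1, c2 ~ Uniform(0, 1), n = 15*c1*c2, count ~ Bin(n, p)."""
    c1, c2 = random.random(), random.random()
    n = round(15 * c1 * c2)                            # assumed rounding of 15*c1*c2
    return sum(random.random() < p for _ in range(n))  # n Bernoulli(p) trials
```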