Sampling for Bayesian Program Learning
Authors: Kevin Ellis, Armando Solar-Lezama, Josh Tenenbaum
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We give theoretical guarantees on how well the samples approximate the true posterior, and have empirical results showing the algorithm is efficient in practice, evaluating our approach on 22 program learning problems in the domains of text editing and computer-aided programming. |
| Researcher Affiliation | Academia | Kevin Ellis Brain and Cognitive Sciences MIT ellisk@mit.edu Armando Solar-Lezama CSAIL MIT asolar@csail.mit.edu Joshua B. Tenenbaum Brain and Cognitive Sciences MIT jbt@mit.edu |
| Pseudocode | Yes | Algorithm 1 PROGRAMSAMPLE |
| Open Source Code | No | The paper mentions and provides a link for Crypto Mini SAT [17] ('http://www.msoos.org/documentation/cryptominisat/'), which is a third-party tool used by the authors. However, it does not provide any statement or link for the open-source code of their own described methodology (PROGRAMSAMPLE). |
| Open Datasets | Yes | Because Flash Fill s training set is not yet public, we drew text editing problems from [19] and adapted them to our subset of Flash Fill, giving 19 problems, each with 5 training examples. |
| Dataset Splits | Yes | The supplement contains these text edit problems. We are interested both in the ability of the learner to generalize and in PROGRAMSAMPLE s ability to generate samples quickly. Table 1 shows the average time per sampling attempt using PROGRAMSAMPLE, which is on the order of a minute. These text edit problems come from distributions with extremely high tilt: often the smallest program is only tens of bits long, but the program space contains (implausible) solutions with over 100 bits. By putting d to |x | n we eliminate the tilt correction and recover a variant of the approaches in [7]. This baseline does not produce any samples for any of our text edit problems in under an hour.4 Other baselines also failed to produce samples in a reasonable amount of time (see supplement). For example, pure rejection sampling (drawing from the prior) is also infeasible, with consistent programs having prior probability 2 50 in some cases. |
| Hardware Specification | No | The paper does not provide specific hardware details such as CPU or GPU models, memory, or cloud computing instance types used for running the experiments. It only mentions using 'Crypto Mini SAT [17]', which is software. |
| Software Dependencies | No | The paper mentions using 'Crypto Mini SAT [17]' but does not specify a version number for this or any other software dependency, which is required for reproducibility. |
| Experiment Setup | Yes | Figure 7 shows this comparison for a counting problem with = 3 and γ = 4. |