reproducibilityindex.ai

Sampling for Bayesian Program Learning

Authors: Kevin Ellis, Armando Solar-Lezama, Josh Tenenbaum

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We give theoretical guarantees on how well the samples approximate the true posterior, and have empirical results showing the algorithm is efﬁcient in practice, evaluating our approach on 22 program learning problems in the domains of text editing and computer-aided programming.
Researcher Affiliation	Academia	Kevin Ellis Brain and Cognitive Sciences MIT ellisk@mit.edu Armando Solar-Lezama CSAIL MIT asolar@csail.mit.edu Joshua B. Tenenbaum Brain and Cognitive Sciences MIT jbt@mit.edu
Pseudocode	Yes	Algorithm 1 PROGRAMSAMPLE
Open Source Code	No	The paper mentions and provides a link for Crypto Mini SAT [17] ('http://www.msoos.org/documentation/cryptominisat/'), which is a third-party tool used by the authors. However, it does not provide any statement or link for the open-source code of their own described methodology (PROGRAMSAMPLE).
Open Datasets	Yes	Because Flash Fill s training set is not yet public, we drew text editing problems from [19] and adapted them to our subset of Flash Fill, giving 19 problems, each with 5 training examples.
Dataset Splits	Yes	The supplement contains these text edit problems. We are interested both in the ability of the learner to generalize and in PROGRAMSAMPLE s ability to generate samples quickly. Table 1 shows the average time per sampling attempt using PROGRAMSAMPLE, which is on the order of a minute. These text edit problems come from distributions with extremely high tilt: often the smallest program is only tens of bits long, but the program space contains (implausible) solutions with over 100 bits. By putting d to \|x \| n we eliminate the tilt correction and recover a variant of the approaches in [7]. This baseline does not produce any samples for any of our text edit problems in under an hour.4 Other baselines also failed to produce samples in a reasonable amount of time (see supplement). For example, pure rejection sampling (drawing from the prior) is also infeasible, with consistent programs having prior probability 2 50 in some cases.
Hardware Specification	No	The paper does not provide specific hardware details such as CPU or GPU models, memory, or cloud computing instance types used for running the experiments. It only mentions using 'Crypto Mini SAT [17]', which is software.
Software Dependencies	No	The paper mentions using 'Crypto Mini SAT [17]' but does not specify a version number for this or any other software dependency, which is required for reproducibility.
Experiment Setup	Yes	Figure 7 shows this comparison for a counting problem with = 3 and γ = 4.