Program Synthesis with Pragmatic Communication

Authors: Yewen Pu, Kevin Ellis, Marta Kryven, Josh Tenenbaum, Armando Solar-Lezama

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In conducting a user study on Amazon Mechanical Turk, we find that naive end-users communicate more efficiently with a pragmatic program synthesizer compared to its literal variant. We conduct a user study to evaluate how well a naive end-user interacts with a pragmatic program synthesizer (L1) versus a non-pragmatic one (L0). |
| Researcher Affiliation | Academia | Yewen Pu (MIT), Kevin Ellis (MIT), Marta Kryven (MIT), Joshua B. Tenenbaum (MIT), Armando Solar-Lezama (MIT) |
| Pseudocode | No | The paper describes its algorithms using mathematical formulations and descriptive text, but does not include structured pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code: https://github.com/evanthebouncy/program_synthesis_pragmatics |
| Open Datasets | No | Stimuli were 10 representative renderings of programs sampled from the DSL, capturing different concepts such as stripes vs. checkered colour patterns and solid vs. hollow ring shapes. While the DSL is defined, the specific set of 10 stimuli used in the user study is not made public via a link or citation to a dataset. |
| Dataset Splits | No | The paper describes a user study and does not mention explicit training, validation, or test dataset splits in terms of percentages, counts, or predefined splits. |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions various software and frameworks but does not provide specific version numbers for the ancillary software dependencies used in its implementation or experiments. |
| Experiment Setup | Yes | The communication task: subjects were told they are communicating with two robots, either white (L0) or blue (L1). Subjects were given a stimulus (a rendering) and asked to make a robot recreate the pattern by providing a few strategically placed symbols on a scratch grid (a set of examples). Each time the subject places a symbol, the robot guesses the most likely program given the examples and displays its guess as a rendering, giving feedback to the subject. (A minimal sketch of this interaction follows the table.) |
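The behaviour contrasted in the table (the literal L0 robot versus the pragmatic L1 robot) rests on the paper's recursive speaker-listener reasoning: L0 treats the user's placed symbols only as constraints, while L1 also asks which symbols an informative speaker would have chosen to convey the intended program. The snippet below is a minimal sketch of that idea under heavy simplifying assumptions, not the paper's implementation: the one-dimensional binary patterns and the names `literal_listener`, `pragmatic_speaker`, and `pragmatic_listener` are hypothetical stand-ins for the paper's grid-pattern DSL and its incremental inference.

```python
# Illustrative sketch only: a toy literal listener (L0) and pragmatic
# listener (L1) over hand-made "programs". Every pattern and name here is
# hypothetical; the real system works over the paper's grid-pattern DSL.
from itertools import combinations

# Toy "programs": each renders to a 1-D grid of three binary cells.
PROGRAMS = {
    "solid":    [1, 1, 1],
    "left_two": [1, 1, 0],
    "left_one": [1, 0, 0],
}

def consistent(program, examples):
    """A program is consistent if its rendering agrees with every (cell, value) example."""
    pattern = PROGRAMS[program]
    return all(pattern[i] == value for i, value in examples)

def literal_listener(examples):
    """L0: uniform belief over the programs consistent with the given examples."""
    ok = [p for p in PROGRAMS if consistent(p, examples)]
    return {p: (1.0 / len(ok) if p in ok else 0.0) for p in PROGRAMS}

def pragmatic_speaker(program, k):
    """S1: prefers example sets under which L0 assigns the intended program high probability."""
    cells = range(len(PROGRAMS[program]))
    candidates = [tuple((i, PROGRAMS[program][i]) for i in idxs)
                  for idxs in combinations(cells, k)]
    scores = {ex: literal_listener(list(ex))[program] for ex in candidates}
    z = sum(scores.values()) or 1.0
    return {ex: s / z for ex, s in scores.items()}

def pragmatic_listener(examples):
    """L1: belief over programs proportional to how likely S1 would have shown these examples."""
    examples = tuple(examples)
    scores = {p: pragmatic_speaker(p, len(examples)).get(examples, 0.0)
              for p in PROGRAMS}
    z = sum(scores.values()) or 1.0
    return {p: s / z for p, s in scores.items()}

if __name__ == "__main__":
    example = [(1, 1)]  # the user marks the middle cell as filled
    print("L0:", literal_listener(example))    # solid and left_two tie at 0.5
    print("L1:", pragmatic_listener(example))  # left_two is favoured (~0.58)
```

In this toy model, L1 breaks the tie in favour of `left_two` because a speaker who meant `solid` would more likely have marked the rightmost cell, which identifies it uniquely; this is the kind of disambiguation the blue (L1) robot performs after each symbol the subject places, whereas the white (L0) robot only filters programs by consistency.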