Generating Pragmatic Examples to Train Neural Program Synthesizers

Authors: Saujas Vaduguru, Daniel Fried, Yewen Pu

ICLR 2024

Reproducibility assessment, listing each variable, its result, and the supporting LLM response:
Research Type: Experimental
  "We validate PRAX on the challenging task of synthesizing regular expressions from example strings, and find that our method (1) outperforms models trained without choosing pragmatic examples by 23% (a 51% relative increase), and (2) matches the performance of supervised learning on a dataset of pragmatic examples provided by humans, despite using no human data in training."
Researcher Affiliation: Collaboration
  Saujas Vaduguru, Carnegie Mellon University (svadugur@cs.cmu.edu); Daniel Fried, Carnegie Mellon University (dfried@cs.cmu.edu); Yewen Pu, Autodesk Research (yewen.pu@autodesk.com)
Pseudocode: Yes
  "The full algorithm is detailed in Appendix C." The paper includes Algorithm 1 (outer training loop) and Algorithm 2 (approximate RSA inference).
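The approximate RSA inference named in Algorithm 2 builds on the standard Rational Speech Acts recursion, in which a pragmatic speaker prefers examples that lead a literal listener to recover the intended program. A minimal illustrative sketch of that recursion (the toy meaning matrix and all names below are assumptions for illustration, not the paper's code):

```python
import numpy as np

# meaning[p, u] = 1 if example u is consistent with program p (toy data)
meaning = np.array([
    [1, 1, 0],   # program 0 is consistent with examples 0 and 1
    [1, 0, 1],   # program 1 is consistent with examples 0 and 2
], dtype=float)

def literal_listener(meaning):
    # L0(p | u): uniform over the programs consistent with example u
    return meaning / meaning.sum(axis=0, keepdims=True)

def pragmatic_speaker(meaning):
    # S1(u | p) proportional to L0(p | u): favor examples that best
    # single out the intended program for the literal listener
    l0 = literal_listener(meaning)
    return l0 / l0.sum(axis=1, keepdims=True)

S1 = pragmatic_speaker(meaning)
# For program 0, the unambiguous example 1 receives more speaker
# probability than the ambiguous example 0.
```

In the paper this recursion is approximated with learned neural listener and speaker models rather than an explicit meaning matrix, since the space of programs and examples is too large to enumerate.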
Open Source Code: Yes
  "Our code and data are available at https://github.com/saujasv/generating-pragmatic-examples."

Open Datasets: Yes
  "Our code and data are available at https://github.com/saujasv/generating-pragmatic-examples."
Dataset Splits: Yes
  "We collect a total of 440 program-specification pairs. We sample a small subset of 40 pairs that received 2 correct verifications as a validation set for model selection. We use the other 400 pairs as a training set to finetune the L_θ0 models on human-provided informative examples, obtaining H_FT."
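The 440/40/400 split described above could be reproduced along these lines (the pair representation and the fixed seed are placeholder assumptions; the paper additionally restricts the validation subset to pairs with 2 correct verifications):

```python
import random

# Placeholder stand-ins for the 440 human-collected
# program-specification pairs (illustrative only).
pairs = [f"pair_{i}" for i in range(440)]

random.seed(0)  # assumed seed, for a reproducible split
validation = random.sample(pairs, 40)   # small held-out set for model selection
training = [p for p in pairs if p not in set(validation)]  # 400 pairs for finetuning
```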
Hardware Specification: No
  The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models.
Software Dependencies: No
  The paper mentions ByT5-small models, the greenery Python library, the rstr Python package, and the AdamW optimizer, but does not provide version numbers for these software components.
Experiment Setup: Yes
  "We train the base models on 100,000 programs. For each program, we randomly sample 3 specifications. The length of each specification is chosen to be an integer between 0 and 15, uniformly at random. The models are trained for 1 epoch, using the AdamW optimizer with a learning rate of 5e-5, with the learning rate warmed up over the first 10% of training steps, and then decayed linearly. The batch size is set to 32. ... We train for R_max = 20 rounds with k = 1024 programs per round, generating specifications with up to N_utterances = 10 examples. ... At each round we update the models using the AdamW optimizer for 1 epoch, using a learning rate of 5e-5. The batch size is set to 32."
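The learning-rate schedule described in the setup (linear warmup over the first 10% of training steps, then linear decay) can be sketched as a standalone function; this is an illustrative sketch under those stated hyperparameters, not the authors' training code:

```python
def lr_schedule(step, total_steps, peak_lr=5e-5, warmup_frac=0.1):
    """Linear warmup to peak_lr over the first warmup_frac of steps,
    then linear decay to 0 over the remaining steps."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    if step < warmup_steps:
        # warmup: ramp from peak_lr / warmup_steps up to peak_lr
        return peak_lr * (step + 1) / warmup_steps
    # decay: fall linearly from peak_lr at warmup_steps to 0 at total_steps
    decay_steps = max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / decay_steps)
```

In practice this would be passed to a per-step scheduler alongside AdamW; the function form makes the 10% warmup boundary explicit.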