How to talk so AI will learn: Instructions, descriptions, and autonomy

Authors: Theodore Sumers, Robert Hawkins, Mark K. Ho, Tom Griffiths, Dylan Hadfield-Menell

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We validate our models with a behavioral experiment, demonstrating that (1) our speaker model predicts human behavior, and (2) our pragmatic listener successfully recovers humans' reward functions." |
| Researcher Affiliation | Academia | Theodore R. Sumers, Computer Science, Princeton University (sumers@princeton.edu); Robert D. Hawkins, Princeton Neuroscience Institute, Princeton University (rdhawkins@princeton.edu); Mark K. Ho, Computer Science, Princeton University (mho@princeton.edu); Thomas L. Griffiths, Computer Science and Psychology, Princeton University (tomg@princeton.edu); Dylan Hadfield-Menell, EECS, CSAIL, MIT (dhm@csail.mit.edu) |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Code and data are available at https://github.com/tsumers/how-to-talk. |
| Open Datasets | Yes | Code and data are available at https://github.com/tsumers/how-to-talk. |
| Dataset Splits | No | The paper describes calibrating model parameters (e.g., "To calibrate our pragmatic listeners, we tested βS1 ∈ [1, 10] and found that βS1 = 3 optimized Known H and Latent H listeners"), but it does not explicitly provide training/validation/test splits for the human behavioral dataset collected in the experiment, so the data partitioning cannot be reproduced. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used to run its experiments or simulations. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | "To calibrate our pragmatic listeners, we tested βS1 ∈ [1, 10] and found that βS1 = 3 optimized Known H and Latent H listeners (see Appendix B.3 for details)" and "we fix βL0 = 3 throughout this work". |
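The βS1 sweep quoted above is a standard calibration pattern: grid-search a softmax rationality parameter and keep the value that best fits human responses. A minimal sketch of that procedure follows; the function names, the toy data, and the use of log-likelihood as the fit criterion are assumptions for illustration, not the paper's actual implementation (which lives in the linked repository).

```python
import numpy as np

def softmax(utilities, beta):
    """Boltzmann-rational choice distribution over options with the given utilities."""
    z = beta * np.asarray(utilities, dtype=float)
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def listener_log_likelihood(beta, trial_utilities, human_choices):
    """Log-likelihood of observed human choices under a softmax listener."""
    ll = 0.0
    for utilities, choice in zip(trial_utilities, human_choices):
        ll += np.log(softmax(utilities, beta)[choice])
    return ll

def calibrate_beta(trial_utilities, human_choices, candidates=range(1, 11)):
    """Grid search over candidate beta values, mirroring the paper's [1, 10] sweep."""
    return max(candidates,
               key=lambda b: listener_log_likelihood(b, trial_utilities, human_choices))
```

On real behavioral data the sweep would select an intermediate value (the paper reports βS1 = 3); on toy data where humans always pick the highest-utility option, the largest candidate wins, since a sharper softmax assigns that choice more probability.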