Showing versus doing: Teaching by demonstration
Authors: Mark K. Ho, Michael Littman, James MacGlashan, Fiery Cushman, Joseph L. Austerweil
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In two experiments, we show that human participants modify their teaching behavior consistent with the predictions of our model. Further, we show that even standard IRL algorithms benefit when learning from showing versus doing. |
| Researcher Affiliation | Academia | Mark K. Ho, Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, Providence, RI 02912, mark_ho@brown.edu; Michael L. Littman, Department of Computer Science, Brown University, Providence, RI 02912, mlittman@cs.brown.edu; James MacGlashan, Department of Computer Science, Brown University, Providence, RI 02912, james_macglashan@brown.edu; Fiery Cushman, Department of Psychology, Harvard University, Cambridge, MA 02138, cushman@fas.harvard.edu; Joseph L. Austerweil, Department of Psychology, University of Wisconsin-Madison, Madison, WI 53706, austerweil@wisc.edu |
| Pseudocode | Yes | Algorithm 1 Pedagogical Trajectory Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the methodology described within it. |
| Open Datasets | No | The paper describes experiments with human participants whose trajectories form the data. It does not mention using or providing access to a publicly available dataset for training models or analysis. |
| Dataset Splits | No | The paper does not provide specific information about training, validation, or test dataset splits for reproducibility. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments. |
| Software Dependencies | No | The paper describes algorithms and models (e.g., Maximum Likelihood IRL) but does not list specific software dependencies with version numbers, such as Python versions or library versions. |
| Experiment Setup | Yes | Model trajectories are the two with the highest probability (λ = 2, α = 1.0, p_min = 10⁻⁶, l_max = 4). Sixty Amazon Mechanical Turk participants performed the task in Figure 1. Safe tiles were worth 0 points, dangerous tiles were worth -2 points, and the terminal goal tile was worth 5 points. They also won an additional 5 points for each round completed for a total of 10 points. Each point was worth 2 cents of bonus. We constrained non-goal feature weights to be non-positive. |
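
The experiment-setup row pins down the gridworld rewards and the model parameters but not an implementation. As a rough illustration of how those numbers fit together, here is a minimal Python sketch that enumerates goal-reaching trajectories of at most l_max steps on a toy grid and places a Boltzmann (softmax) distribution over trajectory returns with λ = 2. The grid layout, the function names, and the simple Boltzmann-over-returns reading of λ are assumptions for illustration; this is not the paper's Algorithm 1.

```python
import math

# Hypothetical 3x3 grid: 'S' safe, 'D' dangerous, 'G' goal.
# Tile rewards follow the paper's setup: safe = 0, dangerous = -2, goal = +5.
GRID = ["SDS",
        "SDS",
        "SGS"]
REWARD = {"S": 0.0, "D": -2.0, "G": 5.0}
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def trajectory_return(cells):
    """Sum of tile rewards along a trajectory (list of (row, col) cells)."""
    return sum(REWARD[GRID[r][c]] for r, c in cells)

def enumerate_trajectories(start, l_max=4):
    """All paths from `start` of at most l_max steps that end on the goal tile."""
    trajectories = []
    def extend(path):
        r, c = path[-1]
        if GRID[r][c] == "G":          # trajectory terminates at the goal
            trajectories.append(path)
            return
        if len(path) - 1 >= l_max:     # step budget exhausted
            return
        for dr, dc in MOVES:
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]):
                extend(path + [(nr, nc)])
    extend([start])
    return trajectories

def boltzmann_over_trajectories(trajectories, lam=2.0):
    """Softmax over trajectory returns with inverse temperature lam (λ)."""
    weights = [math.exp(lam * trajectory_return(t)) for t in trajectories]
    total = sum(weights)
    return [w / total for w in weights]

trajs = enumerate_trajectories(start=(0, 0), l_max=4)
probs = boltzmann_over_trajectories(trajs, lam=2.0)

# The two highest-probability trajectories play the role of the "model
# trajectories" mentioned in the setup row.
for p, t in sorted(zip(probs, trajs), reverse=True)[:2]:
    print(f"p={p:.3f}  path={t}")
```

Under this reading, λ controls how sharply the demonstrator prefers higher-return paths: with λ = 2 and the -2 penalty on dangerous tiles, paths that skirt the dangerous column dominate the distribution, which matches the intuition that a teacher "shows" by exaggerating reward-relevant choices rather than merely acting optimally.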