Interaction Considerations in Learning from Humans

Authors: Pallavi Koppol, Henny Admoni, Reid Simmons

IJCAI 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We then run a user study across two task domains. Our findings show that Evaluating interactions are more cognitively loading and less usable than the others, and Categorizing and Showing interactions are the least cognitively loading and most usable." |
| Researcher Affiliation | Academia | Carnegie Mellon University; {pkoppol, hadmoni, rsimmons}@andrew.cmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statements or links indicating that open-source code for the described methodology is available. |
| Open Datasets | Yes | "We used 20 images from Pascal VOC 2012 [Everingham et al., 2010]... Captions to be evaluated were generated by a Keras Inception V3 [Szegedy et al., 2016] model trained on ImageNet [Deng et al., 2009]." |
| Dataset Splits | No | The paper mentions using the Pascal VOC 2012 and ImageNet datasets but does not specify how they were split into training, validation, or test sets for the user study or the related model generation. |
| Hardware Specification | No | The paper does not provide details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper notes that "Captions to be evaluated were generated by a Keras Inception V3" model but does not specify version numbers for Keras, Inception V3, or any other software dependencies. |
| Experiment Setup | Yes | "We designed a mixed-design user study to find empirical differences in cognitive load and usability between interaction types. Our within-subjects independent variable, interaction type, had four levels: Showing, Categorizing, Sorting, and Evaluating. Our between-subjects independent variable, task domain, had two levels: Sequential Decision Making (henceforth SDM), and Classification. To enable comparisons between interaction types, we selected similarly complex examples from each cluster and minimized presentation differences. We also standardized the interaction interface (e.g. the number of buttons, duration of tasks, available controls) as much as possible to minimize their impact on user attitudes." |