Interaction Considerations in Learning from Humans

Authors: Pallavi Koppol, Henny Admoni, Reid Simmons

IJCAI 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We then run a user study across two task domains. Our findings show that Evaluating interactions are more cognitively loading and less usable than the others, and Categorizing and Showing interactions are the least cognitively loading and most usable." |
| Researcher Affiliation | Academia | Carnegie Mellon University; {pkoppol, hadmoni, rsimmons}@andrew.cmu.edu |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statements or links indicating that open-source code for the described methodology is available. |
| Open Datasets | Yes | "We used 20 images from Pascal VOC 2012 [Everingham et al., 2010]... Captions to be evaluated were generated by a Keras Inception V3 [Szegedy et al., 2016] model trained on ImageNet [Deng et al., 2009]." |
| Dataset Splits | No | The paper mentions using the Pascal VOC 2012 and ImageNet datasets but does not specify how they were split into training, validation, or test sets for the user study or the related model generation. |
| Hardware Specification | No | The paper does not provide details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper notes that "Captions to be evaluated were generated by a Keras Inception V3" model but does not specify version numbers for Keras, Inception V3, or any other software dependencies. |
| Experiment Setup | Yes | "We designed a mixed-design user study to find empirical differences in cognitive load and usability between interaction types. Our within-subjects independent variable, interaction type, had four levels: Showing, Categorizing, Sorting, and Evaluating. Our between-subjects independent variable, task domain, had two levels: Sequential Decision Making (henceforth SDM), and Classification. To enable comparisons between interaction types, we selected similarly complex examples from each cluster and minimized presentation differences. We also standardized the interaction interface (e.g. the number of buttons, duration of tasks, available controls) as much as possible to minimize their impact on user attitudes." |