Predicting Confusion in Information Visualization from Eye Tracking and Interaction Data

Authors: Sébastien Lallé, Cristina Conati, Giuseppe Carenini

IJCAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The data was collected during a user study with Value Chart, an interactive visualization to support preferential choices. We report very promising results based on Random Forest classifiers. In this paper, we investigate machine learning models to predict occurrences of confusion during the interaction with Value Chart, an interactive visualization to support multicriteria preferential choice.
Researcher Affiliation	Academia	Sébastien Lallé, Cristina Conati, Giuseppe Carenini The University of British Columbia, Vancouver B.C., Canada
Pseudocode	No	The paper describes the use of Random Forest classifiers but does not include any pseudocode or algorithm blocks.
Open Source Code	No	Data available at http://www.cs.ubc.ca/~lalles/IJCAI16.html
Open Datasets	Yes	The dataset used in this paper was collected from a user study using Value Chart, an interactive visualization for preferential choice [Conati et al. 2014]. (Footnote 1: Data available at http://www.cs.ubc.ca/~lalles/IJCAI16.html)
Dataset Splits	Yes	These classifiers are trained and evaluated with a process of 20-runs-10-folds nested cross-validation, which includes two levels (inner and outer) of cross-validation.
Hardware Specification	No	While performing the tasks, the user's gaze was tracked with a Tobii T120, a non-intrusive eye-tracker embedded in the study computer monitor.
Software Dependencies	No	To build our classifiers, we use Random Forest tuned with 100 trees using the Caret package in R [Kuhn 2008].
Experiment Setup	Yes	To build our classifiers, we use Random Forest tuned with 100 trees using the Caret package in R [Kuhn 2008]. Specifically, using SMOTE we generated synthetic confusion trials based on k nearest neighbors in the minority class (we used the default value k=5). In our study, we over-sampled confusion trials by 200% (i.e., number of confusion trials is doubled) and 500%.
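
The SMOTE step quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' code: the paper reports using R with the Caret package, whereas the example below is plain Python written only to show the idea of interpolating between a minority-class sample and one of its k nearest minority-class neighbors (k=5, as in the quoted default). The function name `smote`, the parameters `n_synthetic` and `seed`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic new points interpolated between minority samples.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other samples
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest neighbors of the base sample, searched within the minority class only
        others = [p for j, p in enumerate(minority) if j != i]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # place the synthetic point at a random position on the segment base -> neighbor
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Toy minority class (made-up feature vectors, standing in for confusion trials)
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5], [2.0, 2.0]]
# Generate as many synthetic samples as there are real ones
synthetic = smote(minority, n_synthetic=len(minority), k=5)
print(len(minority) + len(synthetic))
```

In a cross-validated setup, synthetic samples of this kind are usually generated from the training folds only, so that no test trial influences the oversampled data; whether the paper did so is not stated in the excerpt above.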