Predicting Latent Narrative Mood Using Audio and Physiologic Data

Authors: Tuka AlHanai, Mohammad Ghassemi

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this study we utilized a combination of auditory, text, and physiological signals to predict the mood (happy or sad) of 31 narrations from subjects engaged in personal story-telling. We extracted 386 audio and 222 physiological features (using the Samsung Simband) from the data. A subset of 4 audio, 1 text, and 5 physiologic features were identified using Sequential Forward Selection (SFS) for inclusion in a Neural Network (NN). ... We evaluated our model's performance using leave-one-subject-out cross-validation and compared the performance to 20 baseline models and a NN with all features included in the input layer. (A leave-one-subject-out cross-validation sketch is given after the table.)
Researcher Affiliation | Academia | Tuka AlHanai and Mohammad Mahdi Ghassemi, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; tuka@mit.edu, ghassemi@mit.edu
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | The paper states 'We present a novel multi-modal dataset containing audio, physiologic, and text transcriptions from 31 narrative conversations.' but does not provide access information (link, DOI, or citation for public access).
Dataset Splits | Yes | To ensure the robustness of the identified features, the forward selection algorithm was performed on ten folds of our dataset (90% training, 10% validation) and a feature was marked for inclusion in our models only if it was selected in 5 or more of the folds. (A fold-stability selection sketch is given after the table.)
Hardware Specification | No | The paper mentions data collection devices (Samsung Simband, Apple iPhone 5S) but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments or training the models.
Software Dependencies | No | The paper mentions software like the openSMILE Toolkit and the SentiWordNet Lexicon but does not specify their version numbers.
Experiment Setup | Yes | We optimized both the network topology and the location of our selected features within the topology. More specifically, we trained all possible configurations of a NN with a number of hidden layers between 0 and 2 (where 0 hidden layers corresponds to a logistic regression). ... The optimal topology was a two-hidden-layer network with six nodes in the first layer and three nodes in the second layer. (A topology search sketch is given after the table.)
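
The evaluation protocol quoted under Research Type (leave-one-subject-out cross-validation over 31 narrations, with mood as a binary happy/sad label) can be illustrated with a minimal sketch. The feature matrix, labels, subject IDs, and the scikit-learn classifier below are placeholders for illustration, not the authors' released code.

```python
# Minimal leave-one-subject-out cross-validation sketch (illustrative only).
# X, y, and subjects are placeholders; the paper's features and trained model
# are not publicly released.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
n_narrations, n_features = 31, 10            # 31 narrations, selected feature subset
X = rng.normal(size=(n_narrations, n_features))
y = rng.integers(0, 2, size=n_narrations)    # 1 = happy, 0 = sad (placeholder labels)
subjects = np.arange(n_narrations)           # assumes one narration per held-out subject

accuracies = []
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=subjects):
    clf = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    accuracies.append(clf.score(X[test_idx], y[test_idx]))

print(f"LOSO accuracy: {np.mean(accuracies):.2f}")
```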
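The Dataset Splits row describes running forward selection over ten 90%/10% folds and keeping only features chosen in at least five folds. The sketch below shows that stability criterion; the greedy selector, the scoring estimator, the feature cap, and the synthetic data are assumptions, while the ten splits and the "selected in 5 or more folds" rule follow the quoted description.

```python
# Fold-stability feature selection sketch: run sequential forward selection on
# ten 90/10 splits and keep features selected in >= 5 folds (per the paper's
# description). Data, estimator, and the n_keep cap are placeholders.
import numpy as np
from collections import Counter
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit

def forward_select(X_tr, y_tr, X_val, y_val, n_keep=10):
    """Greedy SFS: repeatedly add the feature that most improves validation accuracy."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    while remaining and len(selected) < n_keep:
        scores = []
        for f in remaining:
            cols = selected + [f]
            model = LogisticRegression(max_iter=1000).fit(X_tr[:, cols], y_tr)
            scores.append((model.score(X_val[:, cols], y_val), f))
        _, best_f = max(scores)
        selected.append(best_f)
        remaining.remove(best_f)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(31, 50))                 # placeholder feature matrix
y = rng.integers(0, 2, size=31)

counts = Counter()
splitter = ShuffleSplit(n_splits=10, test_size=0.10, random_state=0)
for train_idx, val_idx in splitter.split(X):
    counts.update(forward_select(X[train_idx], y[train_idx], X[val_idx], y[val_idx]))

stable_features = [f for f, c in counts.items() if c >= 5]
print("Features selected in >= 5 of 10 folds:", stable_features)
```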
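The Experiment Setup row reports a search over networks with zero to two hidden layers (zero corresponding to logistic regression), with the best topology having six and three hidden nodes. The grid search below is a minimal sketch of such a search using scikit-learn's MLPClassifier; the candidate layer widths, optimizer settings, and data are assumptions, since the paper does not publish its training code.

```python
# Topology search sketch: compare 0, 1, and 2 hidden-layer models under
# leave-one-subject-out cross-validation. Candidate widths are assumptions;
# the paper reports the optimum as two hidden layers with 6 and 3 nodes.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(31, 10))                 # placeholder: 10 selected features
y = rng.integers(0, 2, size=31)
subjects = np.arange(31)

candidates = {"0 hidden layers": LogisticRegression(max_iter=1000)}
for n1 in (3, 6, 9):                          # assumed grid of layer widths
    candidates[f"1 hidden layer ({n1})"] = MLPClassifier(
        hidden_layer_sizes=(n1,), max_iter=2000, random_state=0)
    for n2 in (3, 6):
        candidates[f"2 hidden layers ({n1}, {n2})"] = MLPClassifier(
            hidden_layer_sizes=(n1, n2), max_iter=2000, random_state=0)

cv = LeaveOneGroupOut()
results = {
    name: cross_val_score(model, X, y, cv=cv, groups=subjects).mean()
    for name, model in candidates.items()
}
best = max(results, key=results.get)
print("Best topology:", best, f"(accuracy {results[best]:.2f})")
```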