Conducting Neuroscience to Guide the Development of AI

Authors: Jeffrey Siskind

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | First, fMRI decoding of the brain activity of subjects watching video clips yields higher accuracy than state-of-the-art computer-vision approaches to activity recognition. Second, novel methods are presented that decode aggregate representations of complex visual stimuli by decoding their independent constituents. Third, cross-modal studies demonstrate the ability to decode the brain activity induced in subjects watching video stimuli when trained on the brain activity induced in subjects seeing text or hearing speech stimuli, and vice versa. Fourth, the time course of brain processing while watching video stimuli is probed with scanning that trades off the amount of the brain scanned for the frequency at which it is scanned.
Researcher Affiliation | Academia | Jeffrey Mark Siskind, Purdue University, School of Electrical and Computer Engineering, 465 Northwestern Ave., West Lafayette, IN 47907-2035, USA. qobi@purdue.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. It references external works and their methods but does not state that the authors' own code is publicly available.
Open Datasets | Yes | For this study, we used a subset of Hollywood 2 (Marszałek, Laptev, and Schmid 2009), a dataset of movie clips with 12 classes (Answer Phone, Drive Car, Eat, Fight Person, Get Out Car, Hand Shake, Hug Person, Kiss, Run, Sit Down, Sit Up, and Stand Up) that is used within the CV community to evaluate the performance of action-recognition methods.
Dataset Splits | Yes | We performed both within-subject and cross-subject train and test, employing leave-1-run-out and leave-1-run-and-1-subject-out cross-validation.
Hardware Specification | No | Imaging used a 3T GE Signa HDx scanner (Waukesha, Wisconsin) with a GE 8-channel head coil array for the experiments that employed audio stimuli and a Nova Medical (Wilmington, Massachusetts) 16-channel head coil array for all other experiments. This describes the fMRI scanning hardware, not the computing hardware (CPU/GPU, etc.) used for data analysis or model training.
Software Dependencies | No | Standard techniques (AFNI; Cox 1996) were employed to process the fMRI data, ultimately reducing the 143,360 voxels in each scan to a 4,000-element vector for within-subject analyses and 12,000 for cross-subject. AFNI is mentioned, but no specific version number is provided for it or any other software used in the analysis.
Experiment Setup | Yes | Our corpus consisted of 169 2.5s video clips, covering 6 classes: carry, dig, hold, pick up, put down, and walk. We adopted a rapid event-related experiment design (Just et al. 2010). Each of 8 runs for 8 subjects contained 48 stimulus presentations. A single brain volume was captured for each presentation. [...] Each brain volume consisted of 64 × 64 × 35 voxels of dimension 3.125mm × 3.125mm × 3.000mm. Standard techniques (AFNI; Cox 1996) were employed to process the fMRI data, ultimately reducing the 143,360 voxels in each scan to a 4,000-element vector for within-subject analyses and 12,000 for cross-subject. Such vectors constituted samples for training and testing a linear SVM classifier that employed Linear Discriminant Dimensionality Reduction (Gu, Li, and Han 2011).
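The within-subject protocol quoted above (8 runs of 48 presentations each, leave-1-run-out cross-validation over 6 action classes) can be sketched as follows. This is an illustrative stand-in only: it uses synthetic feature vectors and a nearest-centroid classifier in place of the paper's 4,000-element voxel vectors and its linear SVM with Linear Discriminant Dimensionality Reduction, and all names below are ours, not the authors'.

```python
import random
from collections import defaultdict

def leave_one_run_out(samples):
    """Yield (held_out_run, train, test) splits, holding out one run per fold.

    `samples` is a list of (run_id, feature_vector, label) tuples, mirroring
    the within-subject protocol of training on 7 runs and testing on the 8th.
    """
    runs = sorted({run for run, _, _ in samples})
    for held_out in runs:
        train = [(x, y) for run, x, y in samples if run != held_out]
        test = [(x, y) for run, x, y in samples if run == held_out]
        yield held_out, train, test

def nearest_centroid_predict(train, x):
    """Classify x by the nearest per-class mean of the training vectors."""
    sums, counts = {}, defaultdict(int)
    for vec, label in train:
        if label not in sums:
            sums[label] = list(vec)
        else:
            sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    best_label, best_dist = None, float("inf")
    for label, s in sums.items():
        centroid = [v / counts[label] for v in s]
        dist = sum((a - b) ** 2 for a, b in zip(centroid, x))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Synthetic stand-in data: 8 runs x 6 classes x 8 presentations = 48 per run.
random.seed(0)
CLASSES = ["carry", "dig", "hold", "pick up", "put down", "walk"]
samples = []
for run in range(8):
    for i, label in enumerate(CLASSES):
        for _ in range(8):
            vec = [random.gauss(1.0 if j == i else 0.0, 0.25)
                   for j in range(len(CLASSES))]
            samples.append((run, vec, label))

accuracies = []
for held_out, train, test in leave_one_run_out(samples):
    correct = sum(nearest_centroid_predict(train, x) == y for x, y in test)
    accuracies.append(correct / len(test))
print(f"mean leave-1-run-out accuracy: {sum(accuracies) / len(accuracies):.2f}")
```

In the actual study, each feature vector would be a dimensionality-reduced brain volume and the classifier a linear SVM; the cross-subject variant additionally holds out one subject alongside the held-out run.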