Learning Compositional Sparse Models of Bimodal Percepts

Authors: Suren Kumar, Vikas Dhiman, Jason Corso

AAAI 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To test our model, we have acquired a new bimodal dataset comprising images and spoken utterances of colored shapes in a tabletop setup. Our experiments demonstrate the benefits of explicitly leveraging compositionality in both quantitative and human evaluation studies. We perform rigorous qualitative and quantitative evaluation to test generalization and reproduction abilities of the paired sparse and compositional sparse models.
Researcher Affiliation | Academia | Suren Kumar, Vikas Dhiman, and Jason J. Corso; Computer Science and Engineering, State University of New York at Buffalo, NY
Pseudocode | No | The paper describes the models and optimization problems mathematically (Eqs. 1-7) but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper mentions using 'the open source sparse coding package SPAMS (Mairal et al. 2010)' but does not provide a link or any statement about releasing the source code for its own methodology.
Open Datasets | No | The paper states, 'We acquired a new dataset of shapes and colors with 156 different examples (Table 2)', but does not provide concrete access information (e.g., a link, DOI, repository name, or formal citation) indicating that this dataset is publicly available.
Dataset Splits | Yes | We perform a 3-fold cross-validation study to assess this retrieval performance by dividing the dataset into 3 parts, using 2 parts for training and the remaining part for testing (and then permuting the sets). (A minimal fold-splitting sketch follows the table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions 'the open source sparse coding package SPAMS (Mairal et al. 2010)' but does not specify its version number or any other software dependencies with version information.
Experiment Setup | Yes | We extract 260-dimensional audio features from 20 selected audio frames, 20 Fourier harmonics, a 3-dimensional color feature, and fix λ = 0.15 for all of the experiments. (A hedged sparse-coding sketch using these settings follows the table.)
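
The Experiment Setup row above fixes the sparsity weight λ = 0.15 over a joint audio-shape-color feature (260 + 20 + 3 = 283 dimensions). Below is a minimal sketch of ℓ1-regularized dictionary learning on such features, using scikit-learn's DictionaryLearning as a stand-in for the SPAMS package the paper cites; the feature matrix is a random placeholder and the number of dictionary atoms is an assumption, not a value reported in the paper.

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Joint feature dimensionality implied by the quoted setup:
# 260 audio dims + 20 Fourier harmonics + 3 color dims = 283.
n_examples, n_dims = 156, 283
rng = np.random.default_rng(0)
X = rng.standard_normal((n_examples, n_dims))  # placeholder for the real bimodal features

# alpha = 0.15 mirrors the lambda quoted from the paper; n_components
# (the number of dictionary atoms) is an assumption, not taken from the paper.
learner = DictionaryLearning(
    n_components=64,
    alpha=0.15,
    transform_algorithm="lasso_lars",
    transform_alpha=0.15,
    random_state=0,
)
codes = learner.fit_transform(X)   # sparse codes, shape (156, 64)
atoms = learner.components_        # learned dictionary, shape (64, 283)
print(f"mean nonzeros per example: {np.count_nonzero(codes, axis=1).mean():.1f}")
```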
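
The 3-fold cross-validation protocol from the Dataset Splits row (2 parts for training, 1 for testing, then permuting the roles) can be reproduced with a standard splitter. A minimal sketch follows, assuming the 156 examples are simply indexed; the training and retrieval-evaluation steps are hypothetical placeholders, not functions from the paper.

```python
import numpy as np
from sklearn.model_selection import KFold

# 3-fold split over the 156 bimodal examples: each fold uses ~2/3 for
# training and the remaining ~1/3 for testing, permuted across folds.
indices = np.arange(156)
kfold = KFold(n_splits=3, shuffle=True, random_state=0)

for fold, (train_idx, test_idx) in enumerate(kfold.split(indices)):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")
    # Hypothetical placeholders for the paper's pipeline:
    # model = fit_compositional_sparse_model(features[train_idx])
    # retrieval_score = evaluate_retrieval(model, features[test_idx])
```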