MuMu: Cooperative Multitask Learning-Based Guided Multimodal Fusion

Authors: Md Mofijul Islam, Tariq Iqbal (pp. 1043-1051)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated MuMu by comparing its performance to state-of-the-art multimodal HAR approaches on three activity datasets. Our extensive experimental results suggest that MuMu outperforms all the evaluated approaches across all three datasets.
Researcher Affiliation | Academia | Md Mofijul Islam, Tariq Iqbal, School of Engineering and Applied Science, University of Virginia, {mi8uu,tiqbal}@virginia.edu
Pseudocode | No | The paper describes the system architecture and components textually but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit statement of code release) for its methodology.
Open Datasets | Yes | We evaluated the performance of our proposed approach, MuMu, by applying it on three multimodal activity datasets: UCSD-MIT (Kubota et al. 2019), UTD-MHAD (Chen, Jafari, and Kehtarnavaz 2015) and MMAct (Kong et al. 2019).
Dataset Splits | Yes | For MMAct dataset, we followed originally proposed cross-subject and cross-session evaluation settings and reported F1-scores (Tables 1 & 2). For UTD-MHAD and UCSD-MIT datasets, we followed leave-one-subject-out cross-validation and reported top-1 accuracies (Tables 4 & 3).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments. It only states 'For more implementation and training procedure details, please check the supplementary materials.' without specifics in the main text.
Software Dependencies | No | The paper mentions models like 'ResNet-50' and the 'Co-occurrence approach' but does not specify software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1).
Experiment Setup | Yes | We segmented the data from visual modalities (RGB and depth) with a window size of 1 and a stride of 3. For the data from other sensor modalities, we used a window size of 5 and a stride of 5. The unimodal feature of each modality is encoded to a 128-sized feature embedding. We used two fully connected layers with ReLU activation after the first layer for activity-group classification in auxiliary task learning. We used similar task learning architecture for the activity classification in target task learning.
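
The Dataset Splits row above describes standard evaluation protocols. As a minimal illustrative sketch (not the authors' code), the leave-one-subject-out cross-validation used for UTD-MHAD and UCSD-MIT can be expressed with scikit-learn's LeaveOneGroupOut; the features, labels, subject ids, and linear classifier below are hypothetical placeholders standing in for the actual multimodal pipeline.

```python
# Sketch of leave-one-subject-out (LOSO) cross-validation as described above.
# Not the authors' code: random placeholder features and a linear classifier
# stand in for the MuMu model and real sensor data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
features = rng.normal(size=(160, 128))      # placeholder 128-d embeddings
labels = rng.integers(0, 27, size=160)      # placeholder activity labels
subject_ids = rng.integers(0, 8, size=160)  # subject id per sample (8 subjects assumed)

logo = LeaveOneGroupOut()
top1_scores = []
for train_idx, test_idx in logo.split(features, labels, groups=subject_ids):
    # Train on all subjects except one, test on the held-out subject.
    clf = LogisticRegression(max_iter=1000).fit(features[train_idx], labels[train_idx])
    top1_scores.append(clf.score(features[test_idx], labels[test_idx]))

# Report mean top-1 accuracy across held-out subjects.
print(f"LOSO top-1 accuracy: {np.mean(top1_scores):.3f}")
```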
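
Likewise, the Experiment Setup row can be pictured with a rough PyTorch sketch (again, not the authors' implementation): sliding-window segmentation with the quoted window/stride values, and two-layer fully connected heads with ReLU after the first layer for the auxiliary and target tasks. The hidden width, class counts, and input shapes below are assumptions, not values reported in the paper.

```python
# Sketch of the described setup. Window/stride values follow the quoted text;
# hidden width, class counts, and input shapes are hypothetical placeholders.
import torch
import torch.nn as nn

def segment(signal, window, stride):
    """Slice a (time, channels) tensor into windows of shape (num_windows, channels, window)."""
    return signal.unfold(0, window, stride)

class TwoLayerHead(nn.Module):
    """Two fully connected layers with ReLU after the first, as described in the setup."""
    def __init__(self, in_dim=128, hidden_dim=128, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_classes)

    def forward(self, x):
        return self.fc2(torch.relu(self.fc1(x)))

# Segmentation: visual modalities use window 1 / stride 3; other sensors use window 5 / stride 5.
rgb_frames = torch.randn(90, 3)   # placeholder (time, channels) stream
imu_signal = torch.randn(90, 6)
rgb_windows = segment(rgb_frames, window=1, stride=3)
imu_windows = segment(imu_signal, window=5, stride=5)

# Each modality is encoded to a 128-d embedding (encoders elided), then fed to
# the auxiliary (activity-group) head and the target (activity) head.
embedding = torch.randn(4, 128)            # placeholder unimodal/fused embedding
aux_head = TwoLayerHead(num_classes=5)     # hypothetical number of activity groups
target_head = TwoLayerHead(num_classes=27) # hypothetical number of activities
group_logits = aux_head(embedding)
activity_logits = target_head(embedding)
```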