MuMu: Cooperative Multitask Learning-Based Guided Multimodal Fusion
Authors: Md Mofijul Islam, Tariq Iqbal (pp. 1043-1051)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated MuMu by comparing its performance to state-of-the-art multimodal HAR approaches on three activity datasets. Our extensive experimental results suggest that MuMu outperforms all the evaluated approaches across all three datasets. |
| Researcher Affiliation | Academia | Md Mofijul Islam, Tariq Iqbal, School of Engineering and Applied Science, University of Virginia, {mi8uu,tiqbal}@virginia.edu |
| Pseudocode | No | The paper describes the system architecture and components textually but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit statement of code release) for its methodology. |
| Open Datasets | Yes | We evaluated the performance of our proposed approach, MuMu, by applying it on three multimodal activity datasets: UCSD-MIT (Kubota et al. 2019), UTD-MHAD (Chen, Jafari, and Kehtarnavaz 2015) and MMAct (Kong et al. 2019). |
| Dataset Splits | Yes | For the MMAct dataset, we followed the originally proposed cross-subject and cross-session evaluation settings and reported F1-scores (Tables 1 & 2). For the UTD-MHAD and UCSD-MIT datasets, we followed leave-one-subject-out cross-validation and reported top-1 accuracies (Tables 4 & 3). A sketch of this split protocol follows the table. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments; the main text only states 'For more implementation and training procedure details, please check the supplementary materials.' |
| Software Dependencies | No | The paper mentions models like 'ResNet-50' and the 'Co-occurrence approach' but does not specify software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | We segmented the data from visual modalities (RGB and depth) with a window size of 1 and a stride of 3. For the data from other sensor modalities, we used a window size of 5 and a stride of 5. The unimodal feature of each modality is encoded to a 128-dimensional feature embedding. We used two fully connected layers with ReLU activation after the first layer for activity-group classification in auxiliary task learning. We used a similar task-learning architecture for activity classification in target task learning. A minimal sketch of these heads follows the table. |
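The leave-one-subject-out protocol from the 'Dataset Splits' row can be summarized in a short sketch. This is a generic illustration rather than the authors' evaluation code; the sample layout (a list of dicts with a `subject` key) is an assumption.

```python
# Minimal leave-one-subject-out cross-validation sketch.
# Assumed data layout: each sample is a dict with a 'subject' key (not the authors' code).
from collections import defaultdict

def leave_one_subject_out_splits(samples):
    """Yield (held_out_subject, train_indices, test_indices) for each fold."""
    by_subject = defaultdict(list)
    for idx, sample in enumerate(samples):
        by_subject[sample["subject"]].append(idx)

    for held_out, test_idx in by_subject.items():
        train_idx = [i for subj, idxs in by_subject.items()
                     if subj != held_out for i in idxs]
        yield held_out, train_idx, test_idx

# Usage: train on train_idx, evaluate top-1 accuracy on test_idx for each fold,
# then average the per-fold accuracies, as reported for UTD-MHAD and UCSD-MIT.
```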
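The 'Experiment Setup' row describes 128-dimensional unimodal embeddings feeding two-layer classification heads (ReLU after the first layer) for the auxiliary activity-group task and the target activity task. The PyTorch sketch below illustrates that head structure under stated assumptions: the placeholder encoders, concatenation fusion, hidden width, and class counts are illustrative, and the paper's guided multimodal fusion mechanism itself is not shown.

```python
# Sketch of the two-task classification heads described in the paper,
# assuming simple concatenation fusion of 128-d unimodal embeddings.
import torch
import torch.nn as nn

class TwoLayerHead(nn.Module):
    """Two fully connected layers with a ReLU after the first, per the paper."""
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        return self.net(x)

class MultitaskHAR(nn.Module):
    def __init__(self, num_modalities, embed_dim=128,
                 num_activity_groups=5, num_activities=27):  # class counts are assumptions
        super().__init__()
        # Placeholder per-modality encoders; real encoders depend on the modality.
        self.encoders = nn.ModuleList(
            nn.LazyLinear(embed_dim) for _ in range(num_modalities)
        )
        fused_dim = embed_dim * num_modalities  # concatenation fusion (assumption)
        self.aux_head = TwoLayerHead(fused_dim, embed_dim, num_activity_groups)
        self.target_head = TwoLayerHead(fused_dim, embed_dim, num_activities)

    def forward(self, modality_inputs):
        embeddings = [enc(x) for enc, x in zip(self.encoders, modality_inputs)]
        fused = torch.cat(embeddings, dim=-1)
        # Auxiliary (activity-group) and target (activity) predictions.
        return self.aux_head(fused), self.target_head(fused)
```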