SMIL: Multimodal Learning with Severely Missing Modality

Authors: Mengmeng Ma, Jian Ren, Long Zhao, Sergey Tulyakov, Cathy Wu, Xi Peng

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate our idea, we conduct a series of experiments on three popular benchmarks: MM-IMDb, CMU-MOSI, and avMNIST. The results prove the state-of-the-art performance of SMIL over existing methods and generative baselines, including autoencoders and generative adversarial networks.
Researcher Affiliation | Collaboration | Mengmeng Ma (1), Jian Ren (2), Long Zhao (3), Sergey Tulyakov (2), Cathy Wu (1), Xi Peng (1); (1) University of Delaware, (2) Snap Inc., (3) Rutgers University; {mengma, wuc, xipeng}@udel.edu, {jren, stulyakov}@snap.com, lz311@cs.rutgers.edu
Pseudocode | Yes | Algorithm 1: Bayesian Meta-Learning Framework. (A hedged sketch of an inner-/outer-loop update appears after the table.)
Open Source Code | Yes | Our code is available at https://github.com/mengmenm/SMIL
Open Datasets | Yes | The Multimodal IMDb (MM-IMDb) (Arevalo et al. 2017) ... CMU Multimodal Opinion Sentiment Intensity (CMU-MOSI) (Zadeh et al. 2016) ... Audiovision-MNIST (avMNIST) (Vielzeuf et al. 2018) ... Free Spoken Digits Dataset containing 1,500 raw audios. (https://github.com/Jakobovski/free-spoken-digit-dataset)
Dataset Splits | Yes | For the MM-IMDb dataset, we follow the training and validation splits provided in the previous work (Vielzeuf et al. 2018). ... For CMU-MOSI, there are 1,284 segments in the training set, 229 in the validation set, and 686 in the test set. ... For the avMNIST dataset, we randomly select 70% of the data for training and use the rest for validation. (See the split sketch after the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory) used to run its experiments.
Software Dependencies | No | The paper mentions optimizers (Adam) and network architectures (LSTM, LeNet-5), but does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | CMU-MOSI: We use the Adam (Kingma and Ba 2014) optimizer with a batch size of 32 and train the networks for 5,000 iterations with a learning rate of 10^-4 for both the inner loop and outer loop of meta-learning. ... MM-IMDb: We apply the Adam optimizer with a batch size of 128. We train the models for 10,000 iterations with a learning rate of 10^-4 for the inner loop and 10^-3 for the outer loop. ... For the training process, we use the Adam optimizer with a batch size of 64 and train the networks for 15,000 iterations with a learning rate of 10^-3 for both the inner and outer loop of meta-learning. (These settings are restated as a config sketch after the table.)
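
To make the quoted hyperparameters easier to scan, here is a minimal restatement as a Python config dict. The key names are our own, and labeling the third (unnamed) setup as avMNIST is an assumption by elimination, since the other two setups name CMU-MOSI and MM-IMDb.

```python
# Hedged restatement of the quoted training setups; key names are ours,
# and the third setup is assumed (not stated) to be avMNIST.
train_configs = {
    "CMU-MOSI": {"batch_size": 32,  "iterations": 5_000,
                 "inner_lr": 1e-4, "outer_lr": 1e-4},
    "MM-IMDb":  {"batch_size": 128, "iterations": 10_000,
                 "inner_lr": 1e-4, "outer_lr": 1e-3},
    "avMNIST":  {"batch_size": 64,  "iterations": 15_000,
                 "inner_lr": 1e-3, "outer_lr": 1e-3},
}
```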
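The 70/30 avMNIST split from the Dataset Splits row could be reproduced along these lines. The synthetic tensors and the fixed seed are placeholders; the paper does not state a seed or a split implementation.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder tensors standing in for paired avMNIST audio/image samples.
dataset = TensorDataset(torch.randn(1500, 16), torch.randint(0, 10, (1500,)))

# Randomly select 70% of the data for training and use the rest for
# validation, as the paper describes (seed is our assumption).
n_train = int(0.7 * len(dataset))
train_set, val_set = random_split(
    dataset,
    [n_train, len(dataset) - n_train],
    generator=torch.Generator().manual_seed(0),
)
```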
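For the Pseudocode row, a minimal sketch of one bi-level (inner-/outer-loop) meta-learning step, assuming a MAML-style functional update in PyTorch. The tiny network, synthetic support/query batches, and single inner gradient step are illustrative stand-ins; the Bayesian components of the paper's Algorithm 1 are not reproduced here. Learning rates follow the CMU-MOSI row above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.func import functional_call

inner_lr, outer_lr = 1e-4, 1e-4  # CMU-MOSI inner-/outer-loop rates

# Toy stand-in for the multimodal classifier.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
outer_opt = torch.optim.Adam(model.parameters(), lr=outer_lr)

# Synthetic support/query batches standing in for the real dataloader.
x_sup, y_sup = torch.randn(32, 16), torch.randint(0, 2, (32,))
x_qry, y_qry = torch.randn(32, 16), torch.randint(0, 2, (32,))

# Inner loop: one gradient step on the support batch, taken on a
# functional copy of the weights so the update stays differentiable.
params = dict(model.named_parameters())
sup_loss = F.cross_entropy(functional_call(model, params, (x_sup,)), y_sup)
grads = torch.autograd.grad(sup_loss, list(params.values()), create_graph=True)
adapted = {k: p - inner_lr * g for (k, p), g in zip(params.items(), grads)}

# Outer loop: evaluate the adapted weights on the query batch and
# backpropagate through the inner step to the original parameters.
qry_loss = F.cross_entropy(functional_call(model, adapted, (x_qry,)), y_qry)
outer_opt.zero_grad()
qry_loss.backward()
outer_opt.step()
```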