Learning Multimodal Data Augmentation in Feature Space
Authors: Zichang Liu, Zhiqiang Tang, Xingjian Shi, Aston Zhang, Mu Li, Anshumali Shrivastava, Andrew Gordon Wilson
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS We evaluate LeMDA over a diverse set of real-world multimodal datasets. We curate a list of public datasets covering image, text, numerical, and categorical inputs. Table 1 provides a summary of the source, statistics, and modality identity. We introduce baselines in Section 4.1, and describe experimental settings in Section 4.2. We provide the main evaluation result in Section 4.3. Finally, we investigate the effects of the consistency regularizer and the choices of augmentation model architecture in Section 4.4. |
| Researcher Affiliation | Collaboration | Zichang Liu (Department of Computer Science, Rice University, zl71@rice.edu); Zhiqiang Tang (Amazon Web Services, zqtang@amazon.com); Xingjian Shi (Amazon Web Services, xjshi@amazon.com); Aston Zhang (Amazon Web Services, astonz@amazon.com); Mu Li (Amazon Web Services, mli@amazon.com); Anshumali Shrivastava (Department of Computer Science, Rice University, anshumali@rice.edu); Andrew Gordon Wilson (New York University; Amazon Web Services, andrewgw@cims.nyu.edu) |
| Pseudocode | Yes | Algorithm 1 LeMDA Training |
| Open Source Code | Yes | Code is available at https://github.com/lzcemma/LeMDA/ |
| Open Datasets | Yes | We evaluate LeMDA over a diverse set of real-world multimodal datasets. We curate a list of public datasets covering image, text, numerical, and categorical inputs. Table 1 provides a summary of the source, statistics, and modality identity. Examples include SNLI-VE (Xie et al., 2019a). |
| Dataset Splits | No | Table 1 provides train and test set sizes (e.g., Hateful Memes: 7134 Train, 1784 Test) but does not explicitly state a validation split or its size. |
| Hardware Specification | Yes | Experiments were conducted on a server with 8 V100 GPUs. |
| Software Dependencies | No | The paper mentions using 'pyTorch Autograd' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For Le MDA, we set the confidence threshold for consistency regularizer α as 0.5... In our main experiment, we use w1 = 0.0001, w2 = 0.1, w3 = 0.1 on all datasets except Melbourne Airbnb and SNLI-VE. On Melbourne Airbnb and SNLI-VE, we use w1 = 0.001, w2 = 0.1, w3 = 0.1. |
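The reported setup combines a confidence threshold α = 0.5 for the consistency regularizer with per-dataset loss weights w1, w2, w3. As a minimal sketch of how such a confidence-thresholded consistency term could look, the snippet below masks a KL divergence between predictions on original and augmented features by the threshold α. All function and variable names here are illustrative, and the exact loss form used by LeMDA may differ from this sketch.

```python
import numpy as np

# Hyperparameters as reported in the paper's setup (names are illustrative):
ALPHA = 0.5                    # confidence threshold for the consistency regularizer
W1, W2, W3 = 1e-4, 0.1, 0.1    # loss weights (w1 = 1e-3 on Melbourne Airbnb / SNLI-VE)

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def consistency_regularizer(logits_orig, logits_aug, alpha=ALPHA):
    """Sketch: KL(p_orig || p_aug), averaged only over samples where the
    prediction on the original features is confident (max prob >= alpha)."""
    p, q = softmax(logits_orig), softmax(logits_aug)
    confident = p.max(axis=-1) >= alpha              # boolean mask per sample
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float((kl * confident).sum() / max(confident.sum(), 1))
```

The threshold keeps the regularizer from pulling augmented predictions toward low-confidence (likely noisy) original predictions; the `max(..., 1)` guard avoids division by zero when no sample clears the threshold.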