Learning Multimodal Word Representation via Dynamic Fusion Methods
Authors: Shaonan Wang, Jiajun Zhang, Chengqing Zong
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The extensive experiments have demonstrated that the proposed methods outperform strong unimodal baselines and state-of-the-art multimodal models. |
| Researcher Affiliation | Academia | National Laboratory of Pattern Recognition, CASIA, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; CAS Center for Excellence in Brain Science and Intelligence Technology, Shanghai, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The data and code for training and evaluation can be found at: https://github.com/wangshaonan/dynamicFusion |
| Open Datasets | Yes | Datasets: We use 300-dimensional GloVe vectors trained on the Common Crawl corpus... Our visual vectors are collected from ImageNet (Russakovsky et al. 2015)... The training dataset is selected from about 20,000 word association pairs... The dataset is collected by (De Deyne, Perfors, and Navarro 2016) and can be found at: https://simondedeyne.me/data. |
| Dataset Splits | Yes | Model hyper-parameters are tuned by 5-fold cross validation (20% of the data for testing and 80% for training)... We use the remaining word association pairs as the development dataset (word pairs together with their association scores). A split sketch follows this table. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software like Theano, Lasagne, Adagrad, and sklearn, but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | We test the initial learning rate over {0.05, 0.01, 0.5, 0.1}, set the batch size to 25, and train the model for 5 epochs. We set the initial parameters in the three gates to 1.0 and select the best parameters on the development set... In the Ridge model, the optimal regularization parameter is 0.6. The Mapping model is trained with SGD for a maximum of 100 epochs with early stopping, and the optimal learning rate is 0.001. A configuration sketch follows this table. |
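
To make the reported split concrete, here is a minimal sketch of the 5-fold cross-validation setup (80% of pairs for training, 20% for testing per fold) using scikit-learn. The `pairs` and `scores` arrays are placeholders, not the authors' actual word-association data loading.

```python
import numpy as np
from sklearn.model_selection import KFold

# Placeholder word-association data: indices stand in for word pairs,
# random numbers stand in for their association scores.
pairs = np.arange(1000)
scores = np.random.rand(1000)

# 5-fold cross validation: each fold uses 80% of the data for training
# and holds out the remaining 20% for testing, as reported in the paper.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(pairs)):
    train_pairs, train_scores = pairs[train_idx], scores[train_idx]
    test_pairs, test_scores = pairs[test_idx], scores[test_idx]
    # ... tune hyper-parameters on the training fold, evaluate on the test fold ...
```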
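
Likewise, a hedged sketch of the reported experiment configuration. `dev_score` is a hypothetical stand-in for the authors' Adagrad training loop over the fusion model, and `SGDRegressor` only approximates the SGD-trained Mapping model; the numeric values are the ones quoted in the table above.

```python
from sklearn.linear_model import Ridge, SGDRegressor

# Hyper-parameters quoted from the paper.
BATCH_SIZE = 25                            # mini-batch size
NUM_EPOCHS = 5                             # training epochs for the fusion model
GATE_INIT = 1.0                            # initial value of the three gates
LEARNING_RATES = [0.05, 0.01, 0.5, 0.1]    # candidate initial learning rates

def dev_score(lr):
    """Hypothetical helper: would train the fusion model with Adagrad at
    learning rate `lr` and return its score on the development set."""
    return 0.0  # placeholder value only

# Select the initial learning rate that scores best on the development set.
best_lr = max(LEARNING_RATES, key=dev_score)

# Ridge baseline with the reported optimal regularization parameter.
ridge = Ridge(alpha=0.6)

# Stand-in for the Mapping model: SGD, at most 100 epochs, early stopping,
# learning rate 0.001 (the actual model maps one embedding space to another).
mapping = SGDRegressor(learning_rate="constant", eta0=0.001,
                       max_iter=100, early_stopping=True)
```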