Deep Mutual Information Maximin for Cross-Modal Clustering
Authors: Yiqiao Mao, Xiaoqiang Yan, Qiang Guo, Yangdong Ye
AAAI 2021, pp. 8893-8901
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results demonstrate the superiority of the DMIM method over the state-of-the-art cross-modal clustering methods on the IAPR-TC12, ESP-Game, MIRFlickr and NUS-Wide datasets. |
| Researcher Affiliation | Academia | School of Information Engineering, Zhengzhou University, Zhengzhou, China; ieyqmao@gs.zzu.edu.cn, iexqyan@zzu.edu.cn, ieqguo@gs.zzu.edu.cn, ieydye@zzu.edu.cn |
| Pseudocode | Yes | Algorithm 1 The DMIM Algorithm |
| Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | The datasets used in our experiments include: 1) IAPR-TC12 (Michael Grubinger 2006): ... 2) ESP-Game (von Ahn and Dabbish 2004): ... 3) MIRFlickr (Huiskes and Lew 2008): ... 4) NUS-Wide (Chua et al. 2009): |
| Dataset Splits | No | The paper mentions using IAPR-TC12, ESP-Game, MIRFlickr, and NUS-Wide datasets and their total sizes, but it does not specify how these datasets were split into training, validation, or test sets for experiments. |
| Hardware Specification | Yes | We conduct all the experiments on the platform of Windows 10 with NVIDIA 1060 Graphics Processing Units (GPUs) and 32 GB of memory. |
| Software Dependencies | Yes | We implement our proposed DMIM method and deep clustering baselines with the public toolbox of PyTorch. Other traditional comparison baselines are conducted in MATLAB 2016a. |
| Experiment Setup | Yes | In the proposed DMIM method, the multi-modal shared encoder is composed of two fully connected layers. Each fully connected layer is followed by a BatchNorm layer and a ReLU layer... The clustering layer and overclustering layer also adopt a fully connected layer, in which the numbers of hidden nodes are set as |Y| and 10|Y|, respectively. At the beginning of the training process, all parameters in the model are initialized randomly. ... We use the Adam (Kingma and Ba 2015) optimizer with learning rate 5 × 10^-5. |
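
Since the paper reports no public code, the setup described above can be approximated directly. Below is a minimal PyTorch sketch of the reported architecture: a two-layer fully connected shared encoder (each layer followed by BatchNorm and ReLU), a clustering head with |Y| output nodes, an overclustering head with 10|Y| output nodes, random initialization, and Adam with learning rate 5 × 10^-5. All layer dimensions (`input_dim`, `hidden_dim`, `feat_dim`) and the cluster count are illustrative assumptions; the paper does not specify them in the quoted excerpt.

```python
# Sketch of the DMIM experiment setup as described in the paper's text.
# input_dim, hidden_dim, feat_dim, and num_clusters are assumptions, not
# values taken from the paper.
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    """Multi-modal shared encoder: two fully connected layers, each
    followed by a BatchNorm layer and a ReLU layer."""
    def __init__(self, input_dim, hidden_dim, feat_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
            nn.BatchNorm1d(feat_dim),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class DMIMHeads(nn.Module):
    """Clustering head with |Y| hidden nodes and overclustering head
    with 10|Y| hidden nodes, both single fully connected layers."""
    def __init__(self, feat_dim, num_clusters):
        super().__init__()
        self.cluster = nn.Linear(feat_dim, num_clusters)
        self.overcluster = nn.Linear(feat_dim, 10 * num_clusters)

    def forward(self, z):
        return self.cluster(z), self.overcluster(z)

# Parameters are initialized randomly (PyTorch's default initialization),
# and optimized with Adam at learning rate 5e-5, as reported in the paper.
encoder = SharedEncoder(input_dim=512, hidden_dim=1024, feat_dim=256)  # assumed dims
heads = DMIMHeads(feat_dim=256, num_clusters=20)  # |Y| = 20 is an assumption
params = list(encoder.parameters()) + list(heads.parameters())
optimizer = torch.optim.Adam(params, lr=5e-5)
```

This sketch covers only the components the excerpt describes; the mutual information maximin objective and the training loop from Algorithm 1 are not reproduced here, since the paper's text quoted above does not specify them.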