Deep Mutual Information Maximin for Cross-Modal Clustering

Authors: Yiqiao Mao, Xiaoqiang Yan, Qiang Guo, Yangdong Ye (pp. 8893-8901)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results demonstrate the superiority of DMIM method over the state-of-the-art cross-modal clustering methods on IAPR-TC12, ESP-Game, MIRFlickr and NUS-Wide datasets.
Researcher Affiliation | Academia | School of Information Engineering, Zhengzhou University, Zhengzhou, China; ieyqmao@gs.zzu.edu.cn, iexqyan@zzu.edu.cn, ieqguo@gs.zzu.edu.cn, ieydye@zzu.edu.cn
Pseudocode | Yes | Algorithm 1: The DMIM Algorithm
Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | The datasets used in our experiments include: 1) IAPR-TC12 (Michael Grubinger 2006): ... 2) ESP-Game (von Ahn and Dabbish 2004): ... 3) MIRFlickr (Huiskes and Lew 2008): ... 4) NUS-Wide (Chua et al. 2009):
Dataset Splits | No | The paper mentions using the IAPR-TC12, ESP-Game, MIRFlickr, and NUS-Wide datasets and their total sizes, but it does not specify how these datasets were split into training, validation, or test sets for the experiments.
Hardware Specification | Yes | We conduct all the experiments on the platform of Windows 10 with NVIDIA 1060 Graphics Processing Units (GPUs) and 32G memory size.
Software Dependencies | Yes | We implement our proposed DMIM method and deep clustering baselines with the public toolbox of PyTorch. Other traditional comparison baselines are conducted on MATLAB 2016a.
Experiment Setup | Yes | In the proposed DMIM method, the multi-modal shared encoder is composed of two fully connected layers. Each fully connected layer is followed by a BatchNorm layer and a ReLU layer... The clustering layer and overclustering layer also adopt fully connected layers, in which the numbers of hidden nodes are set as |Y| and 10|Y|, respectively. At the beginning of the training process, all parameters in the model are initialized randomly. ... We use the Adam (Kingma and Ba 2015) optimizer with a learning rate of 5 × 10^-5.
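The quoted setup maps onto a small PyTorch module fairly directly. The sketch below is only illustrative: the input dimension, hidden width, and cluster count (in_dim, hidden_dim, num_clusters) are placeholder assumptions not reported in the quoted passage; what does come from the paper is the layer layout (two fully connected layers, each followed by BatchNorm and ReLU, a clustering head with |Y| units, and an overclustering head with 10|Y| units) and the Adam learning rate of 5 × 10^-5.

```python
import torch
import torch.nn as nn

class DMIMSketch(nn.Module):
    """Minimal sketch of the reported layer layout; dimensions are assumptions."""

    def __init__(self, in_dim=2048, hidden_dim=512, num_clusters=10):
        super().__init__()
        # Multi-modal shared encoder: two fully connected layers,
        # each followed by BatchNorm and ReLU (as described in the paper).
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(),
        )
        # Clustering head with |Y| hidden nodes and overclustering head
        # with 10*|Y| hidden nodes, both fully connected.
        self.clustering = nn.Linear(hidden_dim, num_clusters)
        self.overclustering = nn.Linear(hidden_dim, 10 * num_clusters)

    def forward(self, x):
        z = self.encoder(x)
        return self.clustering(z), self.overclustering(z)

model = DMIMSketch()
# Adam optimizer with the reported learning rate of 5e-5.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
```

This is a structural sketch only; the mutual information maximin objective and the training loop of Algorithm 1 are not reproduced here, since the report does not quote them.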