Dual Mutual Information Constraints for Discriminative Clustering
Authors: Hongyu Li, Lefei Zhang, Kehua Su
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five benchmark datasets show that our proposed approach outperforms most other clustering algorithms. |
| Researcher Affiliation | Academia | ¹School of Computer Science, Wuhan University, Wuhan, 430072, P. R. China; ²Hubei Luojia Laboratory, Wuhan 430072, P. R. China. {hongyuli, zhanglefei, skh}@whu.edu.cn |
| Pseudocode | Yes | Algorithm 1: DMICC |
| Open Source Code | Yes | The code is available at https://github.com/Li-Hyn/DMICC. |
| Open Datasets | Yes | We evaluate the performance of our DMICC approach on five publicly available datasets, including CIFAR-10/100 (Krizhevsky, Hinton et al. 2009), STL-10 (Coates, Ng, and Lee 2011), ImageNet-10 (Deng et al. 2009), and ImageNet-Dogs (Deng et al. 2009). The number of images, clusters, and image size are presented in Table 1. |
| Dataset Splits | No | The paper lists the datasets used and mentions the total number of epochs and batch size, but it does not specify the train/validation/test splits, such as percentages or specific sample counts for each split. It relies on standard dataset splits without explicitly detailing them. |
| Hardware Specification | Yes | All experiments are conducted on an NVIDIA RTX 2080Ti GPU. |
| Software Dependencies | No | The paper mentions using a 'ResNet' structure and a 'stochastic gradient descent optimizer' but does not specify software dependencies with version numbers (e.g., Python version, specific deep learning framework like PyTorch or TensorFlow with their versions, CUDA version). |
| Experiment Setup | Yes | We set the dimensionality of the latent feature vector d to 128 and the temperature coefficient τ in the instance discrimination to 2. We use a stochastic gradient descent optimizer with momentum β = 0.9. The learning rate is initialized to 0.05 and then gradually decreased after the first 600 epochs by a factor of 0.5 per 350 epochs. A weight decay of 5e-4 is used for all datasets. The total number of epochs is set to 5000 and the batch size to 128. For some large datasets, we extend the number of epochs to 7000 to ensure convergence. The balance parameters λ1 and λ2 are set as follows: in CIFAR-10/100, λ1 = 1e-2 and λ2 = 1e-4; in STL-10, λ1 = 1e-5 and λ2 = 1e-6; in ImageNet-10/Dogs, λ1 = 1e-5 and λ2 = 1e-7, respectively. |
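
The optimizer and learning-rate schedule quoted in the Experiment Setup row translate directly into a standard training configuration. Below is a minimal PyTorch sketch, assuming a generic placeholder `model` in place of the ResNet backbone; the schedule holds the initial rate of 0.05 constant for the first 600 epochs and then multiplies it by 0.5 every 350 epochs, matching the quoted description. The helper name `dmicc_lr_factor` is ours, not the authors'.

```python
import torch

def dmicc_lr_factor(epoch, warm_epochs=600, step=350, gamma=0.5):
    """Multiplier on the base LR: constant for the first 600 epochs,
    then halved every 350 epochs thereafter, as the paper describes."""
    if epoch < warm_epochs:
        return 1.0
    return gamma ** (1 + (epoch - warm_epochs) // step)

model = torch.nn.Linear(512, 128)  # placeholder for the ResNet backbone
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.05,            # initial learning rate from the paper
    momentum=0.9,       # β = 0.9
    weight_decay=5e-4,  # used for all datasets
)
scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=dmicc_lr_factor)

for epoch in range(5000):  # 5000 epochs (up to 7000 for some large datasets)
    # ... one pass over the data with batch size 128 ...
    scheduler.step()
```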
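
The τ = 2 temperature in the instance-discrimination term suggests a contrastive, InfoNCE-style objective over pairs of augmented views. The paper's exact loss is not reproduced in this summary, so the following is only a generic sketch of such a temperature-scaled loss under that assumption, not the authors' formulation; the function name and the random inputs in the usage example are illustrative.

```python
import torch
import torch.nn.functional as F

def instance_discrimination_loss(z_a, z_b, tau=2.0):
    """Generic InfoNCE-style instance discrimination between two views.

    z_a, z_b: (N, d) embeddings of two augmentations of the same batch;
    row i of z_a should match row i of z_b and no other row.
    """
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / tau                     # cosine similarities / τ
    targets = torch.arange(z_a.size(0), device=z_a.device)
    return F.cross_entropy(logits, targets)

# Example with the paper's settings: d = 128, batch size 128, τ = 2
z_a, z_b = torch.randn(128, 128), torch.randn(128, 128)
loss = instance_discrimination_loss(z_a, z_b)
```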