Diffusion-based Missing-view Generation With the Application on Incomplete Multi-view Clustering
Authors: Jie Wen, Shijie Deng, Waikeung Wong, Guoqing Chao, Chao Huang, Lunke Fei, Yong Xu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results show that the proposed DMVG can not only accurately predict missing views, but also further enhance the subsequent clustering performance in comparison with several state-of-the-art incomplete multi-view clustering methods. |
| Researcher Affiliation | Academia | (1) Shenzhen Key Laboratory of Visual Object Detection and Recognition, Harbin Institute of Technology, Shenzhen, China; (2) School of Fashion and Textiles, The Hong Kong Polytechnic University, Hong Kong; (3) Laboratory for Artificial Intelligence in Design, Hong Kong; (4) School of Computer Science and Technology, Harbin Institute of Technology, Weihai, China; (5) School of Cyber Science and Technology, Shenzhen Campus of Sun Yat-sen University, Shenzhen, China; (6) School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China. |
| Pseudocode | Yes | The paper contains Algorithm 1 'Training of DMVG' and Algorithm 2 'Generation of DMVG'. |
| Open Source Code | Yes | The code of our DMVG is released at: https://github.com/ckghostwj/DMVG/tree/main. |
| Open Datasets | Yes | Datasets. Caltech7 (Li et al., 2022a; 2015) consists of 1474 images of 7 kinds of objects, each with 6 feature views produced by Gabor, Wavelet moments, CENTRIST, Histogram of oriented gradients, GIST, and Local binary pattern feature extractors, respectively. NH-face (Cao et al., 2015) is derived from the film Notting Hill and contains 4660 facial images with 3 views represented by gray, Gabor, and LBP features. Multi-Modal CelebA-HQ (Xia et al., 2021) contains 30000 pairs of human faces, including RGB and sketch images. Carl (Espinosa-Duró et al., 2013) is a multi-view facial dataset with 2460 pairs, each including gray, infrared, and thermal images. Also mentioned in Appendix A: BBCSports (Greene & Cunningham, 2006), Handwritten (Newman, 2007), and Animal (Zhang et al., 2019). |
| Dataset Splits | No | The paper mentions 'training epochs' ('the learning rate linearly decreasing to zero over 1000 training epochs') and describes how training data is selected. However, it does not explicitly provide details about training/validation/test dataset splits, such as percentages or sample counts for validation, nor does it describe a validation process. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the 'Adam optimizer' but does not specify any software dependencies with version numbers (e.g., Python, PyTorch/TensorFlow, CUDA, or other libraries). |
| Experiment Setup | Yes | In DMVG, we set T = 1000 and interpolate α_t in the range of [1 − 10⁻⁶, 1 − 2 × 10⁻²]. The batch size is adapted based on the dataset size, with a learning rate as 10⁻⁴. We utilize the Adam optimizer, with the learning rate linearly decreasing to zero over 1000 training epochs. As for the initialization, we employ a straightforward random initialization approach. |
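
The experiment setup quoted in the table can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the range endpoints are read as 1 − 10⁻⁶ and 1 − 2 × 10⁻², the interpolation is assumed to be linear (the excerpt only says "interpolate"), and the names `alpha_schedule`, `cumulative_alphas`, and `learning_rate` are hypothetical helpers introduced here.

```python
# Values taken from the quoted setup; the endpoint reading is an assumption.
T = 1000                 # number of diffusion steps
ALPHA_START = 1 - 1e-6   # assumed first endpoint of the alpha_t range
ALPHA_END = 1 - 2e-2     # assumed second endpoint of the alpha_t range
BASE_LR = 1e-4           # initial learning rate
EPOCHS = 1000            # epochs over which the learning rate decays to zero


def alpha_schedule(num_steps=T, start=ALPHA_START, end=ALPHA_END):
    """Return num_steps evenly spaced alpha_t values (linear spacing is assumed)."""
    if num_steps == 1:
        return [start]
    step = (end - start) / (num_steps - 1)
    return [start + i * step for i in range(num_steps)]


def cumulative_alphas(alphas):
    """Return the running products alpha_1 * ... * alpha_t for each t."""
    products, running = [], 1.0
    for alpha in alphas:
        running *= alpha
        products.append(running)
    return products


def learning_rate(epoch, base_lr=BASE_LR, total_epochs=EPOCHS):
    """Linear decay from base_lr at epoch 0 to zero at total_epochs."""
    return base_lr * max(0.0, 1.0 - epoch / total_epochs)


alphas = alpha_schedule()
alpha_bars = cumulative_alphas(alphas)
```

Under these assumptions the cumulative product shrinks toward zero by the final step, which is the usual requirement for the last diffusion state to be close to pure noise. In a training loop, `learning_rate(epoch)` would be evaluated once per epoch and passed to the optimizer.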