Unified Embedding Alignment with Missing Views Inferring for Incomplete Multi-View Clustering

Authors: Jie Wen, Zheng Zhang, Yong Xu, Bob Zhang, Lunke Fei, Hong Liu

Pages: 5393-5400

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results show that the proposed method can significantly improve the clustering performance in comparison with some state-of-the-art methods.
Researcher Affiliation | Academia | (1) Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen, Shenzhen, China; (2) The University of Queensland, Australia; (3) Department of Computer and Information Science, University of Macau, Taipa, Macau, PR China; (4) School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China; (5) Engineering Lab on Intelligent Perception for Internet of Things, Shenzhen Graduate School, Peking University, China. Emails: jiewen_pr@126.com, darrenzz219@gmail.com, yongxu@ymail.com, bobzhang@umac.mo, flksxm@126.com, hongliu@pku.edu.cn
Pseudocode | Yes | Algorithm 1: UEAF (solving (11))
Open Source Code | No | The paper does not provide a direct link to open-source code for the methodology.
Open Datasets | Yes | Dataset: (1) BUAA-visnir face dataset (BUAA) (Huang, Sun, and Wang 2012): Following the experimental settings in (Zhao, Liu, and Fu 2016), a subset of BUAA which is composed of 90 visual images and 90 near infrared images of the first 10 volunteers is chosen for comparison. (2) Handwritten digit dataset (Cai, Nie, and Huang 2013): The used handwritten digit dataset contains 2000 samples of 10 digits. The average pixels features with 240 dimensions and Fourier coefficient features with 76 dimensions are extracted as the two views for evaluation. (3) 3 Sources dataset: In our experiments, we evaluate different methods on the subset of 3 Sources dataset, which is composed of 169 stories of six topical labels collected from the three well-known online news sources, i.e., BBC, Reuters, and the Guardian. Each source can be regarded as a view. (4) BBCSport: The exploited BBCSport dataset contains 116 samples from 5 classes. Each sample is represented by 4 views. The above used datasets are briefly summarized in Table 1.
Dataset Splits | Yes | For the BUAA and Handwritten datasets, we randomly select 10%, 30%, 50%, and 70% samples as the paired samples. For the remaining samples, half of them miss the first view, while the other half of the samples remove the second view. For the BBCSport and 3 Sources datasets, we randomly remove 10%, 30%, and 50% instances of each view to form the incomplete multi-view data.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | We first fix parameters r = 3 and k = 7, and conduct some experiments on the BBCSport to analyze the sensitivity of ACC w.r.t. λ1, λ2 and λ3. From Fig. 3, we can see that UEAF can obtain encouraging results when they are located in the ranges of [10^1, 10^5], [10^-3, 10^1], and [10^-4, 10^1], respectively. In the experiments, we exploit the grid search strategy to find the three optimal parameters (Wen et al. 2018b). Moreover, we show the ACC (%) w.r.t. r on the BBCSport and BUAA datasets in Fig. 4. The proposed method achieves a satisfactory performance with a small parameter r (less than 5) and we simply set r = 3 in all experiments.
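
The two incomplete-data protocols quoted under Dataset Splits can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the seed handling, the rounding of sample counts, and the rule that every sample keeps at least one view are all hypothetical choices that the quoted text does not specify.

```python
import random


def paired_two_view_mask(n_samples, paired_ratio, seed=0):
    """Two-view protocol (BUAA, Handwritten).

    Returns one [has_view1, has_view2] pair per sample. A fraction
    `paired_ratio` of samples keeps both views; the remaining samples
    are split in half, one half missing view 1 and the other view 2.
    """
    rng = random.Random(seed)
    order = list(range(n_samples))
    rng.shuffle(order)

    n_paired = round(n_samples * paired_ratio)  # assumption: round to nearest
    mask = [[True, True] for _ in range(n_samples)]

    rest = order[n_paired:]
    half = len(rest) // 2
    for i in rest[:half]:
        mask[i][0] = False  # this sample misses the first view
    for i in rest[half:]:
        mask[i][1] = False  # this sample misses the second view
    return mask


def per_view_removal_mask(n_samples, n_views, missing_ratio, seed=0):
    """Multi-view protocol (BBCSport, 3 Sources).

    Removes about `missing_ratio` of the instances from each view.
    Assumption: a sample is never stripped of its last remaining view,
    so a view may end up with fewer removals if candidates run out.
    """
    rng = random.Random(seed)
    n_missing = round(n_samples * missing_ratio)
    mask = [[True] * n_views for _ in range(n_samples)]

    for v in range(n_views):
        candidates = list(range(n_samples))
        rng.shuffle(candidates)
        removed = 0
        for i in candidates:
            if removed == n_missing:
                break
            if sum(mask[i]) > 1:  # keep at least one view per sample
                mask[i][v] = False
                removed += 1
    return mask


if __name__ == "__main__":
    buaa = paired_two_view_mask(90, 0.5, seed=1)
    print(sum(1 for a, b in buaa if a and b), "paired samples")

    bbcsport = per_view_removal_mask(116, 4, 0.3, seed=1)
    print([sum(1 for row in bbcsport if not row[v]) for v in range(4)])
```

Each mask row marks which views are observed for one sample, so the same structure can index into per-view feature matrices regardless of the number of views. Varying `seed` gives different random incomplete configurations for repeated runs.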