Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Unified Embedding Alignment with Missing Views Inferring for Incomplete Multi-View Clustering

Authors: Jie Wen, Zheng Zhang, Yong Xu, Bob Zhang, Lunke Fei, Hong Liu (pp. 5393-5400)

AAAI 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experimental results show that the proposed method can significantly improve the clustering performance in comparison with some state-of-the-art methods.
Researcher Affiliation	Academia	(1) Bio-Computing Research Center, Harbin Institute of Technology, Shenzhen, Shenzhen, China; (2) The University of Queensland, Australia; (3) Department of Computer and Information Science, University of Macau, Taipa, Macau, PR China; (4) School of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China; (5) Engineering Lab on Intelligent Perception for Internet of Things, Shenzhen Graduate School, Peking University, China
Pseudocode	Yes	Algorithm 1 : UEAF (solving (11))
Open Source Code	No	The paper does not provide a direct link to open-source code for the methodology.
Open Datasets	Yes	Dataset: (1) BUAA-visnir face dataset (BUAA) (Huang, Sun, and Wang 2012): Following the experimental settings in (Zhao, Liu, and Fu 2016), a subset of BUAA which is composed of 90 visual images and 90 near infrared images of the first 10 volunteers is chosen for comparison. (2) Handwritten digit dataset (Cai, Nie, and Huang 2013): The used handwritten digit dataset contains 2000 samples of 10 digits. The average pixels features with 240 dimensions and Fourier coefficient features with 76 dimensions are extracted as the two views for evaluation. (3) 3 Sources dataset: In our experiments, we evaluate different methods on the subset of 3 Sources dataset, which is composed of 169 stories of six topical labels collected from the three well-known online news sources, i.e., BBC, Reuters, and the Guardian. Each source can be regarded as a view. (4) BBCSport: The exploited BBCSport dataset contains 116 samples from 5 classes. Each sample is represented by 4 views. The above used datasets are briefly summarized in Table 1.
Dataset Splits	Yes	For the BUAA and Handwritten datasets, we randomly select 10%, 30%, 50%, and 70% samples as the paired samples. For the remaining samples, half of them miss the first view, while the other half of the samples remove the second view. For the BBCSport and 3 sources datasets, we randomly remove 10%, 30%, and 50% instances of each view to form the incomplete multi-view data.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	We first fix parameters r = 3 and k = 7, and conduct some experiments on the BBCSport to analyze the sensitivity of ACC w.r.t. λ1, λ2 and λ3. From Fig. 3, we can see that UEAF can obtain encouraging results when they are located in the ranges of [10^1, 10^5], [10^-3, 10^1], and [10^-4, 10^1], respectively. In the experiments, we exploit the grid search strategy to find the three optimal parameters (Wen et al. 2018b). Moreover, we show the ACC (%) w.r.t. r on the BBCSport and BUAA datasets in Fig. 4. The proposed method achieves a satisfactory performance with a small parameter r (less than 5) and we simply set r = 3 in all experiments.
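
The two-view protocol quoted in the Dataset Splits row (a fraction of samples kept as paired, the remainder split evenly between missing the first view and missing the second) can be illustrated with a short sketch. This is not the authors' code: the function name two_view_missing_mask, the seed argument, and the rounding of the paired count are assumptions made for illustration only.

```python
import random


def two_view_missing_mask(n_samples, paired_ratio, seed=0):
    """Build an availability mask for a two-view dataset.

    Returns a list of [view1_present, view2_present] booleans per sample.
    Hypothetical helper: the name, the seed handling, and the rounding
    rule are illustrative assumptions, not taken from the paper.
    """
    rng = random.Random(seed)

    # Shuffle sample indices so the paired subset is chosen at random.
    order = list(range(n_samples))
    rng.shuffle(order)

    # The first n_paired shuffled samples keep both views.
    n_paired = round(paired_ratio * n_samples)
    rest = order[n_paired:]

    # Split the remaining samples into two halves.
    half = len(rest) // 2

    mask = [[True, True] for _ in range(n_samples)]
    for i in rest[:half]:
        mask[i][0] = False  # this sample misses the first view
    for i in rest[half:]:
        mask[i][1] = False  # this sample misses the second view
    return mask


# Paired ratios quoted for the BUAA and Handwritten datasets.
for ratio in (0.1, 0.3, 0.5, 0.7):
    mask = two_view_missing_mask(100, ratio)
    n_paired = sum(1 for both in mask if all(both))
    print(ratio, n_paired)
```

Every sample keeps at least one view under this construction, so no sample is dropped from clustering. The protocol quoted for BBCSport and 3 Sources (removing a fraction of instances from each view) is a different procedure and is not covered by this sketch.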