Contrastive Multi-view Hyperbolic Hierarchical Clustering

Authors: Fangfei Lin, Bing Bai, Kun Bai, Yazhou Ren, Peng Zhao, Zenglin Xu

IJCAI 2022

Reproducibility Variable Result LLM Response
Research Type Experimental Experimental results on five real-world datasets demonstrate the effectiveness of the proposed method and its components.
Researcher Affiliation Collaboration Fangfei Lin (1,2), Bing Bai (2), Kun Bai (2), Yazhou Ren (1), Peng Zhao (1) and Zenglin Xu (3,4). (1) University of Electronic Science and Technology of China, Chengdu, China; (2) Tencent Security Big Data Lab, Tencent Inc., China; (3) Harbin Institute of Technology, Shenzhen, China; (4) Department of Network Intelligence, Peng Cheng National Lab, Shenzhen, China.
Pseudocode No The paper describes the model architecture and loss functions, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code No The paper mentions using "corresponding open-source versions" of the baseline methods (UFit and HypHC), but it provides no access link to, or statement about releasing, the source code for its own proposed method (CMHHC).
Open Datasets Yes Datasets We conduct our experiments on the following five real-world multi-view datasets. MNIST-USPS [Peng et al., 2019] is a two-view dataset with 5000 hand-written digit (0-9) images. BDGP [Li et al., 2019b] contains 2500 images of Drosophila embryos divided into five categories with two extracted features. Caltech101-7 [Dueck and Frey, 2007] is established with 5 diverse feature descriptors... COIL-20 contains object images of 20 categories. Multi-Fashion [Xu et al., 2022] is a three-view dataset with 10,000 28x28 images of different fashionable designs...
Dataset Splits No The paper lists the datasets used and the total number of samples for each, but it does not specify any training, validation, or test dataset splits (e.g., percentages or absolute counts) that would be needed for reproduction.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments.
Software Dependencies No The paper states that the model is implemented with "PyTorch", but it does not provide a specific version number for PyTorch or any other software dependencies, which would be required for reproducibility.
Experiment Setup Yes We first pretrain V autoencoders for 200 epochs and the contrastive learning encoder for 10, 50, 50, 50 and 100 epochs on BDGP, MNIST-USPS, Caltech101-7, COIL-20, and Multi-Fashion respectively, then finetune the whole multi-view representation learning process for 50 epochs, and finally train with the hyperbolic hierarchical clustering loss for 50 epochs. The batch size is set to 256 for representation learning and 512 for hierarchical clustering, using the Adam and hyperbolic-matched Riemannian optimizer [Kochurov et al., 2020] respectively. The learning rate is set to 5e-4 for Adam, and searched over [5e-4, 1e-3] for Riemannian Adam across datasets. We empirically set τ = 0.5 for all datasets, while τc = 5e-2 for BDGP, MNIST-USPS, and Multi-Fashion and τc = 1e-1 for Caltech101-7 and COIL-20. We run the model 5 times and report the results with the lowest value of Lc. In addition, we create an adjacency graph with 50 Euclidean nearest neighbors to compute manifold similarities. We make the general rule that the kpos value equals N/2K and the kneg value equals N/K, making the selected tuples hard and reliable.
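The tuple-selection rule quoted above (kpos = N/2K, kneg = N/K, with N the number of samples and K the number of clusters) can be sketched as a small helper. This is an illustrative reconstruction, not the authors' code (which is not released); the function name is hypothetical.

```python
# Hedged sketch of the paper's stated tuple-selection rule:
# k_pos = N / (2K) positive neighbors, k_neg = N / K negative neighbors.
# Integer division is an assumption; the paper does not specify rounding.

def tuple_selection_sizes(n_samples: int, n_clusters: int) -> dict:
    """Return the k_pos / k_neg neighbor counts for hard, reliable tuples."""
    return {
        "k_pos": n_samples // (2 * n_clusters),
        "k_neg": n_samples // n_clusters,
    }

# Example: MNIST-USPS has 5000 samples and 10 digit classes,
# giving k_pos = 250 and k_neg = 500.
print(tuple_selection_sizes(5000, 10))
```

Under this reading, kneg is always twice kpos, so negatives outnumber positives in every selected tuple.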