RPSC: Robust Pseudo-Labeling for Semantic Clustering

Authors: Sihang Liu, Wenming Cao, Ruigang Fu, Kaixiang Yang, Zhiwen Yu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that RPSC outperforms 18 competitive clustering algorithms significantly on six challenging image benchmarks.
Researcher Affiliation | Academia | Sihang Liu (1), Wenming Cao (2), Ruigang Fu (3), Kaixiang Yang (1*), Zhiwen Yu (1,4); (1) School of Computer Science and Engineering, South China University of Technology, Guangzhou, China; (2) School of Mathematics and Statistics, Chongqing Jiaotong University, Chongqing, China; (3) College of Electronic Science and Technology, National University of Defense Technology, Changsha, China; (4) Peng Cheng Laboratory, Shenzhen, China
Pseudocode | No | The paper describes the RPSC framework in detail but does not provide any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not include any explicit statements about releasing source code or provide a link to a code repository for the methodology described.
Open Datasets | Yes | In this section, we investigate the effectiveness of the proposed RPSC by conducting comparative experiments on six public image data sets: STL-10, CIFAR-10, CIFAR-100-20, ImageNet-10, ImageNet-Dog, and Tiny-ImageNet.
Dataset Splits | No | While Table 1 provides 'Training' and 'Testing' sizes for some datasets, the paper does not specify the explicit splits (e.g., percentages, counts for training, validation, and test sets, or predefined split methodologies) needed to reproduce the experiment's data partitioning.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions several software components and frameworks (e.g., MoCo-v2, BYOL, SCAN, FreeMatch, Cutout, RandAugment) but does not provide specific version numbers for any of them or for the general programming languages and libraries used.
Experiment Setup | Yes | We set M to 1,000 for STL-10, CIFAR-10, and ImageNet-10, which contain 10 clusters, 1,500 for ImageNet-Dog with 15 clusters, 2,000 for CIFAR-100-20 with 20 clusters, and 5,000 for Tiny-ImageNet with 200 clusters, using the same settings as SPICE (Niu, Shan, and Wang 2022). We set the number of projection heads to 10, the confidence ratio γ to 0.5 based on experience, and the temperature parameter τ to 0.5. Ne = 100 and τt = 0.95 are set when selecting reliable semantic pseudo-labels in RPSC-Semi. Other parameter settings in RPSC-Semi are the same as in FreeMatch.
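
The hyperparameters quoted in the Experiment Setup row can be collected into a small per-dataset configuration. The sketch below is a minimal, hypothetical Python arrangement of the reported values; since the authors do not release code, the dataclass layout and all field and dictionary names (RPSCConfig, M_PER_DATASET, etc.) are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch (assumption, not the authors' code): the RPSC hyperparameters
# reported in the paper, organized as a per-dataset configuration.
from dataclasses import dataclass


@dataclass
class RPSCConfig:
    m: int                         # M, set per dataset following SPICE (Niu, Shan, and Wang 2022)
    num_heads: int = 10            # number of projection heads
    confidence_ratio: float = 0.5  # gamma, chosen "based on experience" in the paper
    temperature: float = 0.5       # tau
    ne: int = 100                  # Ne, used when selecting reliable semantic pseudo-labels (RPSC-Semi)
    tau_t: float = 0.95            # tau_t, pseudo-label confidence threshold in RPSC-Semi


# M scales with the number of clusters in each benchmark, as reported in the paper.
M_PER_DATASET = {
    "STL-10": 1_000,         # 10 clusters
    "CIFAR-10": 1_000,       # 10 clusters
    "ImageNet-10": 1_000,    # 10 clusters
    "ImageNet-Dog": 1_500,   # 15 clusters
    "CIFAR-100-20": 2_000,   # 20 clusters
    "Tiny-ImageNet": 5_000,  # 200 clusters
}

configs = {name: RPSCConfig(m=m) for name, m in M_PER_DATASET.items()}
print(configs["Tiny-ImageNet"])  # RPSCConfig(m=5000, num_heads=10, ...)
```

The remaining RPSC-Semi parameters are stated to follow FreeMatch and are not reproduced in this sketch.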