RPSC: Robust Pseudo-Labeling for Semantic Clustering

Authors: Sihang Liu, Wenming Cao, Ruigang Fu, Kaixiang Yang, Zhiwen Yu

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that RPSC outperforms 18 competitive clustering algorithms significantly on six challenging image benchmarks.
Researcher Affiliation | Academia | Sihang Liu (1), Wenming Cao (2), Ruigang Fu (3), Kaixiang Yang (1*), Zhiwen Yu (1,4); (1) School of Computer Science and Engineering, South China University of Technology, Guangzhou, China; (2) School of Mathematics and Statistics, Chongqing Jiaotong University, Chongqing, China; (3) College of Electronic Science and Technology, National University of Defense Technology, Changsha, China; (4) Peng Cheng Laboratory, Shenzhen, China
Pseudocode | No | The paper describes the RPSC framework in detail but does not provide any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not include any explicit statements about releasing source code or provide a link to a code repository for the methodology described.
Open Datasets | Yes | In this section, we investigate the effectiveness of the proposed RPSC by conducting comparative experiments on six public image data sets: STL-10, CIFAR-10, CIFAR-100-20, ImageNet-10, ImageNet-Dog, and Tiny-ImageNet.
Dataset Splits | No | While Table 1 provides 'Training' and 'Testing' sizes for some datasets, the paper does not specify the explicit splits (e.g., percentages, counts for training, validation, and test sets, or predefined split methodologies) needed to reproduce the experiment's data partitioning.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions several software components and frameworks (e.g., MoCo-v2, BYOL, SCAN, FreeMatch, Cutout, RandAugment) but does not provide specific version numbers for any of them or for the general programming languages and libraries used.
Experiment Setup | Yes | We set M to 1,000 for STL-10, CIFAR-10, and ImageNet-10, which contain 10 clusters, 1,500 for ImageNet-Dog with 15 clusters, 2,000 for CIFAR-100-20 with 20 clusters, and 5,000 for Tiny-ImageNet with 200 clusters, using the same settings as SPICE (Niu, Shan, and Wang 2022). We set the number of projection heads to 10, the confidence ratio γ to 0.5 based on experience, and the temperature parameter τ to 0.5. Ne = 100 and τt = 0.95 are set when selecting reliable semantic pseudo-labels in RPSC-Semi. Other parameter settings in RPSC-Semi are the same as in FreeMatch.
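
The hyperparameters quoted in the Experiment Setup row can be collected into a small per-dataset configuration. The sketch below is a minimal, hypothetical Python arrangement of the reported values; since the authors do not release code, the dataclass layout and all field and dictionary names (RPSCConfig, M_PER_DATASET, etc.) are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch (assumption, not the authors' code): the RPSC hyperparameters
# reported in the paper, organized as a per-dataset configuration.
from dataclasses import dataclass


@dataclass
class RPSCConfig:
    m: int                         # M, set per dataset following SPICE (Niu, Shan, and Wang 2022)
    num_heads: int = 10            # number of projection heads
    confidence_ratio: float = 0.5  # gamma, chosen "based on experience" in the paper
    temperature: float = 0.5       # tau
    ne: int = 100                  # Ne, used when selecting reliable semantic pseudo-labels (RPSC-Semi)
    tau_t: float = 0.95            # tau_t, pseudo-label confidence threshold in RPSC-Semi


# M scales with the number of clusters in each benchmark, as reported in the paper.
M_PER_DATASET = {
    "STL-10": 1_000,         # 10 clusters
    "CIFAR-10": 1_000,       # 10 clusters
    "ImageNet-10": 1_000,    # 10 clusters
    "ImageNet-Dog": 1_500,   # 15 clusters
    "CIFAR-100-20": 2_000,   # 20 clusters
    "Tiny-ImageNet": 5_000,  # 200 clusters
}

configs = {name: RPSCConfig(m=m) for name, m in M_PER_DATASET.items()}
print(configs["Tiny-ImageNet"])  # RPSCConfig(m=5000, num_heads=10, ...)
```

The remaining RPSC-Semi parameters are stated to follow FreeMatch and are not reproduced in this sketch.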