Weakly Supervised Clustering by Exploiting Unique Class Count

Authors: Mustafa Umit Oner, Hwee Kuan Lee, Wing-Kin Sung

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have constructed a neural network based ucc classifier and experimentally shown that the clustering performance of our framework with our weakly supervised ucc classifier is comparable to that of fully supervised learning models where labels for all instances are known. Furthermore, we have tested the applicability of our framework to a real world task of semantic segmentation of breast cancer metastases in histological lymph node sections and shown that the performance of our weakly supervised framework is comparable to the performance of a fully supervised Unet model.
Researcher Affiliation | Academia | (1) School of Computing, National University of Singapore, Singapore 117417; (2) A*STAR Bioinformatics Institute, Singapore 138671; (3) Image and Pervasive Access Lab (IPAL), CNRS UMI 2955, Singapore 138632; (4) Singapore Eye Research Institute, Singapore 169856; (5) A*STAR Genome Institute of Singapore, Singapore 138672
Pseudocode | No | The paper describes the model architecture and training process using textual descriptions and mathematical formulas, but it does not include any explicit pseudocode blocks or algorithm listings.
Open Source Code | Yes | Code and trained models: http://bit.ly/uniqueclasscount
Open Datasets | Yes | This section analyzes the performances of our UCC models and fully supervised models in terms of our eventual objective of unsupervised instance clustering on MNIST (10 clusters) (Le Cun et al., 1998), CIFAR10 (10 clusters) and CIFAR100 (20 clusters) datasets (Krizhevsky & Hinton, 2009). ... We have used 512×512 image crops from the publicly available CAMELYON dataset (Litjens et al., 2018).
Dataset Splits | Yes | For MNIST, we randomly split 10,000 images from the training set as a validation set, so we had 50,000, 10,000 and 10,000 images in our training X_mnist,tr, validation X_mnist,val and test X_mnist,test sets, respectively. ... Similar to the MNIST dataset, we randomly split 10,000 images from the training set as a validation set. Hence, we had 40,000, 10,000 and 10,000 images in our training X_cifar10,tr, validation X_cifar10,val and testing X_cifar10,test sets for CIFAR10, respectively. ... Similar to the other datasets, we randomly split 10,000 images from the training set as a validation set. Hence, we had 40,000, 10,000 and 10,000 images in our training X_cifar100,tr, validation X_cifar100,val and testing X_cifar100,test sets for CIFAR100, respectively. (A minimal split sketch follows the table.)
Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments, such as GPU models, CPU types, or memory configurations.
Software Dependencies | No | The paper describes the use of neural networks and deep learning models, implying the use of associated software frameworks (e.g., TensorFlow, PyTorch). However, it does not explicitly list any software dependencies with specific version numbers (e.g., 'Python 3.7', 'PyTorch 1.9').
Experiment Setup | Yes | For the KDE module, we have tried parameters of 11 bins, 21 bins, σ = 0.1 and σ = 0.01. Best results were obtained with 11 bins and σ = 0.1. Similarly, we have tested different numbers of features at the output of the θ_feature module, and we decided to use 10 features for the MNIST and CIFAR10 datasets and 16 features for the CIFAR100 dataset based on the clustering performance and computational burden. (A KDE sketch follows the table.)
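
The validation split quoted in the Dataset Splits row is a simple random hold-out of 10,000 training images. The following is a minimal sketch, assuming torchvision's MNIST loader and a fixed random seed (both are our assumptions; the paper does not state how the split was implemented or seeded). The same pattern on CIFAR10/CIFAR100 yields the reported 40,000/10,000/10,000 split.

    # Minimal sketch of the paper's train/validation/test split for MNIST
    # (assumed torchvision loader and seed; not the authors' released code).
    import torch
    from torch.utils.data import random_split
    from torchvision import datasets, transforms

    full_train = datasets.MNIST(root="./data", train=True, download=True,
                                transform=transforms.ToTensor())
    test_set = datasets.MNIST(root="./data", train=False, download=True,
                              transform=transforms.ToTensor())

    # Randomly hold out 10,000 of the 60,000 training images as validation.
    train_set, val_set = random_split(full_train, [50000, 10000],
                                      generator=torch.Generator().manual_seed(0))
    print(len(train_set), len(val_set), len(test_set))  # 50000 10000 10000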
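
The KDE parameters quoted in the Experiment Setup row (11 bins, σ = 0.1) describe how each feature dimension of a bag is summarized as a distribution before the ucc classification head. Below is a minimal PyTorch sketch of such a Gaussian-kernel distribution module; the function name, tensor shapes, and the assumption that features are normalized to [0, 1] (e.g., via a sigmoid) are ours and not taken from the authors' released implementation.

    # Minimal sketch of a Gaussian-kernel KDE module (assumed interface).
    import torch

    def kde_distribution(features: torch.Tensor,
                         num_bins: int = 11, sigma: float = 0.1) -> torch.Tensor:
        """features: (num_instances, num_features), assumed normalized to [0, 1].
        Returns (num_features, num_bins) normalized distributions."""
        bin_centers = torch.linspace(0.0, 1.0, num_bins)   # fixed sample points in [0, 1]
        diff = features.unsqueeze(-1) - bin_centers        # (N, F, num_bins) by broadcasting
        kernel = torch.exp(-0.5 * (diff / sigma) ** 2)     # Gaussian kernel responses
        hist = kernel.sum(dim=0)                           # aggregate over instances -> (F, num_bins)
        return hist / hist.sum(dim=-1, keepdim=True)       # normalize each feature's distribution

    # Example: a bag of 32 instances with 10 features (the MNIST/CIFAR10 setting);
    # the CIFAR100 setting would use 16 features instead.
    bag_features = torch.rand(32, 10)
    print(kde_distribution(bag_features).shape)  # torch.Size([10, 11])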