Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach
Authors: Tri Nguyen, Shahana Ibrahim, Xiao Fu
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method over a series of DCC tasks and observe that the proposed approach significantly improves the performance over existing paradigms, especially when annotation noise exists. Our finding shows the significance of identifiability in DCC, echoing observations made in similar semi-supervised/unsupervised problems, e.g., (Arora et al., 2013; Kumar et al., 2013; Anandkumar et al., 2014; Zhang et al., 2014). We also evaluate the algorithms using real data collected through the Amazon Mechanical Turk (AMT) platform. The code is published at github.com/ductri/VolMaxDCC. Datasets. We use STL-10 (Coates et al., 2011), ImageNet-10 (Chang et al., 2017a), and CIFAR-10 (Krizhevsky et al., 2009). |
| Researcher Affiliation | Academia | 1School of Electrical Engineering and Computer Science, Oregon State University, OR, USA. |
| Pseudocode | No | The paper describes the proposed method and algorithm implementation in text, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | The code is published at github.com/ductri/VolMaxDCC. |
| Open Datasets | Yes | Datasets. We use STL-10 (Coates et al., 2011), ImageNet-10 (Chang et al., 2017a), and CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | We use a validation set for the baselines whenever proper for parameter tuning and algorithm stopping. The sizes of the validation sets are Nvalid = 1000 for STL-10 and ImageNet-10 and Nvalid = 5000 for CIFAR-10. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory specifications, or cloud resources) used for running the experiments. |
| Software Dependencies | No | The paper mentions using stochastic gradient descent and references a pre-training method, but it does not specify software dependencies like programming language versions, library versions (e.g., PyTorch, TensorFlow), or CUDA versions. |
| Experiment Setup | Yes | In our implementation, we use stochastic gradient descent with a batch size of 128. We set the learning rate for B and θ to be 0.1 and 0.5, respectively. The initialization of θ is chosen randomly following uniform distributions whose parameters depend on the output dimension of each layer. To initialize B, we set the diagonal elements to 1 and the other elements to 1. (A hedged code sketch of this setup follows the table.) |
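
The setup quoted in the Experiment Setup row can be summarized in a short sketch. This is a minimal illustration only, assuming a PyTorch-style implementation (the paper does not state its framework or network architecture): the placeholder membership network `theta`, the cluster count `K = 10`, and the uniform-initialization bound are assumptions, while the batch size of 128, the learning rates of 0.1 for B and 0.5 for θ, and the diagonal-ones initialization of B come from the quoted text.

```python
# Minimal sketch of the reported training setup, assuming PyTorch (the paper
# does not state its framework). The network, loss, and data pipeline are
# placeholders; only the batch size, per-parameter learning rates, and the
# described initializations are taken from the quoted text.
import torch
import torch.nn as nn

K = 10            # assumed cluster count (STL-10 / ImageNet-10 / CIFAR-10)
BATCH_SIZE = 128  # as reported

# theta: placeholder membership network; the paper initializes each layer from
# a uniform distribution whose range depends on the layer's output dimension.
# The 1/sqrt(out_features) bound below is an assumption.
theta = nn.Sequential(nn.Linear(512, 256), nn.ReLU(), nn.Linear(256, K))
for layer in theta:
    if isinstance(layer, nn.Linear):
        bound = 1.0 / layer.out_features ** 0.5
        nn.init.uniform_(layer.weight, -bound, bound)
        nn.init.uniform_(layer.bias, -bound, bound)

# B: K x K matrix with its diagonal initialized to 1; the quoted off-diagonal
# value also reads as 1 and is reproduced here as quoted.
B = torch.ones(K, K)
B.fill_diagonal_(1.0)
B.requires_grad_(True)

# Stochastic gradient descent with separate learning rates for B (0.1) and
# theta (0.5), as reported.
optimizer = torch.optim.SGD([
    {"params": [B], "lr": 0.1},
    {"params": theta.parameters(), "lr": 0.5},
])
```

Keeping B in its own parameter group is one straightforward way to apply the two distinct learning rates the paper reports for B and θ within a single optimizer.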