Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Consistent Multi-Class Classification from Multiple Unlabeled Datasets
Authors: Zixi Wei, Senlin Shu, Yuzhou Cao, Hongxin Wei, Bo An, Lei Feng
ICLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on multiple benchmark datasets across various settings demonstrate the superiority of our proposed methods. |
| Researcher Affiliation | Collaboration | 1Chongqing University 2Nanyang Technological University 3Southern University of Science and Technology 4Skywork AI |
| Pseudocode | Yes | The pseudo-code of the CCM is provided in Algorithm 1. The pseudo-code of RCM is shown in Algorithm 2. |
| Open Source Code | No | The paper does not contain an explicit statement about releasing open-source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We use 5 popular benchmark datasets including MNIST (LeCun et al., 1998), Kuzushiji-MNIST (Clanuwat et al., 2018), Fashion-MNIST (Xiao et al., 2017), CIFAR-10 (Krizhevsky et al., 2009) and SVHN (Netzer et al., 2011). |
| Dataset Splits | No | The paper mentions "training data points" and the total number of training data points being fixed, but it does not specify explicit percentages or counts for training, validation, and test splits needed to reproduce the experiment. |
| Hardware Specification | Yes | We used PyTorch (Paszke et al., 2019) to implement our experiments and conducted the experiments on NVIDIA 3090 GPUs. |
| Software Dependencies | No | In the experiments, Adam (Kingma & Ba, 2015) was used for optimization... We used PyTorch (Paszke et al., 2019) to implement our experiments. The paper names software such as Adam and PyTorch but provides no version numbers for these or any other key software components, which is required for a reproducible description of ancillary software. |
| Experiment Setup | Yes | We trained the classification model for 100 epochs on all datasets. The learning rate was chosen from {10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹} and the batch size was chosen from {128, 256}. The weight decay was set as 10⁻⁵. |