Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards Realistic Semi-supervised Medical Image Classification
Authors: Wenxue Li, Lie Ju, Feilong Tang, Peng Xia, Xinyu Xiong, Ming Hu, Lei Zhu, Zongyuan Ge
AAAI 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a variety of medical image datasets demonstrate the superior performance of our proposed method over state-of-the-art Closed-set and Open-set SSL methods. |
| Researcher Affiliation | Academia | 1 Monash University 2 The Hong Kong University of Science and Technology (Guangzhou) 3 UNC-Chapel Hill 4 Sun Yat-sen University |
| Pseudocode | No | The paper describes the methodology in text and mathematical formulas but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement about the release of source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We validate our proposed method on diverse datasets comprising multiple modalities, including dermatology, ophthalmology, and endoscopy. Dermatology. We adopt ISIC-2019 (Combalia et al. 2022)... Ophthalmology. We utilize APTOS-2019 (Karthick and Sohier 2019)... To create a more challenging and realistic scenario, we incorporate samples from the i AMD-Challenge (Fang et al. 2022) dataset... Endoscopy. We employ HyperKvasir (Borgli et al. 2020). |
| Dataset Splits | Yes | We consider labeled ratio γ ∈ {5%, 10%, 20%} for ISIC-2019, γ ∈ {10%, 20%} for APTOS-2019, and γ ∈ {1%, 2%} for HyperKvasir. To construct training data, we sample γ% of the samples from each known class as the labeled dataset, while the remaining samples are used to form the unlabeled dataset. We establish the balanced validation set and test set with known classes for each dataset to ensure fair evaluation of the learning performance for every category. |
| Hardware Specification | Yes | All the experiments are implemented on two NVIDIA RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions using ResNet-50 as the backbone architecture and the Adam optimizer, but does not provide specific version numbers for software libraries or dependencies like PyTorch, TensorFlow, or Python. |
| Experiment Setup | Yes | We train the model for 20,000 iterations. To update prototypes, we train the model with only the supervised training manner for the first 200 iterations... We adopt the Adam optimizer with a batch size of 64. The hyper-parameter λ, which controls the ratio of unlabeled data in each batch, is set to 3. The learning rate is set to 0.0001 and adjusted using the cosine decay strategy. The temperature hyper-parameter τ in Eq. 7 is set to 0.07 empirically. |
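The split protocol quoted in the Dataset Splits row (sample γ% of each known class as labeled data; everything else, including any unknown-class samples in the open-set setting, stays unlabeled) can be sketched as below. The paper releases no code, so the function and variable names here are illustrative, not the authors' implementation:

```python
import random
from collections import defaultdict

def make_ssl_split(labels, known_classes, gamma, seed=0):
    """Return (labeled, unlabeled) index lists for an open-set SSL split.

    labels: per-sample class ids; known_classes: set of known class ids;
    gamma: labeled fraction per known class (e.g. 0.10 for the 10% setting).
    Sketch only -- names and the rounding rule are assumptions.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    labeled, unlabeled = [], []
    for cls, idxs in by_class.items():
        rng.shuffle(idxs)
        if cls in known_classes:
            k = max(1, round(gamma * len(idxs)))  # gamma% of this known class
            labeled.extend(idxs[:k])
            unlabeled.extend(idxs[k:])
        else:
            # Open-set: unknown-class samples only ever appear unlabeled.
            unlabeled.extend(idxs)
    return labeled, unlabeled
```

Balanced validation and test sets over the known classes would be drawn separately before this split, as the quote describes.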
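The Experiment Setup row can likewise be collected into a small configuration sketch. The paper only states that the learning rate is "adjusted using the cosine decay strategy", so the exact schedule form below (full-period cosine over all iterations, no restarts) is an assumption:

```python
import math
from dataclasses import dataclass

@dataclass
class TrainConfig:
    """Hyper-parameters reported in the paper; field names are illustrative."""
    total_iters: int = 20_000     # total training iterations
    warmup_sup_iters: int = 200   # supervised-only phase before prototype updates
    batch_size: int = 64
    unlabeled_ratio: int = 3      # lambda: unlabeled-to-labeled ratio per batch
    base_lr: float = 1e-4         # Adam learning rate
    temperature: float = 0.07     # tau in Eq. 7

def cosine_lr(cfg: TrainConfig, it: int) -> float:
    """Cosine-decayed learning rate at iteration `it` (schedule form assumed)."""
    return cfg.base_lr * 0.5 * (1.0 + math.cos(math.pi * it / cfg.total_iters))
```

With these defaults the rate starts at 1e-4 and decays smoothly toward zero by iteration 20,000.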