Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

GBRIP: Granular Ball Representation for Imbalanced Partial Label Learning

Authors: Jintao Huang, Yiu-ming Cheung, Chi-man Vong, Wenbin Qian

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on standard benchmarks demonstrate that GBRIP outperforms existing state-of-the-art methods, offering a robust solution to the challenges of imbalanced PLL. We evaluated our method on two long-tailed datasets: CIFAR10-LT and CIFAR100-LT.
Researcher Affiliation Academia (1) Department of Computer Science, Hong Kong Baptist University, Hong Kong SAR, China; (2) Department of Computer and Information Science, University of Macau, Macau SAR, China; (3) School of Software, Jiangxi Agricultural University, Nanchang, China
Pseudocode No The paper describes the GBRIP method and its components (CGR and MCL) using mathematical formulations and descriptive text, but it does not include a clearly labeled pseudocode or algorithm block.
Open Source Code No The paper does not explicitly state that source code for the described methodology is publicly available, nor does it provide any links to a code repository.
Open Datasets Yes We evaluated our method on two long-tailed datasets: CIFAR10-LT and CIFAR100-LT. This section evaluates GBRIP's performance on four classical real-world PLL datasets: Lost, Bird Song (Bird.S), Soccer Player (Soccer.P), and Yahoo! News (Yahoo.N). We conducted experiments on the large-scale SUN397 dataset, which contains 108,754 RGB images across 397 scene classes.
Dataset Splits Yes The training images were randomly removed class-wise to create a predefined imbalance ratio γ = n_1/n_L, where n_j represents the number of images in the j-th class. To generate partially labeled datasets, we manually flipped negative labels (ŷ ≠ y) to false-positive labels with a probability ψ = P(ŷ ∈ Y | ŷ ≠ y)... We selected γ ∈ {50, 100, 200}, ψ ∈ {0.3, 0.5} for CIFAR10-LT and γ ∈ {10, 20, 50}, ψ ∈ {0.05, 0.1} for CIFAR100-LT. We set the batch size to 128 for these experiments and held out 50 samples per class for testing. This setup resulted in a training set with an imbalance ratio of approximately 46 (2311/50). All experiments were conducted three times with different random seeds.
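The dataset construction quoted above (an exponential long-tailed class profile with imbalance ratio γ = n_1/n_L, plus negative labels flipped into the candidate set with probability ψ) can be sketched as follows. This is not the authors' released code; the function names and the exponential class-count profile are assumptions based on standard long-tailed CIFAR benchmarks:

```python
import numpy as np

def long_tailed_counts(n_max, num_classes, gamma):
    """Per-class sample counts under an exponential long-tailed profile.

    The largest class keeps n_max samples and the smallest keeps
    n_max / gamma, so the imbalance ratio is gamma = n_1 / n_L.
    (Assumed profile; the paper only states the ratio gamma.)
    """
    return [int(n_max * gamma ** (-j / (num_classes - 1)))
            for j in range(num_classes)]

def partial_labels(true_labels, num_classes, psi, rng):
    """Boolean candidate-label matrix Y of shape (n, num_classes).

    Each negative label is flipped to a false positive independently
    with probability psi; the ground-truth label is always a candidate.
    """
    n = len(true_labels)
    Y = rng.random((n, num_classes)) < psi   # flip negatives w.p. psi
    Y[np.arange(n), true_labels] = True      # true label always present
    return Y
```

For example, with n_max = 5000, 10 classes, and γ = 50 (the CIFAR10-LT setting), the head class keeps 5000 images and the tail class keeps 100.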
Hardware Specification No The paper does not provide specific details about the hardware used to run the experiments, such as GPU models or CPU specifications.
Software Dependencies No The paper mentions using an '18-layer ResNet as the feature backbone' and 'the standard SGD optimizer' but does not provide specific version numbers for software libraries, frameworks, or programming languages used (e.g., Python, PyTorch, TensorFlow).
Experiment Setup Yes We utilized an 18-layer ResNet as the feature backbone for our experiments. The model was trained for 1000 epochs using the standard SGD optimizer with a momentum of 0.9. The initial learning rate was set to 0.01 and decayed using a cosine learning rate schedule. The batch size was fixed at 256... For our method, GBRIP, the hyper-parameters were set as follows: λ1 = 0.5, λ2 = 0.5 and λ3 = 0.1. The moving average parameter µ for the class prior estimate was set to 0.1/0.05 in the first phase and fixed at 0.01 afterward. For class-reliable sample selection, the parameter ρ was linearly increased from 0.2 to 0.5/0.6 over the first 50 epochs.
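Two of the schedules quoted above (cosine learning-rate decay from 0.01 over 1000 epochs, and the linear warm-up of the selection threshold ρ from 0.2 to 0.5 over the first 50 epochs) are easy to express in closed form. A minimal sketch, not the authors' implementation; function names are illustrative:

```python
import math

def cosine_lr(epoch, total_epochs=1000, lr0=0.01):
    """Cosine-annealed learning rate: lr0 at epoch 0, ~0 at the end."""
    return 0.5 * lr0 * (1 + math.cos(math.pi * epoch / total_epochs))

def rho_schedule(epoch, rho_start=0.2, rho_end=0.5, warm=50):
    """Selection threshold rho: linear ramp over the first `warm`
    epochs, then held fixed at rho_end."""
    if epoch >= warm:
        return rho_end
    return rho_start + (rho_end - rho_start) * epoch / warm
```

In a PyTorch setup, the same learning-rate curve would typically come from `torch.optim.lr_scheduler.CosineAnnealingLR` wrapped around the SGD optimizer rather than a hand-rolled function.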