Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Novel Class Discovery for Long-tailed Recognition

Authors: Chuyu Zhang, Ruijie Xu, Xuming He

TMLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform extensive experiments on CIFAR100, ImageNet100, Herbarium19 and large-scale iNaturalist18 datasets, and the results demonstrate the superiority of our method. Our code is available at https://github.com/kleinzcy/NCDLR.
Researcher Affiliation | Academia | Chuyu Zhang (ShanghaiTech University, Shanghai, China; Lingang Laboratory, Shanghai, China), Ruijie Xu (ShanghaiTech University, Shanghai, China), Xuming He (ShanghaiTech University, Shanghai, China; Shanghai Engineering Research Center of Intelligent Vision and Imaging, Shanghai, China)
Pseudocode | Yes | Algorithm 1: Sinkhorn-Knopp Based Pseudo Labeling Algorithm. Algorithm 2: Adaptive Self-labeling Algorithm. Algorithm 3: The algorithm of estimating the number of novel categories.
Open Source Code | Yes | Our code is available at https://github.com/kleinzcy/NCDLR.
Open Datasets | Yes | We conduct extensive experiments on two constructed long-tailed datasets, CIFAR100 and ImageNet100, as well as two challenging natural long-tailed datasets, Herbarium19 and iNaturalist18. ... CIFAR100 (Krizhevsky et al., 2009) and ImageNet100 (Deng et al., 2009), and two real-world medium/large-scale long-tailed image classification datasets, Herbarium19 (Tan et al., 2019) and iNaturalist18 (Van Horn et al., 2018).
Dataset Splits | Yes | For each dataset, we randomly divide all classes into 50% known classes and 50% novel classes. For testing, we report the NCD performance on the official validation sets of each dataset, except for CIFAR100, where we use its official test set. The details of our datasets are shown in Tab. 1.
Hardware Specification | No | No specific hardware details (such as GPU models, CPU models, or memory) were provided in the paper. The paper mentions using a ViT-B-16 backbone network but not the hardware it ran on.
Software Dependencies | No | The paper mentions using 'AdamW with momentum as the optimizer' and the 'Sinkhorn-Knopp algorithm' but does not provide specific version numbers for any programming languages, libraries, or frameworks used (e.g., Python, PyTorch, TensorFlow, CUDA).
Experiment Setup | Yes | We train 50 epochs on CIFAR100 and ImageNet100, 70 epochs on Herbarium and iNaturalist18. We use AdamW with momentum as the optimizer with linear warm-up and cosine annealing (lr_base = 1e-3, lr_min = 1e-4, and weight decay 5e-4). We set α = 1, and select γ = 500 by validation. For all the experiments, we set the batch size to 128 and the iteration step L to 10. For the Sinkhorn-Knopp algorithm, we adopt all the hyperparameters from (Caron et al., 2020), e.g. n_iter = 3 and ϵ = 0.05.
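For readers unfamiliar with the Sinkhorn-Knopp pseudo-labeling step referenced in the Pseudocode and Experiment Setup rows, the following is a minimal sketch of the standard Sinkhorn-Knopp normalization with the quoted hyperparameters (n_iter = 3, ϵ = 0.05). The function name, uniform marginals, and NumPy implementation are assumptions for illustration; the paper's adaptive self-labeling variant replaces the uniform class marginal with a learned, imbalanced one.

```python
# Sketch of Sinkhorn-Knopp normalization for pseudo-labeling (assumed
# implementation; the paper's method adapts the class marginals).
import numpy as np

def sinkhorn_pseudo_labels(logits, epsilon=0.05, n_iter=3):
    """Convert a (batch, classes) logit matrix into soft pseudo-labels whose
    rows (samples) and columns (classes) approximately match uniform marginals."""
    # Gibbs kernel; subtract the max for numerical stability before exp.
    Q = np.exp((logits - logits.max()) / epsilon)
    Q /= Q.sum()                           # normalize to a joint distribution
    B, K = Q.shape
    for _ in range(n_iter):
        Q /= Q.sum(axis=0, keepdims=True)  # match column (class) marginals
        Q /= K
        Q /= Q.sum(axis=1, keepdims=True)  # match row (sample) marginals
        Q /= B
    return Q * B                           # rescale so each row sums to 1
```

With only 3 iterations the column marginals are matched approximately rather than exactly, which is the standard trade-off made for speed in self-labeling pipelines.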
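The learning-rate schedule quoted above (linear warm-up, then cosine annealing from lr_base = 1e-3 to lr_min = 1e-4) can be sketched as follows. The warm-up length and function name are assumptions; the paper does not quote a warm-up duration.

```python
# Sketch of linear warm-up + cosine annealing (warmup_epochs is an assumption).
import math

def lr_at_epoch(epoch, total_epochs=50, warmup_epochs=5,
                lr_base=1e-3, lr_min=1e-4):
    if epoch < warmup_epochs:
        # Linear warm-up from lr_base / warmup_epochs up to lr_base.
        return lr_base * (epoch + 1) / warmup_epochs
    # Cosine annealing from lr_base down to lr_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return lr_min + 0.5 * (lr_base - lr_min) * (1 + math.cos(math.pi * progress))
```

The schedule peaks at lr_base at the end of warm-up and decays smoothly toward lr_min by the final epoch.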