Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Rectifying Soft-Label Entangled Bias in Long-Tailed Dataset Distillation
Authors: Chenyang Jiang, Hang Zhao, Xinyu Zhang, Zhengcen Li, Qiben Shan, Shaocong Wu, Jingyong Su
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a comprehensive evaluation of our dataset distillation method under long-tailed distribution settings. We begin by detailing the experimental setup, including datasets, evaluation metrics, and baseline methods for comparison. Specifically, we reproduce dataset distillation methods on long-tailed datasets and benchmark our approach against state-of-the-art baselines, demonstrating its superior performance under imbalanced conditions. We then perform ablation studies to assess the adaptiveness of our method, particularly in extreme long-tail scenarios, and evaluate its robustness under varying soft-label budgets. Finally, we provide explanatory analyses and visualizations to illustrate the effectiveness of the ADSA module. |
| Researcher Affiliation | Academia | ¹Harbin Institute of Technology, Shenzhen; ²Pengcheng Laboratory. EMAIL EMAIL, EMAIL |
| Pseudocode | No | The paper describes the Adaptive Soft-Label Alignment module (ADSA) using textual descriptions and mathematical formulas (e.g., Equations 9 and 10), and a diagram in Figure 1(c). However, it does not present the steps in a structured pseudocode or algorithm block format. |
| Open Source Code | No | We will submit the core code for SRe2L/GVBSM/EDC training on ImageNet-LT as supplementary files. The complete codebase will be released on GitHub upon paper acceptance. |
| Open Datasets | Yes | We evaluate our method on CIFAR10/100-LT [52], and ImageNet-1k-LT (224 × 224) [54]. |
| Dataset Splits | Yes | We evaluate our method on CIFAR10/100-LT [52], and ImageNet-1k-LT (224 × 224) [54]. The CIFAR-LT datasets are constructed using exponential long-tail distributions as in [42], with imbalance factor (IF) $r = n_0 / n_{K-1}$ controlling class imbalance. Class sizes $n_i$ follow $n_i = n_0 \cdot r^{-i/(K-1)}$, and we test with $r \in \{10, 50, 100\}$. The ImageNet-LT follows the setup in [43]. |
| Hardware Specification | Yes | All experiments are conducted on an 8× RTX 4090 GPU server, and the computational cost depends on the underlying baseline methods. |
| Software Dependencies | No | The paper describes model architectures like ResNet-18 and EfficientNet-B0 and mentions using PyTorch for models, but it does not provide specific version numbers for any software dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | Experimental Setup. We evaluate our method on CIFAR10/100-LT [52], and ImageNet-1k-LT (224 × 224) [54]. The CIFAR-LT datasets are constructed using exponential long-tail distributions as in [42], with imbalance factor (IF) $r = n_0 / n_{K-1}$ controlling class imbalance. Class sizes $n_i$ follow $n_i = n_0 \cdot r^{-i/(K-1)}$, and we test with $r \in \{10, 50, 100\}$. The ImageNet-LT follows the setup in [43]. We evaluate performance across varying IPC (images per class) settings and compare our method with state-of-the-art baselines, including LTDD [11], SRe2L [14], GVBSM [30], and EDC [16]. ... For all baselines, we adopt their default hyperparameter settings, optimizer and augmentation strategy except for special demonstration. ... ResNet-18 is used as the default evaluation backbone for SRe2L, GVBSM, and EDC, while ConvNet is adopted for LTDD. For a fair comparison, we follow the original training hyperparameters for all methods, and report the top-1 accuracy. ... We use a depth-3 ConvNet as both the evaluation and distillation backbone for MTT and DREAM, and adopt ResNet-18 as the evaluation backbone for SRe2L, GVBSM, and EDC. ... The optimization problem in Equation 10 is solved by performing a search over the range $\tau \in (0, 3)$. |
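
The exponential long-tail construction quoted in the Dataset Splits row can be illustrated with a short sketch. It computes per-class sample counts from the head-class size n0, the number of classes K, and the imbalance factor r, so that the head class keeps n0 images and the tail class keeps roughly n0 / r. The function name `long_tail_class_sizes`, the choice of n0 = 5000 for a CIFAR-10-style example, and the truncation to an integer are assumptions made for illustration; the paper's own data pipeline is not shown on this page and may differ.

```python
from typing import List


def long_tail_class_sizes(n0: int, num_classes: int, imbalance_factor: float) -> List[int]:
    """Per-class sizes n_i = n0 * r^(-i / (K - 1)) for i = 0..K-1.

    n0               -- number of samples in the head class (assumed input)
    num_classes      -- K, the number of classes
    imbalance_factor -- r = n_0 / n_{K-1}
    """
    if num_classes < 2:
        raise ValueError("need at least two classes to define an imbalance factor")
    if imbalance_factor < 1:
        raise ValueError("imbalance factor r is expected to be >= 1")
    sizes = []
    for i in range(num_classes):
        exponent = -i / (num_classes - 1)
        # Truncation to int is an assumption; rounding would also be defensible.
        sizes.append(int(n0 * imbalance_factor ** exponent))
    return sizes


if __name__ == "__main__":
    # The three imbalance factors reported in the table: 10, 50, 100.
    for r in (10, 50, 100):
        print(r, long_tail_class_sizes(n0=5000, num_classes=10, imbalance_factor=r))
```

Larger values of r shrink the tail classes more aggressively while leaving the head class unchanged, which is why the three reported settings span mild to severe imbalance.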