Robust Visual Recognition with Class-Imbalanced Open-World Noisy Data

Authors: Na Zhao, Gim Hee Lee

Venue: AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on several benchmark datasets, including synthetic and real-world noisy datasets, demonstrate the superior performance and robustness of our method over existing methods."
Researcher Affiliation | Academia | Na Zhao¹*, Gim Hee Lee²; ¹Singapore University of Technology and Design, ²National University of Singapore
Pseudocode | No | The paper describes its methods in narrative text and mathematical equations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://github.com/Na-Z/LIOND."
Open Datasets | Yes | "We evaluate our proposed method on three datasets, including CIFAR-10 and CIFAR-100 (Krizhevsky, Hinton et al. 2009) with controlled noise and class imbalance, and WebVision (Li et al. 2017), which is a real-world class-imbalanced dataset with open-world noise."
Dataset Splits | No | The paper mentions training data and validation sets (e.g., in Table 4 for WebVision), but does not provide specific percentages, sample counts, or the methodology used to create the train/validation/test splits across all experiments.
Hardware Specification | No | The paper does not report the hardware used for its experiments, such as specific GPU/CPU models, processor types, or memory amounts.
Software Dependencies | No | The paper does not list ancillary software dependencies with version numbers (e.g., library or solver names and versions) needed to replicate the experiments.
Experiment Setup | Yes | For the CIFAR datasets: "We train the model using the SGD optimizer with momentum 0.9 and weight decay 5e-4. We set the batch size as 128 and the initial learning rate as 0.02 with a cosine decay schedule. The model is trained for 300 epochs with a warmup period using L_bsce. The warmup period is set to 10 and 30 epochs for CIFAR-10 and CIFAR-100, respectively. We set the hyper-parameters as α = 0.05, β = 3, γ = 0.5, K = 30, ϵ_l = 1, ϵ_h = 1, τ = 0.3, and ω = 0.99." For WebVision: "To align with ProtoMix and NGC, we adopt Inception-ResNet-v2 as the feature encoder. We train the model using the SGD optimizer with momentum 0.9 and weight decay 1e-4. We set the batch size as 32 and the initial learning rate as 0.04 with a cosine decay schedule. The model is trained for 80 epochs with a 15-epoch warmup period using L_bsce. The configuration of hyper-parameters is the same as that for the CIFAR datasets, except for K = 50 and ϵ_l = 0.1."
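The quoted setup maps onto a standard training configuration. The sketch below, assuming PyTorch, only wires up the reported CIFAR optimizer, schedule, and epoch counts; it is not the authors' implementation (see the linked repository for that). The backbone, data loader, and warmup loss are placeholders, and the hyper-parameter variable names are assumptions introduced for illustration.

```python
# Minimal sketch of the reported CIFAR training configuration, assuming PyTorch.
# NOT the authors' implementation (see https://github.com/Na-Z/LIOND): the
# backbone, data loader, and warmup loss below are placeholders, and the
# hyper-parameter variable names are introduced here purely for illustration.
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

# Hyper-parameters quoted from the paper (CIFAR setting); names are assumed.
HPARAMS = dict(alpha=0.05, beta=3, gamma=0.5, K=30,
               eps_l=1.0, eps_h=1.0, tau=0.3, omega=0.99)

EPOCHS = 300         # total training epochs for CIFAR
WARMUP_EPOCHS = 10   # 10 for CIFAR-10, 30 for CIFAR-100
BATCH_SIZE = 128     # reported batch size

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in backbone
optimizer = SGD(model.parameters(), lr=0.02, momentum=0.9, weight_decay=5e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=EPOCHS)  # cosine decay of the LR

warmup_criterion = nn.CrossEntropyLoss()  # placeholder for the L_bsce warmup loss

for epoch in range(EPOCHS):
    # The paper trains with L_bsce alone during the warmup epochs and applies the
    # full method afterwards; both stages are collapsed into one placeholder loop.
    for images, labels in []:  # replace with a CIFAR DataLoader (batch_size=BATCH_SIZE)
        optimizer.zero_grad()
        loss = warmup_criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

For WebVision, the same skeleton would swap in an Inception-ResNet-v2 encoder with batch size 32, initial learning rate 0.04, weight decay 1e-4, and 80 epochs (15 of them warmup), keeping the same hyper-parameters except K = 50 and ϵ_l = 0.1.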