A Generalized Unbiased Risk Estimator for Learning with Augmented Classes

Authors: Senlin Shu, Shuo He, Haobo Wang, Hongxin Wei, Tao Xiang, Lei Feng

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we conduct extensive experiments to evaluate the performance of our proposed method on various datasets using different models."
Researcher Affiliation | Academia | (1) College of Computer Science, Chongqing University, China; (2) School of Computer Science and Engineering, University of Electronic Science and Technology of China, China; (3) College of Computer Science and Technology, Zhejiang University, China; (4) School of Computer Science and Engineering, Nanyang Technological University, Singapore
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement about releasing source code, nor a link to a code repository.
Open Datasets | Yes | "We use six regular-scale datasets downloaded from the UCI Machine Learning Repository (Dua, Graff et al. 2017)... We also use four widely used large-scale benchmark datasets, including MNIST (LeCun et al. 1998), Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017), Kuzushiji-MNIST (Clanuwat et al. 2018), and SVHN (Netzer et al. 2011)."
Dataset Splits | Yes | "For each regular-scale dataset, half of the classes are selected as augmented classes and the remaining classes are considered as known classes. Besides, the numbers of labeled, unlabeled, and test examples are set to 500, 1000, and 1000, respectively. For large-scale datasets, we select six classes as known classes and the other classes are regarded as augmented classes. For MNIST, Fashion-MNIST, and Kuzushiji-MNIST, the number of labeled, unlabeled, and test examples is set to 24000 (4000 per known class), 10000 (1000 per class), and 1000 (100 per class), respectively. For SVHN, the number of labeled, unlabeled, and test examples is set to 24000 (4000 per known class), 25000 (2500 per class), and 1000 (100 per class), respectively." (A loading and splitting sketch following this protocol appears below the table.)
Hardware Specification | Yes | "All the experiments are conducted on GeForce RTX 3090 GPUs."
Software Dependencies | No | The paper mentions specific optimization methods and loss functions with citations, but does not provide version numbers for any software libraries, frameworks, or programming languages used (e.g., PyTorch, TensorFlow, Python).
Experiment Setup | Yes | "For our proposed method, we utilize the generalized cross entropy (GCE) loss... We use the Adam optimization method... with the number of training epochs set to 1500 on regular-scale datasets and 200 on large-scale datasets, respectively. We set the mini-batch size to 500 on large-scale datasets and use the full batch size on regular-scale datasets. For regular-scale datasets, learning rate and weight decay are selected in {10^-2, 10^-3, 10^-4}. For large-scale datasets, learning rate and weight decay are selected in {10^-3, 10^-4, 10^-5}. For our method, t and λ are selected in {1, 2, 3} and {0.2, 0.4, ..., 1.8, 2.0}, respectively." (A GCE loss and optimizer sketch appears below the table.)
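
The split protocol quoted in the Dataset Splits row can be reproduced from public torchvision datasets. Below is a minimal sketch for the large-scale MNIST protocol (six known classes, 4000 labeled examples per known class, 1000 unlabeled examples per class, 100 test examples per class); which six classes are "known" and the index ordering are assumptions, since the paper excerpt does not specify them.

```python
import numpy as np
from torch.utils.data import Subset
from torchvision import datasets, transforms

KNOWN = list(range(6))   # assumption: first six digits as known classes
ALL = list(range(10))

tfm = transforms.ToTensor()
train = datasets.MNIST("./data", train=True, download=True, transform=tfm)
test = datasets.MNIST("./data", train=False, download=True, transform=tfm)

y_tr = np.asarray(train.targets)
y_te = np.asarray(test.targets)

labeled_idx, unlabeled_idx, test_idx = [], [], []
for c in ALL:
    tr_c = np.where(y_tr == c)[0]
    te_c = np.where(y_te == c)[0]
    if c in KNOWN:
        labeled_idx.extend(tr_c[:4000])        # 4000 labeled per known class
        unlabeled_idx.extend(tr_c[4000:5000])  # 1000 unlabeled per class
    else:
        unlabeled_idx.extend(tr_c[:1000])      # augmented classes appear only unlabeled
    test_idx.extend(te_c[:100])                # 100 test examples per class

labeled_set = Subset(train, labeled_idx)     # 24000 examples
unlabeled_set = Subset(train, unlabeled_idx) # 10000 examples
test_set = Subset(test, test_idx)            # 1000 examples
```

Note that only the unlabeled and test pools contain the augmented classes, which matches the learning-with-augmented-classes setting the paper studies.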
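The Experiment Setup row names the GCE loss and the Adam optimizer, but the excerpt does not show how the loss enters the proposed risk estimator. The sketch below gives only the standard GCE formulation of Zhang and Sabuncu (2018) plus an Adam configuration drawn from the quoted search grids; the exponent q = 0.7 and the placeholder linear model are illustrative assumptions, not values reported in the paper.

```python
import torch
import torch.nn.functional as F

def gce_loss(logits: torch.Tensor, targets: torch.Tensor, q: float = 0.7) -> torch.Tensor:
    """Generalized cross entropy: L_q(f(x), y) = (1 - f_y(x)^q) / q.

    Recovers cross entropy as q -> 0 and mean absolute error at q = 1.
    q = 0.7 is a common default, not a value taken from the paper.
    """
    probs = F.softmax(logits, dim=1)
    p_y = probs.gather(1, targets.view(-1, 1)).squeeze(1)  # prob of the labeled class
    return ((1.0 - p_y.pow(q)) / q).mean()

# Hypothetical optimizer setup: lr and weight decay are single candidates
# from the quoted large-scale grid, {10^-3, 10^-4, 10^-5}.
model = torch.nn.Linear(28 * 28, 7)  # placeholder: 6 known classes + 1 augmented class
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```

The paper's method-specific hyperparameters t and λ govern the risk estimator itself and are not reconstructible from the excerpt, so they are omitted here.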