Hyperbolic Space with Hierarchical Margin Boosts Fine-Grained Learning from Coarse Labels

Authors: Shu-Lin Xu, Yifan Sun, Faen Zhang, Anqi Xu, Xiu-Shen Wei, Yi Yang

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments conducted on five benchmark datasets showcase the effectiveness of our proposed method, yielding state-of-the-art results surpassing competing methods.
Researcher Affiliation | Collaboration | Shu-Lin Xu (1), Yifan Sun (2), Faen Zhang (3), Anqi Xu (4), Xiu-Shen Wei (1), Yi Yang (5); (1) School of Computer Science and Engineering, and Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Southeast University; (2) Baidu Inc.; (3) AInnovation Technology Group Co., Ltd; (4) University of Toronto; (5) CCAI, College of Computer Science and Technology, Zhejiang University
Pseudocode | No | The paper describes the method using mathematical equations and figures, but it does not contain any structured pseudocode or algorithm blocks. (A standard hyperbolic-distance formula is sketched after the table.)
Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link, an explicit code release statement, or code in supplementary materials.
Open Datasets | Yes | In experiments, we perform PE-HCM on five popular benchmark datasets, i.e., CIFAR-100 [16] and four sub-datasets {LIVING-17, NONLIVING-26, ENTITY-13, ENTITY-30} from BREEDS [23]. (A minimal CIFAR-100 loading sketch follows the table.)
Dataset Splits | Yes | Table 1: Summaries of the benchmark datasets. ... # Train images ... # Test images ... We evaluate the performance using 5-way and all-way 1-shot settings during testing. The evaluation is conducted on 1000 random episodes, and we report the mean accuracy along with the 95% confidence interval. (The episode aggregation is sketched after the table.)
Hardware Specification | Yes | We used the Adam optimizer to train the model on 4 GeForce RTX 3090 Ti GPUs, training for a total of 200 epochs.
Software Dependencies | No | The paper mentions software components such as ResNet and the Adam optimizer, but it does not provide specific version numbers for them or for any other libraries needed to replicate the experiments.
Experiment Setup | Yes | For the hyperparameters, we set c = 0.001 in Eq. (1), α = 800 in Eq. (8), β = 0.999 in Eq. (10). We used the Adam optimizer to train the model... training a total of 200 epochs. For CIFAR-100 and BREEDS, the batch sizes were 1024 and 256, the initial learning rates were 5×10^-4 and 1.25×10^-4, and the learning rates were reduced by a factor of 10 at epochs 120 and 160. In order to prevent the distance distribution from getting stuck in local optima, we reinitialize d1 = 0.134 and d2 = 0.5 at the 120-th and 160-th epochs. (The optimizer and schedule are sketched after the table.)
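
The Pseudocode row notes that the method is specified only through equations, and the Experiment Setup row reports a curvature c = 0.001 for Eq. (1). For reference, the sketch below implements the standard Poincare-ball (hyperbolic) distance with curvature c; the function names are ours, and it is not claimed to reproduce the paper's exact Eq. (1) or its hierarchical margin.

    import torch

    def mobius_add(x, y, c=0.001, eps=1e-5):
        # Mobius addition on the Poincare ball with curvature c (standard formula).
        xy = (x * y).sum(dim=-1, keepdim=True)
        x2 = (x * x).sum(dim=-1, keepdim=True)
        y2 = (y * y).sum(dim=-1, keepdim=True)
        num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
        den = 1 + 2 * c * xy + c ** 2 * x2 * y2
        return num / den.clamp_min(eps)

    def poincare_dist(x, y, c=0.001, eps=1e-5):
        # Geodesic distance between points inside the Poincare ball of curvature c.
        sqrt_c = c ** 0.5
        diff = mobius_add(-x, y, c)
        norm = diff.norm(dim=-1).clamp(max=(1 - eps) / sqrt_c)
        return (2.0 / sqrt_c) * torch.atanh(sqrt_c * norm)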
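
The Open Datasets row lists CIFAR-100 and four BREEDS sub-datasets. A minimal loading sketch for CIFAR-100 with torchvision is below; the root path and transform are placeholders, and the BREEDS splits (built from ImageNet via the BREEDS class hierarchy) are not covered here.

    import torchvision
    import torchvision.transforms as T

    # Placeholder transform; the paper's exact augmentation pipeline is not specified here.
    transform = T.Compose([T.ToTensor()])

    # CIFAR-100 as used in the paper; torchvision exposes the 100 fine labels.
    # Training from coarse labels would additionally require CIFAR-100's standard
    # 20-superclass grouping, which this sketch omits.
    train_set = torchvision.datasets.CIFAR100(root="./data", train=True,
                                              download=True, transform=transform)
    test_set = torchvision.datasets.CIFAR100(root="./data", train=False,
                                             download=True, transform=transform)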
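
The Dataset Splits row reports evaluation over 1000 random 1-shot episodes with mean accuracy and a 95% confidence interval. The sketch below shows that aggregation step only; the per-episode accuracies are assumed to come from a separate episode-evaluation routine that is not part of this sketch.

    import numpy as np

    def summarize_episodes(episode_accuracies):
        # Mean accuracy and 95% confidence interval over few-shot evaluation
        # episodes, e.g. 1000 random 5-way or all-way 1-shot episodes.
        accs = np.asarray(episode_accuracies, dtype=np.float64)
        mean = accs.mean()
        # 1.96 standard errors give the usual 95% confidence interval.
        ci95 = 1.96 * accs.std(ddof=1) / np.sqrt(len(accs))
        return mean, ci95

    # Hypothetical usage with precomputed per-episode accuracies:
    # mean, ci = summarize_episodes(per_episode_accuracies)
    # print(f"accuracy: {100 * mean:.2f} +/- {100 * ci:.2f} %")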
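
The Hardware Specification and Experiment Setup rows report Adam, 200 epochs, batch sizes of 1024 (CIFAR-100) and 256 (BREEDS), initial learning rates of 5×10^-4 and 1.25×10^-4, and 10× decays at epochs 120 and 160. A minimal PyTorch sketch of that optimizer and schedule follows; the model is a placeholder, and the paper's loss and its re-initialization of d1 and d2 are only noted in comments.

    import torch
    from torch.optim import Adam
    from torch.optim.lr_scheduler import MultiStepLR

    # Placeholder model standing in for the backbone plus hyperbolic projection head.
    model = torch.nn.Linear(512, 128)

    # CIFAR-100 setting reported in the paper: batch size 1024, initial lr 5e-4
    # (for BREEDS the reported values are batch size 256 and lr 1.25e-4).
    optimizer = Adam(model.parameters(), lr=5e-4)

    # Learning rate divided by 10 at epochs 120 and 160, over 200 training epochs.
    scheduler = MultiStepLR(optimizer, milestones=[120, 160], gamma=0.1)

    for epoch in range(200):
        # ... one training epoch over the coarse-labeled data would run here ...
        # The paper also re-initializes its distance parameters d1 = 0.134 and
        # d2 = 0.5 at epochs 120 and 160 to avoid local optima (not shown).
        scheduler.step()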