Combating Representation Learning Disparity with Geometric Harmonization

Authors: Zhihan Zhou, Jiangchao Yao, Feng Hong, Ya Zhang, Bo Han, Yanfeng Wang

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive results on a range of benchmark datasets show the effectiveness of GH with high tolerance to the distribution skewness. Our code is available at https://github.com/MediaBrain-SJTU/Geometric-Harmonization."
Researcher Affiliation | Collaboration | (1) Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; (2) Shanghai AI Laboratory; (3) Hong Kong Baptist University
Pseudocode | Yes | "Finally, Eq. (3) can be analytically solved by the Sinkhorn-Knopp algorithm [14] (refer to Appendix D for Algorithm 1). In Algorithm 2 of Appendix D, we give the complete implementation of our method." (See the Sinkhorn-Knopp sketch below the table.)
Open Source Code | Yes | "Our code is available at https://github.com/MediaBrain-SJTU/Geometric-Harmonization."
Open Datasets | Yes | "We use ResNet-18 [23] as the backbone for the small-scale dataset (CIFAR-100-LT [5]) and ResNet-50 [23] for the large-scale datasets (ImageNet-LT [44], Places-LT [44])."
Dataset Splits | No | The paper states that "linear probing on a balanced dataset is used for evaluation" and references prior work, but it does not explicitly provide training, validation, and test split percentages or sample counts for its experiments.
Hardware Specification | No | The paper describes the experimental setup, including datasets, baselines, and hyper-parameters, but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions specific models such as ResNet-18 and ResNet-50 and an SGD optimizer, but does not provide version numbers for any software dependencies (e.g., deep learning frameworks, libraries, or operating systems) used in the experiments.
Experiment Setup | Yes | "For experiments on CIFAR-100-LT, we train the model with the SGD optimizer, batch size 512, momentum 0.9 and weight decay factor 5×10⁻⁴ for 1000 epochs. For experiments on ImageNet-LT and Places-LT, we only train for 500 epochs with batch size 256 and weight decay factor 1×10⁻⁴. For the learning rate schedule, we use cosine annealing decay from 0.5 to 1e-6 for all the baseline methods. As GH is combined with the baselines, a warm-up of 500 epochs on CIFAR-100-LT and 400 epochs on ImageNet-LT and Places-LT is applied; the cosine decay is set as 0.5 → 0.3 and 0.3 → 1e-6, respectively. For the hyper-parameters of GH, we provide a default setup across all experiments: the geometric dimension K is set to 100, the loss weight w_GH to 1, and the temperature γ_GH to 0.1. In the surrogate label allocation, we set the regularization coefficient λ to 20 and the number of Sinkhorn iterations E_s to 300." (See the training-schedule sketch below the table.)