Combating Representation Learning Disparity with Geometric Harmonization
Authors: Zhihan Zhou, Jiangchao Yao, Feng Hong, Ya Zhang, Bo Han, Yanfeng Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive results on a range of benchmark datasets show the effectiveness of GH with high tolerance to the distribution skewness. Our code is available at https://github.com/MediaBrain-SJTU/Geometric-Harmonization. |
| Researcher Affiliation | Collaboration | ¹Cooperative Medianet Innovation Center, Shanghai Jiao Tong University; ²Shanghai AI Laboratory; ³Hong Kong Baptist University |
| Pseudocode | Yes | Finally, Eq. (3) can be analytically solved by Sinkhorn-Knopp algorithm [14] (refer to Appendix D for Algorithm 1). In Algorithm 2 of Appendix D, we give the complete implementation of our method. [A hedged Sinkhorn-Knopp sketch is given after this table.] |
| Open Source Code | Yes | Our code is available at https://github.com/MediaBrain-SJTU/Geometric-Harmonization. |
| Open Datasets | Yes | We use ResNet-18 [23] as the backbone for the small-scale dataset (CIFAR-100-LT [5]) and ResNet-50 [23] for the large-scale datasets (ImageNet-LT [44], Places-LT [44]). |
| Dataset Splits | No | The paper states 'linear probing on a balanced dataset is used for evaluation' and references prior work, but does not explicitly provide specific training, validation, and test split percentages or sample counts for its experiments. |
| Hardware Specification | No | The paper describes the experimental setup, including datasets, baselines, and hyper-parameters, but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific models like ResNet-18 and ResNet-50, and an SGD optimizer, but does not provide specific version numbers for any software dependencies (e.g., deep learning frameworks, libraries, or operating systems) used in the experiments. |
| Experiment Setup | Yes | For experiments on CIFAR-100-LT, we train the model with the SGD optimizer, batch size 512, momentum 0.9 and weight decay factor 5×10⁻⁴ for 1000 epochs. For experiments on ImageNet-LT and Places-LT, we only train for 500 epochs with batch size 256 and weight decay factor 1×10⁻⁴. For the learning rate schedule, we use cosine annealing decay from 0.5 to 1e-6 for all the baseline methods. As GH is combined with baselines, a warm-up of 500 epochs on CIFAR-100-LT and 400 epochs on ImageNet-LT and Places-LT is applied; the cosine decay is then set as 0.5 → 0.3 and 0.3 → 1e-6, respectively. For the hyper-parameters of GH, we provide a default setup across all the experiments: the geometric dimension K is 100, w_GH is 1 and the temperature γ_GH is 0.1. In the surrogate label allocation, we set the regularization coefficient λ as 20 and the number of Sinkhorn iterations E_s as 300. [A hedged configuration sketch follows this table.] |
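The Pseudocode row notes that the surrogate label allocation of Eq. (3) is solved with the Sinkhorn-Knopp algorithm; the exact formulation lives in Appendix D of the paper, which is not reproduced here. The following is a minimal, generic sketch of balanced Sinkhorn-Knopp allocation under assumptions: `scores` is a (B, K) score matrix between B samples and the K geometric anchors, λ enters as a multiplicative sharpening factor, and both marginals are uniform. The function name `sinkhorn_allocation` and these choices are illustrative, not the authors' released implementation.

```python
import torch

def sinkhorn_allocation(scores: torch.Tensor, lam: float = 20.0, n_iters: int = 300) -> torch.Tensor:
    """Balanced soft assignment of B samples to K anchors via Sinkhorn-Knopp.

    lam     : regularization coefficient (paper default: 20); assumed multiplicative here.
    n_iters : number of Sinkhorn iterations E_s (paper default: 300).
    """
    B, K = scores.shape
    Q = torch.exp(lam * scores)          # entropic kernel from the scores
    Q = Q / Q.sum()                      # normalize to a joint distribution
    r = torch.full((B,), 1.0 / B)        # uniform marginal over samples
    c = torch.full((K,), 1.0 / K)        # uniform marginal over anchors
    for _ in range(n_iters):
        Q = Q * (r / Q.sum(dim=1)).unsqueeze(1)   # match row (sample) marginals
        Q = Q * (c / Q.sum(dim=0)).unsqueeze(0)   # match column (anchor) marginals
    return Q / Q.sum(dim=1, keepdim=True)         # per-sample soft labels summing to 1
```

The column-marginal step is what enforces the (approximately) uniform use of anchors, which is the balancing effect the allocation is meant to provide.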
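The Experiment Setup row lists concrete optimization settings for CIFAR-100-LT. The sketch below wires those numbers into a standard PyTorch training skeleton; the torchvision ResNet-18 backbone, the bare loop, and the single-stage schedule are assumptions for illustration, and the two-stage decay (0.5 → 0.3 during warm-up, then 0.3 → 1e-6 once GH is attached) is omitted.

```python
import torch
import torchvision
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder backbone; the paper uses ResNet-18 for CIFAR-100-LT.
model = torchvision.models.resnet18(num_classes=100)

# Reported CIFAR-100-LT settings: SGD, batch size 512, momentum 0.9,
# weight decay 5e-4, 1000 epochs, cosine annealing from 0.5 down to 1e-6.
optimizer = torch.optim.SGD(model.parameters(), lr=0.5,
                            momentum=0.9, weight_decay=5e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=1000, eta_min=1e-6)

for epoch in range(1000):
    # ... one epoch of self-supervised training with the GH objective (omitted) ...
    scheduler.step()
```

For ImageNet-LT and Places-LT the same skeleton would use a ResNet-50 backbone, 500 epochs, batch size 256, and weight decay 1e-4, per the row above.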