Exploring Balanced Feature Spaces for Representation Learning

Authors: Bingyi Kang, Yu Li, Sa Xie, Zehuan Yuan, Jiashi Feng

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct a series of studies on the performance of self-supervised contrastive learning and supervised learning methods over multiple datasets where training instance distributions vary from a balanced one to a long-tailed one.
Researcher Affiliation | Collaboration | National University of Singapore; Institute of Computing Technology, CAS; ByteDance AI Lab
Pseudocode | No | The paper describes methods and processes in narrative text and equations, but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/bingykang/BalFeat.
Open Datasets | Yes | We construct six datasets from the long-tailed benchmark ImageNet-LT (Liu et al., 2019) (DLT) by varying its instance distribution {q1, ..., qC} from a long-tailed one to a uniform one gradually, while keeping the total instance number similar. The generated datasets, denoted as DLT0, ..., DLT8, DLT (which are increasingly more imbalanced), are used as Drep-train for representation learning in the following experiments. See appendix for their details. and We evaluate KCL and compare it with the above strong baselines on two large-scale benchmark datasets, ImageNet-LT (Liu et al., 2019) and iNaturalist 2018 (iNaturalist, 2018). A sketch of how such a long-tailed-to-uniform interpolation could be implemented is given after the table.
Dataset Splits | Yes | We are using k = 6 for KCL throughout the paper, which is carefully tuned on the validation set of ImageNet-LT, as shown in Fig. 5. and We use the (balanced) training and test sets of ImageNet as Dtrain and Dtest to learn classifiers and evaluate their classification accuracy, following the above protocol. and Table 9 states Dtest: ImageNet (val) for the Balancedness study. A linear-probe sketch of this classifier-evaluation protocol follows the table.
Hardware Specification | No | The paper describes the model architecture (ResNet-50 backbone) and software details, but does not provide specific hardware specifications like GPU or CPU models used for the experiments.
Software Dependencies | No | The paper mentions using 'PyTorch distributed training implementation' and 'MoCo (He et al., 2020)', but it does not specify exact version numbers for these software dependencies or any other libraries.
Experiment Setup | Yes | epochs: 90 / 200; batch size: 256 / 256 / 256; learning rate: 0.1 / 0.03 / 0.1; learning rate schedule: cosine / step / cosine; data augmentation: default / MoCo v1 / default; memory size: 65536 / 65536; encoder momentum: 0.999 / 0.999; feature dimension: 128 / 128; softmax temperature: 0.07 / 0.07; k: 6
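
To make the dataset-construction step quoted in the Open Datasets row concrete, below is a minimal sketch of how a class-count profile could be interpolated between a long-tailed distribution and a uniform one while keeping the total instance count roughly constant. The function name interpolate_distribution, the Pareto-style long-tail profile, and the linear blending scheme are illustrative assumptions; the authors' actual construction is in the BalFeat repository.

```python
import numpy as np

def interpolate_distribution(long_tailed_counts, alpha):
    """Blend a long-tailed class-count profile with a uniform one.

    alpha = 0.0 keeps the long-tailed counts, alpha = 1.0 gives a
    near-uniform split, and intermediate values give the in-between
    datasets; the total instance budget stays roughly constant.
    Illustrative sketch only, not the released BalFeat code.
    """
    long_tailed_counts = np.asarray(long_tailed_counts, dtype=float)
    total = long_tailed_counts.sum()
    uniform_counts = np.full_like(long_tailed_counts, total / len(long_tailed_counts))
    blended = (1.0 - alpha) * long_tailed_counts + alpha * uniform_counts
    counts = np.floor(blended).astype(int)
    counts[: int(total - counts.sum())] += 1  # distribute the rounding remainder
    return counts

# Example: 1000 classes with a Pareto-style long tail, blended at 9 levels.
rng = np.random.default_rng(0)
lt_counts = np.sort(rng.pareto(1.0, size=1000) * 100 + 5)[::-1].astype(int)
variants = [interpolate_distribution(lt_counts, a) for a in np.linspace(0.0, 1.0, 9)]
```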
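
The classifier-evaluation protocol cited in the Dataset Splits row (learn a classifier on the balanced ImageNet training set on top of learned representations, then measure accuracy on the validation set) can be summarized as a linear probe. The sketch below freezes a ResNet-50 backbone and trains only a linear layer; the checkpoint format, optimizer momentum, and data-loader construction are assumptions for illustration, not details taken from the paper or the released code.

```python
import torch
import torch.nn as nn
import torchvision

def build_frozen_backbone(checkpoint_path: str) -> nn.Module:
    """Load a pretrained ResNet-50 and freeze it as a feature extractor."""
    backbone = torchvision.models.resnet50(weights=None)
    backbone.fc = nn.Identity()  # expose 2048-d features
    state = torch.load(checkpoint_path, map_location="cpu")
    backbone.load_state_dict(state, strict=False)  # checkpoint layout is an assumption
    for p in backbone.parameters():
        p.requires_grad = False
    return backbone.eval()

def linear_probe(backbone, train_loader, val_loader, epochs=90, lr=0.1, device="cuda"):
    """Train only a linear classifier on frozen features, then report top-1 accuracy."""
    classifier = nn.Linear(2048, 1000).to(device)
    optimizer = torch.optim.SGD(classifier.parameters(), lr=lr, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    criterion = nn.CrossEntropyLoss()
    backbone = backbone.to(device)
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            with torch.no_grad():
                features = backbone(images)  # frozen representation
            loss = criterion(classifier(features), labels)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
    correct = total = 0
    with torch.no_grad():
        for images, labels in val_loader:
            preds = classifier(backbone(images.to(device))).argmax(dim=1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total
```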