Self-Supervised Learning Disentangled Group Representation as Feature

Authors: Tan Wang, Zhongqi Yue, Jianqiang Huang, Qianru Sun, Hanwang Zhang

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that IP-IRM converges to a fully disentangled representation and show its effectiveness on various benchmarks. Codes are available at https://github.com/Wangt-CN/IP-IRM. In Section 5, we show promising experimental results on various feature disentanglement and SSL benchmarks. |
| Researcher Affiliation | Collaboration | (1) Nanyang Technological University, (2) Singapore Management University, (3) Damo Academy, Alibaba Group |
| Pseudocode | Yes | The paper includes a structured section titled "3 IP-IRM Algorithm" that outlines the steps, inputs, and outputs of the proposed algorithm, serving as pseudocode. |
| Open Source Code | Yes | Codes are available at https://github.com/Wangt-CN/IP-IRM. |
| Open Datasets | Yes | We used two datasets. CMNIST [2] has 60,000 digit images... Shapes3D [50] contains 480,000 images... Cifar100 [54] contains 60,000 images... STL10 [22] has 113,000 images... ImageNet ILSVRC2012 [25]... NICO [37]... FGVC Aircraft (Aircraft) [60], Caltech-101 (Caltech) [31], Stanford Cars (Cars) [93], Cifar10 [53], Cifar100 [53], DTD [21], Oxford 102 Flowers (Flowers) [62], Food-101 (Food) [6], Oxford-IIIT Pets (Pets) [67] and SUN397 (SUN) [91]. |
| Dataset Splits | No | The paper does not explicitly state training/validation/test splits by percentage or sample count for the datasets used. While standard benchmarks such as CIFAR100, STL10, and ImageNet are used, their specific splits (including validation) are not given. |
| Hardware Specification | No | The paper mentions "due to limited computing resources" and points to the Appendix for "full implementation details" and "more training details", but no specific hardware (e.g., GPU/CPU models, memory) is stated in the main text. |
| Software Dependencies | No | No software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions) are given in the main text. The paper refers to the Appendix for "full implementation details". |
| Experiment Setup | Yes | We learned the representations for 400 and 1000 epochs. We evaluated both linear and k-NN (k = 200) accuracies for the downstream classification task. All the representations were trained for 200 epochs. We trained the models for 700 epochs and updated P every 50 epochs. We used different values of λ1 and λ2 (in Eq. (2) and Eq. (3)), by training for 200 epochs. |