Identifiable Contrastive Learning with Automatic Feature Importance Discovery
Authors: Qi Zhang, Yifei Wang, Yisen Wang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we first verify the identifiability of triCL and further evaluate the performance of triCL on real-world datasets including CIFAR-10, CIFAR-100, and ImageNet-100. |
| Researcher Affiliation | Academia | Qi Zhang¹, Yifei Wang², Yisen Wang¹,³. ¹National Key Lab of General Artificial Intelligence, School of Intelligence Science and Technology, Peking University; ²School of Mathematical Sciences, Peking University; ³Institute for Artificial Intelligence, Peking University |
| Pseudocode | No | No, the paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/PKU-ML/Tri-factor-Contrastive-Learning. |
| Open Datasets | Yes | We pretrain the ResNet-18 on CIFAR-10, CIFAR-100 and ImageNet-100 [8] by triCL. |
| Dataset Splits | No | No, the paper does not provide specific training/validation/test dataset splits needed for full reproducibility. It mentions "standard split" for k-NN evaluation but lacks explicit percentages or counts for training, validation, or test sets. |
| Hardware Specification | No | No, the paper does not provide specific hardware details such as CPU/GPU models, processor types, or memory used for running the experiments. |
| Software Dependencies | No | No, the paper does not provide specific software dependencies with version numbers (e.g., library or framework versions like PyTorch 1.9) needed for reproducibility. |
| Experiment Setup | Yes | We adopt ResNet-18 as the backbone. For CIFAR-10 and CIFAR-100, the projector is a two-layer MLP with hidden dimension 2048 and output dimension 256; for ImageNet-100, the projector is a two-layer MLP with hidden dimension 4096 and output dimension 512. We pretrain the models with batch size 256 and weight decay 0.0001, for 200 epochs on CIFAR-10 and CIFAR-100 and for 400 epochs on ImageNet-100. We use the cosine annealing learning rate scheduler and set the initial learning rate to 0.4 on CIFAR-10 and CIFAR-100, and 0.3 on ImageNet-100. ... We train the linear classifier on 20 dimensions of the frozen networks for 30 epochs during the linear evaluation. We set the batch size to 256 and the weight decay to 0.0001. |
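
For concreteness, the hyperparameters quoted in the Experiment Setup row map onto a PyTorch pretraining configuration roughly as sketched below. This is a minimal sketch, not the authors' released code (see the repository linked above): the optimizer choice (SGD with momentum 0.9), the data pipeline, and the `TriCLModel` name are assumptions, while the ResNet-18 backbone, projector shapes, learning rates, weight decay, batch size, epoch counts, and cosine schedule come from the table entry.

```python
# Hedged sketch of the pretraining setup described in the table above.
# Assumptions: SGD with momentum as the optimizer, and the TriCLModel class name.
# Stated in the paper excerpt: ResNet-18 backbone, two-layer MLP projector
# (2048/256 for CIFAR, 4096/512 for ImageNet-100), batch size 256,
# weight decay 1e-4, cosine annealing schedule, lr 0.4 (CIFAR) or 0.3 (ImageNet-100),
# 200 epochs (CIFAR) or 400 epochs (ImageNet-100).
import torch
import torch.nn as nn
from torchvision.models import resnet18


class TriCLModel(nn.Module):
    def __init__(self, hidden_dim: int = 2048, out_dim: int = 256):
        super().__init__()
        backbone = resnet18()
        feat_dim = backbone.fc.in_features   # 512 for ResNet-18
        backbone.fc = nn.Identity()          # drop the classification head
        self.backbone = backbone
        # Two-layer MLP projector as described in the quoted setup.
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.projector(self.backbone(x))


# CIFAR-10 / CIFAR-100 setting; ImageNet-100 would use hidden_dim=4096, out_dim=512.
model = TriCLModel(hidden_dim=2048, out_dim=256)

epochs = 200  # 400 for ImageNet-100
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.4,             # 0.3 for ImageNet-100
    momentum=0.9,       # assumption: momentum value not stated in the excerpt
    weight_decay=1e-4,
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
```

The triCL loss itself and the linear-evaluation protocol (training a classifier on 20 dimensions of the frozen features for 30 epochs) are not reproduced here; they are defined in the paper and the released code at the repository linked above.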