Rank-N-Contrast: Learning Continuous Representations for Regression
Authors: Kaiwen Zha, Peng Cao, Jeany Son, Yuzhe Yang, Dina Katabi
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments using five real-world regression datasets that span computer vision, human-computer interaction, and healthcare verify that RNC achieves state-of-the-art performance, highlighting its intriguing properties including better data efficiency, robustness to spurious targets and data corruptions, and generalization to distribution shifts. Code is available at: https://github.com/kaiwenzha/Rank-N-Contrast. |
| Researcher Affiliation | Academia | Kaiwen Zha (MIT CSAIL), Peng Cao (MIT CSAIL), Jeany Son (GIST), Yuzhe Yang (MIT CSAIL), Dina Katabi (MIT CSAIL) |
| Pseudocode | No | The paper provides mathematical formulations of the Rank-N-Contrast loss (L_RNC) but does not include a pseudocode block or a clearly labeled algorithm. |
| Open Source Code | Yes | Code is available at: https://github.com/kaiwenzha/Rank-N-Contrast. |
| Open Datasets | Yes | AgeDB (Age) [32, 44] is a dataset for predicting age from face images, containing 16,488 in-the-wild images of celebrities and the corresponding age labels. TUAB (Brain-Age) [34, 11] aims for brain-age estimation from EEG resting-state signals, with 1,385 21-channel EEG signals sampled at 200Hz from individuals with ages ranging from 0 to 95. MPIIFaceGaze (Gaze Direction) [51, 52] contains 213,659 face images collected from 15 participants during natural everyday laptop use. SkyFinder (Temperature) [31, 7] contains 35,417 images captured by 44 outdoor webcam cameras for in-the-wild temperature prediction. IMDB-WIKI (Age) [36, 44] is a large dataset for predicting age from face images, which contains 523,051 celebrity images and the corresponding age labels. |
| Dataset Splits | Yes | AgeDB... It is split into a 12,208-image training set, a 2,140-image validation set, and a 2,140-image test set. MPIIFaceGaze... We subsample and split it into a 33,000-image training set, a 6,000-image validation set, and a 6,000-image test set with no overlapping participants. SkyFinder... It is split into a 28,373-image training set, a 3,522-image validation set, and a 3,522-image test set. IMDB-WIKI... We subsample the dataset to create a variable-size training set, and keep the validation set and test set unchanged with 11,022 images in each. |
| Hardware Specification | Yes | All experiments are trained using 8 NVIDIA TITAN RTX GPUs. |
| Software Dependencies | No | The paper mentions using "SGD optimizer" and "cosine learning rate annealing" but does not specify software or library versions (e.g., PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The batch size is set to 256. For one-stage methods and encoder training of two-stage methods, we select the best learning rates and weight decays for each dataset by grid search, with a grid of learning rates from {0.01, 0.05, 0.1, 0.2, 0.5, 1.0} and weight decays from {10⁻⁶, 10⁻⁵, 10⁻⁴, 10⁻³}. For the predictor training of two-stage methods, we adopt the same search setting as above, except that "no weight decay" is added to the search choices of weight decays. For the temperature parameter τ, we search from {0.1, 0.2, 0.5, 1.0, 2.0, 5.0} and select the best, which is 2.0. We train all one-stage methods and the encoder of two-stage methods for 400 epochs, and the linear regressor of two-stage methods for 100 epochs. |
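Since the paper states the Rank-N-Contrast loss only as a mathematical formulation (see the Pseudocode row), a minimal illustrative sketch may help: for each anchor i and positive j, the loss contrasts j against the set of samples k whose label distance from i is at least that of j. This is not the authors' implementation; the function name `rnc_loss`, NumPy arrays, and negative-L2 similarity are assumptions for illustration (the official code is at the repository linked above).

```python
import numpy as np

def rnc_loss(features: np.ndarray, labels: np.ndarray, temperature: float = 2.0) -> float:
    """Sketch of the Rank-N-Contrast objective.

    For anchor i and positive j, the denominator sums over the set
    S_{i,j} = {k != i : |y_i - y_k| >= |y_i - y_j|}, i.e. samples ranked
    at least as far from the anchor in label space as j.
    """
    n = features.shape[0]
    # Pairwise similarity: negative L2 feature distance, scaled by temperature.
    diffs = features[:, None, :] - features[None, :, :]
    sim = -np.linalg.norm(diffs, axis=-1) / temperature
    # Pairwise label distances define the ranking.
    label_dist = np.abs(labels[:, None] - labels[None, :])
    loss = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Mask for S_{i,j}: label distance to i at least that of j, excluding i.
            mask = label_dist[i] >= label_dist[i, j]
            mask[i] = False
            denom = np.sum(np.exp(sim[i][mask]))  # j itself is always in S_{i,j}
            loss += -np.log(np.exp(sim[i, j]) / denom)
    return loss / (n * (n - 1))
```

Because j is always a member of S_{i,j}, each per-pair term is a negative log-probability and the loss is nonnegative; minimizing it pushes features to be ordered in embedding space consistently with their label ranking.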