Contrastive Multi-Task Dense Prediction

Authors: Siwei Yang, Hanrong Ye, Dan Xu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two challenging datasets (i.e. NYUD-v2 and Pascal-Context) clearly demonstrate the superiority of the proposed multi-task contrastive learning approach for dense predictions, establishing new state-of-the-art performances.
Researcher Affiliation | Academia | Siwei Yang (1,2), Hanrong Ye (2), Dan Xu (2); (1) Key Laboratory of Embedded System and Service Computing, Tongji University; (2) Hong Kong University of Science and Technology; swyang.ac@gmail.com, hyeae@cse.ust.hk, danxu@cse.ust.hk
Pseudocode | No | The paper describes its method using text and diagrams but does not provide pseudocode or explicitly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link, an explicit code release statement, or mention of code in supplementary materials.
Open Datasets | Yes | The experiments are extensively conducted on two widely used multi-task dense prediction datasets. One is NYUD-v2 (Silberman et al. 2012)... The other one is Pascal-Context (Everingham et al. 2010)...
Dataset Splits | Yes | One is NYUD-v2 (Silberman et al. 2012) which contains 1,449 RGBD indoor scene images ... with 795 images for training and 654 images for testing. The other one is Pascal-Context (Everingham et al. 2010) which has 4,998 training and 5,105 testing images...
Hardware Specification | Yes | We train each model using Adam optimizer with a batch size of 4 on 2 GPUs (i.e. NVIDIA RTX 3090) for the NYUD-v2 dataset, and a batch size of 6 on 6 GPUs for the PASCAL-Context dataset.
Software Dependencies | No | The paper mentions using 'Adam optimizer' and 'HRNet18' but does not specify software dependencies with version numbers (e.g., 'PyTorch 1.9' or 'CUDA 11.1').
Experiment Setup | Yes | We train each model using Adam optimizer with a batch size of 4 on 2 GPUs (i.e. NVIDIA RTX 3090) for the NYUD-v2 dataset, and a batch size of 6 on 6 GPUs for the PASCAL-Context dataset. The base learning rate, momentum, and weight decay are set to 2e-4, 0.9, and 1e-4, respectively. The learning rate is linearly warmed up for 1 epoch. The margin m, sampling ratio γ, top-k factor k, and contrastive loss weight λcon are by default set to 0.2, 0.01, 128, and 1.0 respectively in the evaluation experiments. The number of negative samples in one triplet is 16.
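The Experiment Setup row above lists all of the reported hyperparameters. Since the paper releases no code, the following is only a minimal sketch of how those values could be wired up in PyTorch (an assumed framework; the paper names none). The stand-in model, the config dictionary, the warm-up iteration count, and the reading of the reported "momentum 0.9" as the first Adam beta are assumptions made for illustration, not the authors' implementation.

```python
import torch

# Hyperparameters quoted from the paper (NYUD-v2 setting; PASCAL-Context
# uses a batch size of 6 on 6 GPUs instead).
config = {
    "batch_size": 4,
    "base_lr": 2e-4,
    "weight_decay": 1e-4,
    "warmup_epochs": 1,           # linear learning-rate warm-up
    "margin_m": 0.2,
    "sampling_ratio_gamma": 0.01,
    "top_k": 128,
    "lambda_con": 1.0,            # contrastive loss weight
    "negatives_per_triplet": 16,
}

# Stand-in for the HRNet18-based multi-task network; the real model is not released.
model = torch.nn.Conv2d(3, 64, kernel_size=3, padding=1)

# The paper lists "momentum 0.9" alongside Adam; here that is read as the
# first Adam beta, which is an assumption.
optimizer = torch.optim.Adam(
    model.parameters(),
    lr=config["base_lr"],
    betas=(0.9, 0.999),
    weight_decay=config["weight_decay"],
)

# Linear warm-up over the first epoch; the iteration count is illustrative
# (~795 NYUD-v2 training images / batch size 4).
iters_per_epoch = 795 // config["batch_size"]
warmup = torch.optim.lr_scheduler.LinearLR(
    optimizer,
    start_factor=1e-3,
    total_iters=config["warmup_epochs"] * iters_per_epoch,
)
```

The contrastive-loss hyperparameters (margin, sampling ratio, top-k factor, loss weight, negatives per triplet) are carried in the dictionary only for reference; the loss itself belongs to the authors' unreleased method and is not reconstructed here.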