DenoiseRep: Denoising Model for Representation Learning

Authors: Zhengrui Xu, Guan'an Wang, Xiaowen Huang, Jitao Sang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on various discriminative vision tasks, including re-identification (Market-1501, DukeMTMC-reID, MSMT17, CUHK-03, VehicleID), image classification (ImageNet, CUB200, Oxford-Pet, Flowers), object detection (COCO), image segmentation (ADE20K) show stability and impressive improvements.
Researcher Affiliation | Academia | Zhengrui Xu (1), zrxu23@bjtu.edu.cn; Guan'an Wang, guan.wang0706@gmail.com; Xiaowen Huang (1,2,3), xwhuang@bjtu.edu.cn; Jitao Sang (1,2,3), jtsang@bjtu.edu.cn. (1) School of Computer Science and Technology, Beijing Jiaotong University; (2) Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University; (3) Key Laboratory of Big Data & Artificial Intelligence in Transportation (Beijing Jiaotong University), Ministry of Education.
Pseudocode | Yes | Algorithm 1 Training. Input: the number of feature layers in the backbone $N$, the features extracted from each layer $\{F_i\}_{i=1}^{N}$, and the denoising modules to be trained $\{D_i(\cdot)\}_{i=1}^{N}$. 1: repeat 2: for each $i \in [N, 1]$ do 3: $t = i$: specify the diffusion step $t$ for the current layer based on the order of layers. 4: $\epsilon \sim \mathcal{N}(0, I)$: randomly sample Gaussian noise. 5: $X_t = \sqrt{\bar{\alpha}_t}\, F_i + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$: forward diffusion process in Eq. (2). 6: Take a gradient descent step on $\nabla_\theta \lVert \epsilon - D_i(X_t, t) \rVert^2$. 7: end for 8: until converged
Open Source Code | Yes | Code is available at https://github.com/wangguanan/DenoiseRep.
Open Datasets | Yes | Experimental results on various discriminative vision tasks, including re-identification (Market-1501, DukeMTMC-reID, MSMT17, CUHK-03, VehicleID), image classification (ImageNet, CUB200, Oxford-Pet, Flowers), object detection (COCO), image segmentation (ADE20K) show stability and impressive improvements.
Dataset Splits | Yes | We conduct training and evaluation on four datasets: DukeMTMC-reID [72], Market-1501 [70], MSMT17 [61], and CUHK-03 [33]. These datasets encompass a wide range of scenarios for person re-identification. For accuracy, we use standard metrics including Rank-1 (the probability that the image with the highest confidence in the search results is the correct result) and mean average precision (mAP). All the results are from a single query setting.
Hardware Specification | Yes | We implement our method using Python on a server equipped with a 2.10 GHz Intel Core Xeon(R) Gold 5218R processor and two NVIDIA RTX 3090 GPUs.
Software Dependencies | No | The paper mentions only 'Python', without specifying its version or any other software libraries and their versions.
Experiment Setup | Yes | The number of training epochs is set to 120, the learning rate is set to 0.0004, the batch size is 64 during training and 256 during inference, and the diffusion step size t is set to 1000.
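
The inner loop of Algorithm 1 in the Pseudocode row can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation (see the linked repository for that). `LinearDenoiser`, `alpha_bar_schedule`, and `train_once` are hypothetical names, the denoiser is a bias-free linear map standing in for each D_i, the noise schedule is a linear beta schedule whose length equals the number of layers, and features are short Python lists rather than tensors.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for t = 1..num_steps.

    A linear beta schedule is assumed here; the summary above does not state one.
    """
    schedule, running = [], 1.0
    for k in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * k / max(num_steps - 1, 1)
        running *= 1.0 - beta
        schedule.append(running)
    return schedule


class LinearDenoiser:
    """Hypothetical stand-in for D_i: predicts the noise as W @ x_t (t is ignored)."""

    def __init__(self, dim):
        self.weight = [[0.0] * dim for _ in range(dim)]

    def predict(self, x_t, t):
        return [sum(w * x for w, x in zip(row, x_t)) for row in self.weight]

    def sgd_step(self, x_t, t, eps, lr):
        """One gradient step on ||eps - D_i(x_t, t)||^2; returns the loss before the step."""
        residual = [e - p for e, p in zip(eps, self.predict(x_t, t))]
        for j, r in enumerate(residual):
            for k, x in enumerate(x_t):
                # d/dW[j][k] of sum_j r_j^2 is -2 * r_j * x_k, so descend by adding.
                self.weight[j][k] += lr * 2.0 * r * x
        return sum(r * r for r in residual)


def train_once(features, denoisers, alpha_bar, lr=0.01, rng=random):
    """One pass over layers i = N, ..., 1 (steps 2-7 of Algorithm 1)."""
    losses = {}
    for i in range(len(features), 0, -1):
        t = i  # step 3: diffusion step tied to the layer order
        f_i = features[i - 1]
        eps = [rng.gauss(0.0, 1.0) for _ in f_i]  # step 4: Gaussian noise
        a = alpha_bar[t - 1]
        # step 5: forward diffusion, x_t = sqrt(a) * F_i + sqrt(1 - a) * eps
        x_t = [math.sqrt(a) * f + math.sqrt(1.0 - a) * e for f, e in zip(f_i, eps)]
        losses[i] = denoisers[i - 1].sgd_step(x_t, t, eps, lr)  # step 6
    return losses


# Toy usage: three "layers" with four-dimensional features.
rng = random.Random(0)
num_layers, dim = 3, 4
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_layers)]
denoisers = [LinearDenoiser(dim) for _ in range(num_layers)]
losses = train_once(features, denoisers, alpha_bar_schedule(num_layers), rng=rng)
```

Calling `train_once` repeatedly until the losses stop decreasing corresponds to the outer `repeat ... until converged` loop. A real setup would replace the linear map with a learned network and the lists with batched feature tensors.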