Convolutional Visual Prompt for Robust Visual Perception

Authors: Yun-Yun Tsai, Chengzhi Mao, Junfeng Yang

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments and analysis on a wide variety of OOD visual perception tasks show that our approach is effective, improving robustness by up to 5.87% over several large-scale models." |
| Researcher Affiliation | Academia | Yun-Yun Tsai (Columbia University, yunyuntsai@cs.columbia.edu); Chengzhi Mao (Columbia University, mcz@cs.columbia.edu); Junfeng Yang (Columbia University, junfeng@columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Convolutional Visual Prompt; Algorithm 2: Low-Rank Visual Prompt. (A hedged sketch of Algorithm 1 follows the table.) |
| Open Source Code | No | The paper states "*We generate all types of corruption data based on the GitHub code: https://github.com/bethgelab/imagecorruptions*" (Section 4.1). This refers to code for generating the corruption datasets, not the CVP method itself, and there is no explicit statement about releasing the CVP code. |
| Open Datasets | Yes | "We evaluate our method on five kinds of OOD datasets, including CIFAR-10-C [21], ImageNet-C [46], ImageNet-R [19], ImageNet-Sketch [65], and ImageNet-A [23]." |
| Dataset Splits | No | The paper mentions "The choice of hyper-parameter setting on λ ranges and kernel ... which does not require a validation set." (Section 7.4). It details train and test set usage but does not specify a distinct validation split or its size, so the splits cannot be reproduced exactly. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components common to deep learning but gives no version numbers for Python, PyTorch, CUDA, or any other library or solver required for replication. |
| Experiment Setup | Yes | "For the training part of SSL model, we set the training parameters with batch size as 64, training epoch as 200, and the learning rate (lr) as 0.001. The lr is decayed with a cosine annealing for each batch [37]. ... For the test-time adaptation part, we set the range of parameter δ for VP. For the ℓ2-norm perturbations, the ϵ is [-8/255, 8/255] and the step size is 2/255. We set the iteration number i either as 1 or 5..." (A configuration sketch follows the table.) |
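To make the Pseudocode row concrete, below is a minimal PyTorch sketch of the test-time loop that Algorithm 1 (Convolutional Visual Prompt) describes: a small convolutional kernel, initialized to the identity, is optimized on each test batch to reduce a self-supervised loss before prediction. This is not the authors' code; `ssl_loss_fn`, the depthwise-kernel parameterization, and the signed-gradient update are illustrative assumptions, wired to the ϵ = 8/255 range and 2/255 step size quoted above.

```python
import torch
import torch.nn.functional as F

def cvp_adapt(x, ssl_loss_fn, kernel_size=3, steps=5, step_size=2/255, eps=8/255):
    """Sketch of convolutional visual prompting at test time (not the
    authors' implementation): optimize a small per-channel kernel on a
    test batch so a self-supervised loss on the prompted input decreases.

    x           -- (B, 3, H, W) test batch
    ssl_loss_fn -- assumed callable mapping images to a scalar SSL loss
    """
    # Identity-initialized depthwise kernel: the first prompted image
    # equals the input, and only the deviation `delta` is learned.
    identity = torch.zeros(3, 1, kernel_size, kernel_size)
    identity[:, 0, kernel_size // 2, kernel_size // 2] = 1.0
    delta = torch.zeros_like(identity, requires_grad=True)

    for _ in range(steps):  # the report quotes 1 or 5 iterations
        prompted = F.conv2d(x, identity + delta,
                            padding=kernel_size // 2, groups=3)
        loss = ssl_loss_fn(prompted)
        loss.backward()
        with torch.no_grad():
            # Signed-gradient step and projection, mirroring the quoted
            # eps = 8/255 range and 2/255 step size for the prompt.
            delta -= step_size * delta.grad.sign()
            delta.clamp_(-eps, eps)
            delta.grad = None

    with torch.no_grad():
        return F.conv2d(x, identity + delta,
                        padding=kernel_size // 2, groups=3)
```

Minimizing a self-supervised objective is what lets the prompt adapt without labels at test time; the projection keeps the prompted image close to the original input.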
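Likewise, the Experiment Setup row translates directly into a training configuration. The sketch below wires the quoted values (batch size 64, 200 epochs, lr 0.001, cosine annealing stepped per batch) into a standard PyTorch loop; `model`, `train_loader`, and `ssl_loss_fn` are placeholders, and the optimizer choice (SGD) is an assumption the excerpt does not specify.

```python
import torch

def train_ssl(model, train_loader, ssl_loss_fn, epochs=200, lr=1e-3):
    # Assumed optimizer; the quoted setup names only lr = 0.001.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    # Cosine annealing stepped once per batch over all epochs,
    # matching "decayed with a cosine annealing for each batch".
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer, T_max=epochs * len(train_loader))

    for _ in range(epochs):
        for images, _ in train_loader:  # batch size 64 set in the loader
            optimizer.zero_grad()
            ssl_loss_fn(model, images).backward()  # assumed SSL objective
            optimizer.step()
            scheduler.step()  # per-batch lr decay
```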