Towards Global Optimal Visual In-Context Learning Prompt Selection
Authors: Chengming Xu, Chen Liu, Yikai Wang, Yuan Yao, Yanwei Fu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of Partial2Global is validated through experiments on foreground segmentation, single object detection and image colorization, demonstrating that Partial2Global selects consistently better in-context examples compared with other methods, and thus establishes the new state-of-the-arts. |
| Researcher Affiliation | Collaboration | Chengming Xu1* Chen Liu2* Yikai Wang1 Yuan Yao2 Yanwei Fu1 1Fudan University 2Hong Kong University of Science and Technology {cmxu18, yanweifu}@fdu.edu.cn cliudh@connect.ust.hk yi-kai.wang@outlook.com yuany@ust.hk ... Dr. Xu is now with Tencent Youtulab. |
| Pseudocode | Yes | Algorithm 1 Consistency-aware ranking aggregator Input: Train set Xtrain, query sample xq, trained ranking models {ϕk}, alternative set size K. |
| Open Source Code | Yes | Code is available at https://github.com/chmxu/ranking_in_context.git. |
| Open Datasets | Yes | For foreground segmentation, Pascal-5i [19] is utilized which contains 4 data splits. ... For single object detection, Pascal VOC 2012 [5] is used. ... For colorization, we first sample 50000 training data from ILSVRC2012 [18] training set. |
| Dataset Splits | Yes | For foreground segmentation, Pascal-5i [19] is utilized which contains 4 data splits. ... For colorization, we first sample 50000 training data from ILSVRC2012 [18] training set. Then a test set randomly sampled from the validation set of ILSVRC2012 is used to test the model with mean squared error (MSE) as metric. |
| Hardware Specification | Yes | We utilize 4 V100 GPUs to cover all experiments. ... The training of list-wise ranker on the colorization task, which contains about 500000 ranking sequences, takes about 10 hours on 8 V100s. ... During inference on one V100 GPU... |
| Software Dependencies | No | The paper mentions using specific models like 'CLIP [17]' or 'DINO [4]' and optimizers like 'AdamW optimizer [12]', but it does not specify version numbers for these software components or any underlying machine learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python). |
| Experiment Setup | Yes | AdamW optimizer [12] is used with learning rate set as 5e-5 and batch size set as 64. ... Considering the training data size for each task, we adopt different sequence lengths for ranking. Specifically, we train rank-5 and rank-10 models for foreground segmentation and single object detection, while rank-3 and rank-5 models are trained for colorization. |
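The pseudocode entry above quotes the signature of Algorithm 1, a consistency-aware ranking aggregator that turns many length-K partial rankings from trained rankers {ϕk} into one global ranking over the train set. The sketch below is a minimal, hypothetical illustration of that aggregation step only: `rank_fn` stands in for a trained ranker, the subsets stand in for the sampled alternative sets of size K, and pairwise win counts (a Borda-style vote) stand in for the paper's actual consistency-aware aggregation, whose details are not reproduced here.

```python
from collections import defaultdict
import itertools

def aggregate_partial_rankings(subsets, rank_fn):
    """Fuse partial rankings of small candidate subsets into a global ranking.

    Hypothetical sketch of the aggregation idea in Partial2Global's
    Algorithm 1 (not the paper's exact procedure): `rank_fn(subset)`
    plays the role of a trained list-wise ranker phi_k and must return
    the subset ordered from best to worst. Each candidate earns one
    "win" per candidate ranked below it in a subset; summing wins over
    all subsets yields a Borda-style global ordering.
    """
    wins = defaultdict(int)
    seen = set()
    for subset in subsets:
        seen.update(subset)
        ranked = rank_fn(list(subset))
        for i, cand in enumerate(ranked):
            wins[cand] += len(ranked) - 1 - i  # wins over lower-ranked members
    return sorted(seen, key=lambda c: wins[c], reverse=True)

# Toy usage: candidate quality equals its value, so an oracle ranker
# sorts descending; covering all size-5 subsets recovers the true order.
cands = list(range(12))
oracle = lambda s: sorted(s, reverse=True)
windows = itertools.combinations(cands, 5)
print(aggregate_partial_rankings(windows, oracle)[:3])  # → [11, 10, 9]
```

With full subset coverage and a consistent ranker this vote is exact; with sampled subsets or noisy rankers the aggregate is only an estimate, which is where the paper's consistency-aware weighting would matter.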