Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Towards Global Optimal Visual In-Context Learning Prompt Selection
Authors: Chengming Xu, Chen Liu, Yikai Wang, Yuan Yao, Yanwei Fu
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The effectiveness of Partial2Global is validated through experiments on foreground segmentation, single object detection and image colorization, demonstrating that Partial2Global selects consistently better in-context examples compared with other methods, and thus establishes the new state-of-the-arts. |
| Researcher Affiliation | Collaboration | Chengming Xu¹*, Chen Liu²*, Yikai Wang¹, Yuan Yao², Yanwei Fu¹. ¹Fudan University, ²Hong Kong University of Science and Technology ... Dr. Xu is now with Tencent Youtulab. |
| Pseudocode | Yes | Algorithm 1 Consistency-aware ranking aggregator Input: Train set Xtrain, query sample xq, trained ranking models {ϕk}, alternative set size K. |
| Open Source Code | Yes | Code is available at https://github.com/chmxu/ranking_in_context.git. |
| Open Datasets | Yes | For foreground segmentation, Pascal-5i [19] is utilized which contains 4 data splits. ... For single object detection, Pascal VOC 2012 [5] is used. ... For colorization, we first sample 50000 training data from ILSVRC2012 [18] training set. |
| Dataset Splits | Yes | For foreground segmentation, Pascal-5i [19] is utilized which contains 4 data splits. ... For colorization, we first sample 50000 training data from ILSVRC2012 [18] training set. Then a test set randomly sampled from the validation set of ILSVRC2012 is used to test the model with mean squared error (MSE) as metric. |
| Hardware Specification | Yes | We utilize 4 V100 GPUs to cover all experiments. ... The training of list-wise ranker on the colorization task, which contains about 500000 ranking sequences, takes about 10 hours on 8 V100s. ... During inference on one V100 GPU... |
| Software Dependencies | No | The paper mentions using specific models like 'CLIP [17]' or 'DINO [4]' and optimizers like 'AdamW optimizer [12]', but it does not specify version numbers for these software components or any underlying machine learning frameworks (e.g., PyTorch, TensorFlow) or programming languages (e.g., Python). |
| Experiment Setup | Yes | AdamW optimizer [12] is used with learning rate set as 5e-5 and batch size set as 64. ... Considering the training data size for each task, we adopt different sequence length for ranking. Specifically, we train rank-5 and rank-10 models for foreground segmentation and single object detection, while rank-3 and rank-5 models are trained for colorization. |
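
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration object, which may help when trying to reproduce the reported runs. The sketch below is illustrative only: the class name, field names, and task keys are hypothetical and are not taken from the paper or its repository. Only the values (AdamW, learning rate 5e-5, batch size 64, and the per-task ranking lengths) come from the table above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class RankerTrainingConfig:
    """Reported training settings; all field names are hypothetical."""

    task: str
    ranking_lengths: Tuple[int, ...]  # sequence lengths of the trained rankers
    optimizer: str = "AdamW"          # optimizer named in the table
    learning_rate: float = 5e-5       # learning rate quoted in the table
    batch_size: int = 64              # batch size quoted in the table


# Ranking sequence lengths per task, as quoted in the Experiment Setup row.
# The task keys are illustrative labels, not identifiers from the codebase.
RANKING_LENGTHS: Dict[str, Tuple[int, ...]] = {
    "foreground_segmentation": (5, 10),
    "single_object_detection": (5, 10),
    "colorization": (3, 5),
}


def build_configs() -> Dict[str, RankerTrainingConfig]:
    """Create one config per task, sharing the optimizer settings."""
    return {
        task: RankerTrainingConfig(task=task, ranking_lengths=lengths)
        for task, lengths in RANKING_LENGTHS.items()
    }


CONFIGS = build_configs()
```

Settings that the table does not report, such as framework versions or the number of training epochs, are deliberately left out rather than guessed.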