Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models
Authors: Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Guannan Jiang, Xiaoshuai Sun, Rongrong Ji
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The results demonstrate that our method exhibits out-of-domain generalization and interpretability. |
| Researcher Affiliation | Collaboration | Mingrui Wu1, Xinyue Cai1, Jiayi Ji1, Jiale Li1, Oucheng Huang1, Gen Luo1, Hao Fei2, Guannan Jiang3, Xiaoshuai Sun1, Rongrong Ji1 1 Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Ministry of Education of China, Xiamen University, 361005, P.R. China 2 National University of Singapore 3 CATL |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of the approach, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | Code: https://github.com/mrwu-mac/ControlMLLM. |
| Open Datasets | Yes | We follow the setting of Ferret to form 1,748 questions (in which 1,548 for test and 200 for validation) based on LVIS [25] validation dataset, with corresponding box, mask, scribble and point. |
| Dataset Splits | Yes | We follow the setting of Ferret to form 1,748 questions (in which 1,548 for test and 200 for validation) based on LVIS [25] validation dataset, with corresponding box, mask, scribble and point. |
| Hardware Specification | Yes | All experiments are conducted on two RTX 3090 GPUs with 24 GB of memory each. |
| Software Dependencies | No | The paper mentions using 'LLaVA-v1.5-7B [35]' as the MLLM, but it does not specify software dependencies such as Python, PyTorch, or CUDA versions. |
| Experiment Setup | Yes | Unless explicitly stated otherwise, the MLLM we use is LLaVA-v1.5-7B [35], T=5, α=400 and β=0.5. |