Visual Attention Prompted Prediction and Learning
Authors: Yifei Zhang, Bo Pan, Siyi Gu, Guangji Bai, Meikang Qiu, Xiaofeng Yang, Liang Zhao
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four datasets demonstrate the effectiveness of our proposed framework in enhancing predictions for samples both with and without prompt. |
| Researcher Affiliation | Academia | 1Emory University 2Stanford University 3Augusta University {yifei.zhang2, bo.pan, guangji.bai, xyang43, liang.zhao}@emory.edu, sgu33@stanford.edu, qiumeikang@yahoo.com |
| Pseudocode | Yes | Algorithm 1 Alternating Training |
| Open Source Code | Yes | Code and tools are available at https://github.com/yifeizhangcs/visual-attention-prompt |
| Open Datasets | Yes | We employed four datasets: two from real-world scenarios, sourced from MS COCO [Lin et al., 2014], and two from the medical field, namely LIDC-IDRI (LIDC) [Armato III et al., 2011] and the Pancreas dataset [Roth et al., 2015]. |
| Dataset Splits | Yes | The final dataset included 2625 nodules and 65505 non-nodules images, split into 100/1200/1200 for training, validation, and testing to reflect limited access to human explanations. ... Data was split into 30/30/rest for training, validation, and testing, maintaining class balance. |
| Hardware Specification | Yes | Regarding computational resources, all experiments were executed using an NVIDIA GTX 3090 GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The experimental setup was standardized with a batch size of 16, and the number of perturbed masks was set to 5000. Furthermore, a pixel conversion probability of 0.1 was established. The training was conducted over 10 epochs, each comprising 5 iterations for the alternating updating phase, effectively resulting in 50 training epochs for each model. The Adam optimization algorithm [Kingma and Ba, 2014] was utilized with a learning rate of 0.0001. |
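
The Experiment Setup row above mentions 5,000 perturbed masks with a pixel conversion probability of 0.1. Below is a minimal sketch of how such masks could be generated; the RISE-style interpretation (each cell of a coarse grid is switched on independently with probability 0.1 and then upsampled to image resolution), the function name, and the grid/image sizes are all assumptions, since the paper's exact masking procedure is not quoted here.

```python
import torch
import torch.nn.functional as F

def generate_perturbation_masks(num_masks=5000, grid_size=7,
                                image_size=224, p=0.1):
    """Generate binary perturbation masks (assumption: RISE-style).

    Each cell of a coarse grid is independently set to 1 with
    probability `p` (the reported "pixel conversion probability"),
    then the grid is bilinearly upsampled to the image resolution.
    """
    # Coarse binary grids: shape (num_masks, 1, grid_size, grid_size)
    grids = (torch.rand(num_masks, 1, grid_size, grid_size) < p).float()
    # Upsample to image resolution to obtain soft masks in [0, 1]
    masks = F.interpolate(grids, size=(image_size, image_size),
                          mode="bilinear", align_corners=False)
    return masks  # (num_masks, 1, image_size, image_size)

# Example: masks = generate_perturbation_masks()  # 5000 x 1 x 224 x 224
```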
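The same row specifies Adam with a learning rate of 0.0001, a batch size of 16, and 10 epochs of 5 alternating-update iterations each (50 effective training epochs). The sketch below only illustrates how that schedule could be wired; the two components, their data loaders, and the loss functions are hypothetical placeholders, since the actual alternating objectives are defined in the paper's Algorithm 1.

```python
import torch

def alternating_training(model_a, model_b, loader_a, loader_b,
                         loss_a, loss_b,
                         epochs=10, alt_iters=5, lr=1e-4):
    """Schedule sketch: 10 epochs x 5 alternating iterations = 50 passes.

    `model_a`/`model_b`, their loaders, and `loss_a`/`loss_b` stand in
    for the two components alternated in the paper's Algorithm 1.
    """
    opt_a = torch.optim.Adam(model_a.parameters(), lr=lr)
    opt_b = torch.optim.Adam(model_b.parameters(), lr=lr)

    for epoch in range(epochs):
        for _ in range(alt_iters):
            # Step 1: update the first component while the second is held fixed
            for x, y in loader_a:
                opt_a.zero_grad()
                loss_a(model_a, model_b, x, y).backward()
                opt_a.step()
            # Step 2: update the second component while the first is held fixed
            for x, y in loader_b:
                opt_b.zero_grad()
                loss_b(model_a, model_b, x, y).backward()
                opt_b.step()
```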