CIC: A Framework for Culturally-Aware Image Captioning
Authors: Youngsik Yun, Jihie Kim
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our human evaluation conducted on 45 participants from 4 different cultural groups with a high understanding of the corresponding culture shows that our proposed framework generates more culturally descriptive captions when compared to the image captioning baseline based on VLPs. Resources can be found at https://shane3606.github.io/cic. |
| Researcher Affiliation | Academia | Youngsik Yun (Department of Computer Science and Artificial Intelligence, Dongguk University) and Jihie Kim (Division of AI Software Convergence, Dongguk University) |
| Pseudocode | No | The paper describes the method using textual descriptions and a diagram, but it does not include pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | Resources can be found at https://shane3606.github.io/cic. |
| Open Datasets | Yes | We validated our framework using GD-VCR [Yin et al., 2021], a multiple-choice QA testing set designed to evaluate the ability of multi-modal models to understand geo-diverse commonsense knowledge. |
| Dataset Splits | Yes | We validated our framework using GD-VCR [Yin et al., 2021], a multiple-choice QA testing set designed to evaluate the ability of multi-modal models to understand geo-diverse commonsense knowledge. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions ChatGPT and BLIP-2 but does not specify their version numbers or other software dependencies with explicit versions. |
| Experiment Setup | Yes | The temperature is set to 0.6, and the maximum length for caption generation is 100. |
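The only decoding hyperparameters the paper reports are a sampling temperature of 0.6 and a 100-token cap on caption length. A minimal sketch of how those two values might be bundled into keyword arguments for a typical `model.generate(...)` call; the helper function and the `do_sample` flag are assumptions for illustration, not from the paper:

```python
# Sketch of the reported decoding settings (temperature 0.6, max length 100).
# The helper below and the do_sample flag are hypothetical; the paper only
# states the two hyperparameter values.

def caption_generation_kwargs(temperature: float = 0.6,
                              max_length: int = 100) -> dict:
    """Bundle the decoding settings reported in the paper into keyword
    arguments for a typical Hugging Face-style `model.generate(...)` call."""
    return {
        "do_sample": True,         # temperature only takes effect when sampling
        "temperature": temperature,
        "max_length": max_length,  # caption length limit reported in the paper
    }

print(caption_generation_kwargs())
```

This keeps the reported values in one place, so a reproduction attempt can pass them unchanged to whichever captioning model (e.g. BLIP-2) is being evaluated.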