Image Content Generation with Causal Reasoning
Authors: Xiaochuan Li, Baoyu Fan, Runze Zhang, Liang Jin, Di Wang, Zhenhua Guo, Yaqian Zhao, Rengang Li
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we perform extensive experiments and analyses, including visualizations of the generated content and discussions on the potentials and limitations. |
| Researcher Affiliation | Collaboration | 1Inspur Electronic Information Industry Co.,Ltd. 2Nankai University 3Tsinghua University 4Shandong Massive Information Technology Research Institute |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | Yes | The code and data are publicly available under the license of CC BY-NC-SA 4.0 for academic and non-commercial usage at: https://github.com/IEIT-AGI/MIX-Shannon/blob/main/projects/VQAI/lgd_vqai.md. |
| Open Datasets | Yes | Hence, we propose a new image generation task called visual question answering with image (VQAI) and establish a dataset of the same name based on the classic Tom and Jerry animated series. [...] The code and data are publicly available under the license of CC BY-NC-SA 4.0 for academic and non-commercial usage at: https://github.com/IEIT-AGI/MIX-Shannon/blob/main/projects/VQAI/lgd_vqai.md. |
| Dataset Splits | Yes | In the dataset, we divided 17,524 samples into 15,524, 1,000 and 1,000, corresponding to the training, validation and testing sets. |
| Hardware Specification | Yes | All experiments are run on an A100 ×8 server. |
| Software Dependencies | No | The paper mentions models used ('T5-XXL', 'stable diffusion', 'Flan-T5-XXL') but does not provide specific version numbers for any ancillary software libraries or frameworks like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | All initial learning rates are set to 3e-5. In the comparison experiments, we use ADAM (Kingma and Ba 2014) as the optimizer. We set the batch size to 16 and the epoch to 20. |
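The dataset-split and experiment-setup rows above fully determine a basic training loop. The following is a minimal, hypothetical sketch of that reported configuration (Adam optimizer, learning rate 3e-5, batch size 16, 20 epochs, 15,524/1,000/1,000 split); the tiny linear model, synthetic tensors, and MSE objective are placeholders standing in for the authors' VQAI generation pipeline, not their released code.

```python
# Hedged sketch of the reported training configuration.
# Everything marked "placeholder" is an assumption for illustration only.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset, random_split

# Synthetic stand-in for the 17,524-sample VQAI dataset.
features = torch.randn(17524, 32)
targets = torch.randn(17524, 32)
full_dataset = TensorDataset(features, targets)

# Split sizes as reported in the paper: train / validation / test.
train_set, val_set, test_set = random_split(full_dataset, [15524, 1000, 1000])
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

model = nn.Linear(32, 32)                        # placeholder for the generator
optimizer = torch.optim.Adam(model.parameters(), lr=3e-5)
criterion = nn.MSELoss()                         # placeholder objective

for epoch in range(20):                          # 20 epochs, as reported
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
```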