Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Image Content Generation with Causal Reasoning
Authors: Xiaochuan Li, Baoyu Fan, Runze Zhang, Liang Jin, Di Wang, Zhenhua Guo, Yaqian Zhao, Rengang Li
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we perform extensive experiments and analyses, including visualizations of the generated content and discussions on the potentials and limitations. |
| Researcher Affiliation | Collaboration | 1Inspur Electronic Information Industry Co.,Ltd. 2Nankai University 3Tsinghua University 4Shandong Massive Information Technology Research Institute |
| Pseudocode | No | No pseudocode or clearly labeled algorithm blocks were found in the paper. |
| Open Source Code | Yes | The code and data are publicly available under the license of CC BY-NC-SA 4.0 for academic and non-commercial usage at: https://github.com/IEIT-AGI/MIX-Shannon/blob/main/projects/VQAI/lgd_vqai.md. |
| Open Datasets | Yes | Hence, we propose a new image generation task called visual question answering with image (VQAI) and establish a dataset of the same name based on the classic Tom and Jerry animated series. [...] The code and data are publicly available under the license of CC BY-NC-SA 4.0 for academic and non-commercial usage at: https://github.com/IEIT-AGI/MIX-Shannon/blob/main/projects/VQAI/lgd_vqai.md. |
| Dataset Splits | Yes | In the dataset, we divided 17,524 samples into 15,524, 1,000 and 1,000, corresponding to the training, validation and testing sets. |
| Hardware Specification | Yes | All experiments are run on an A100×8 server. |
| Software Dependencies | No | The paper mentions models used ('T5-XXL', 'stable diffusion', 'Flan-T5-XXL') but does not provide specific version numbers for any ancillary software libraries or frameworks like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | All initial learning rates are set to 3e-5. In the comparison experiments, we use ADAM (Kingma and Ba 2014) as the optimizer. We set the batch size to 16 and the epoch to 20. |
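
The reported split sizes and training hyperparameters from the table above can be collected into a small configuration object for anyone attempting a reproduction. This is a minimal sketch, not the authors' code: the class name `ReportedSetup`, its field names, and the `steps_per_epoch` helper are hypothetical, and the helper assumes the batch size of 16 is the global batch size and that the final partial batch is kept, neither of which the quoted text confirms.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the table above; all names here are illustrative."""

    learning_rate: float = 3e-5   # "All initial learning rates are set to 3e-5"
    optimizer: str = "adam"       # ADAM (Kingma and Ba 2014)
    batch_size: int = 16
    epochs: int = 20
    train_size: int = 15_524
    val_size: int = 1_000
    test_size: int = 1_000

    def total_samples(self) -> int:
        # Sum of the three reported splits.
        return self.train_size + self.val_size + self.test_size

    def steps_per_epoch(self) -> int:
        # Ceiling division. Assumption: 16 is the global batch size and
        # the last, smaller batch is not dropped.
        return -(-self.train_size // self.batch_size)


setup = ReportedSetup()
print(setup.total_samples())    # should match the 17,524 samples reported
print(setup.steps_per_epoch())
```

The sum of the three splits is a quick consistency check against the 17,524 total quoted under Dataset Splits. Software versions are not reported in the paper, so none are pinned here.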