Text Image Inpainting via Global Structure-Guided Diffusion Models

Authors: Shipeng Zhu, Pengfei Fang, Chenjie Zhu, Zuoyan Zhao, Qiang Xu, Hui Xue

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The efficacy of our approach is demonstrated by a thorough empirical study, including a substantial boost in both recognition accuracy and image quality.
Researcher Affiliation | Academia | School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China. {shipengzhu, fangpengfei, chenjiezhu, zuoyanzhao, 220232307, hxue}@seu.edu.cn
Pseudocode | No | The paper describes the model architecture and training procedures textually and with diagrams, but does not provide formal pseudocode blocks or algorithms.
Open Source Code | Yes | Code and datasets are available at: https://github.com/blackprotoss/GSDM.
Open Datasets | Yes | Code and datasets are available at: https://github.com/blackprotoss/GSDM. For handwritten text, the TII-HT dataset comprises 40,078 images from the IAM dataset (Marti and Bunke 2002).
Dataset Splits | Yes | For fairness in evaluation, we divide our proposed datasets into distinct training and testing sets. In the TII-ST dataset... our training set consists of 80,000 synthesized images and 4,877 real images. Meanwhile, the testing set includes 1,599 real images. For the TII-HT dataset, the training set comprises 38,578 images sourced from IAM, while the testing set contains 1,600 images. (A hypothetical split-layout sketch follows the table.)
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide version numbers for any software dependencies or libraries used (e.g., Python or PyTorch versions).
Experiment Setup | No | While image resizing to 64x256 is mentioned as a preprocessing step, the paper does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed training configurations. (A hedged resizing sketch follows the table.)
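The split counts reported under Dataset Splits can be sanity-checked against the released data. The directory layout, file extensions, and the load_split helper below are hypothetical illustrations for this report, not the structure of the authors' GitHub release; treat this as a minimal sketch under those assumptions.

import os
from dataclasses import dataclass
from typing import List

# Hypothetical on-disk layout (not taken from the GSDM repository):
# data/
#   TII-ST/train/   -> 80,000 synthesized + 4,877 real images
#   TII-ST/test/    ->  1,599 real images
#   TII-HT/train/   -> 38,578 images sourced from IAM
#   TII-HT/test/    ->  1,600 images

@dataclass
class Split:
    name: str
    image_paths: List[str]

def load_split(root: str, dataset: str, split: str) -> Split:
    """Collect image paths for one dataset split, e.g. TII-ST / test."""
    split_dir = os.path.join(root, dataset, split)
    exts = (".png", ".jpg", ".jpeg")
    paths = sorted(
        os.path.join(split_dir, f)
        for f in os.listdir(split_dir)
        if f.lower().endswith(exts)
    )
    return Split(name=f"{dataset}/{split}", image_paths=paths)

if __name__ == "__main__":
    # Counts reported in the paper, used here only as a cross-check.
    expected = {("TII-ST", "test"): 1599, ("TII-HT", "train"): 38578, ("TII-HT", "test"): 1600}
    for (dataset, split), count in expected.items():
        s = load_split("data", dataset, split)
        print(f"{s.name}: found {len(s.image_paths)} images (paper reports {count})")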
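For the 64x256 resizing mentioned under Experiment Setup, the following sketch shows one plausible preprocessing step using Pillow. It assumes 64 is the height and 256 the width and uses bilinear resampling; neither detail is specified in the paper, and the function name and file paths are hypothetical.

from PIL import Image

def preprocess(path: str, height: int = 64, width: int = 256) -> Image.Image:
    """Load a text image and resize it to the 64x256 resolution noted in the paper.

    The (height, width) interpretation of 64x256 and the bilinear filter are
    assumptions; the paper does not specify either.
    """
    img = Image.open(path).convert("RGB")
    # PIL's resize expects (width, height).
    return img.resize((width, height), resample=Image.BILINEAR)

# Example usage with a hypothetical file path:
# resized = preprocess("data/TII-ST/test/sample_0001.png")
# resized.save("sample_0001_64x256.png")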