Text Image Inpainting via Global Structure-Guided Diffusion Models
Authors: Shipeng Zhu, Pengfei Fang, Chenjie Zhu, Zuoyan Zhao, Qiang Xu, Hui Xue
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The efficacy of our approach is demonstrated by thorough empirical study, including a substantial boost in both recognition accuracy and image quality. |
| Researcher Affiliation | Academia | 1School of Computer Science and Engineering, Southeast University, Nanjing 210096, China 2Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China {shipengzhu, fangpengfei, chenjiezhu, zuoyanzhao, 220232307, hxue}@seu.edu.cn |
| Pseudocode | No | The paper describes the model architecture and training procedures textually and with diagrams, but does not provide formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code and datasets are available at: https://github.com/blackprotoss/GSDM. |
| Open Datasets | Yes | Code and datasets are available at: https://github.com/blackprotoss/GSDM. For handwritten text, the TII-HT dataset comprises 40,078 images from the IAM dataset (Marti and Bunke 2002). |
| Dataset Splits | Yes | For fairness in evaluation, we divide our proposed datasets into distinct training and testing sets. In the TII-ST dataset... our training set consists of 80,000 synthesized images and 4,877 real images. Meanwhile, the testing set includes 1,599 real images. For the TII-HT dataset, the training set comprises 38,578 images sourced from IAM, while the testing set contains 1,600 images. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used (e.g., Python, PyTorch versions). |
| Experiment Setup | No | While image resizing (64x256) is mentioned as a processing step, the paper does not provide specific hyperparameters (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed training configurations. |
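The only preprocessing detail the paper states is resizing inputs to 64x256 (height x width). A minimal sketch of that step, assuming Pillow is used; the interpolation filter and the `preprocess` helper name are assumptions, not taken from the paper:

```python
from PIL import Image

def preprocess(img: Image.Image) -> Image.Image:
    """Resize a text image to the 64x256 (height x width) input size
    mentioned in the paper. PIL expects (width, height), and the
    choice of bicubic interpolation is an assumption."""
    return img.convert("RGB").resize((256, 64), Image.BICUBIC)
```

A helper like this would be applied to every training and testing image before it enters the model.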