Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
Authors: Seonghyeon Nam, Yunji Kim, Seon Joo Kim
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method outperforms existing methods on CUB and Oxford-102 datasets, and our results were mostly preferred on a user study. |
| Researcher Affiliation | Academia | Seonghyeon Nam, Yunji Kim, and Seon Joo Kim, Yonsei University ({shnnam, kim_yunji, seonjookim}@yonsei.ac.kr) |
| Pseudocode | No | The paper describes the model architecture and training process in prose and with diagrams, but it does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We evaluated our method on CUB dataset [18] and Oxford-102 dataset [19], which are well-known public datasets. |
| Dataset Splits | No | The paper states that experiments were conducted on CUB and Oxford-102 datasets and mentions using a 'test set', but it does not provide specific details on the train, validation, and test splits (e.g., percentages or exact sample counts) or explicitly refer to a predefined standard split. |
| Hardware Specification | No | The paper mentions using PyTorch for implementation but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using PyTorch, fastText word vectors, and Adam optimizer, but it does not specify any version numbers for these software components. |
| Experiment Setup | Yes | We trained our network for 600 epochs using the Adam optimizer [29] with a learning rate of 0.0002, a momentum of 0.5, and a batch size of 64. We also decreased the learning rate by 0.5 every 100 epochs. For data augmentation, we used random cropping, flipping, and rotation. We resized images to 136×136 and randomly cropped 128×128 patches. The random rotation ranged from -10 to 10 degrees. We set λ₁ and λ₂ to 10 and 2, respectively, considering both the visual quality and the training stability. (Hedged PyTorch sketches of this setup follow the table.) |
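The optimization recipe quoted in the Experiment Setup row is concrete enough to sketch in PyTorch, the framework the paper reports using. The sketch below is a minimal reconstruction, not the authors' code: the generator and discriminator stand-ins (`G`, `D`) are hypothetical placeholders for the TAGAN architecture, "momentum of 0.5" is read as Adam's beta1 in the usual GAN convention, and beta2 = 0.999 is an assumed default the paper does not state.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the paper's generator and discriminator;
# the actual TAGAN architecture is only described in the paper's figures.
G = nn.Linear(100, 128 * 128 * 3)
D = nn.Linear(128 * 128 * 3, 1)

# Adam with lr = 0.0002; "momentum of 0.5" is interpreted as beta1 = 0.5.
# beta2 = 0.999 is PyTorch's default and an assumption here.
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))

# "Decreased the learning rate by 0.5 every 100 epochs" over 600 epochs.
sched_G = torch.optim.lr_scheduler.StepLR(opt_G, step_size=100, gamma=0.5)
sched_D = torch.optim.lr_scheduler.StepLR(opt_D, step_size=100, gamma=0.5)

for epoch in range(600):
    # ... one training epoch with batch size 64 would go here ...
    sched_G.step()
    sched_D.step()
```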
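The reported augmentation (resize to 136×136, random 128×128 crop, flipping, rotation in [-10, 10] degrees) maps naturally onto torchvision transforms. This is a sketch under stated assumptions: the transform order, interpolation mode, and normalization to [-1, 1] are guesses, since the paper specifies only the operations themselves.

```python
import torchvision.transforms as T

# Assumed training-time pipeline; `train_transform` is a hypothetical name.
train_transform = T.Compose([
    T.Resize((136, 136)),            # resize images to 136x136
    T.RandomRotation(degrees=10),    # rotation sampled from [-10, 10] degrees
    T.RandomCrop(128),               # random 128x128 patch
    T.RandomHorizontalFlip(p=0.5),   # random flipping
    T.ToTensor(),
    T.Normalize(mean=[0.5] * 3, std=[0.5] * 3),  # common GAN scaling to [-1, 1]
])
```

Rotating before cropping is a deliberate ordering choice in this sketch: the 128×128 crop taken from the rotated 136×136 image discards some of the border fill the rotation introduces.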