Draft-and-Revise: Effective Image Generation with Contextual RQ-Transformer
Authors: Doyup Lee, Chiheon Kim, Saehoon Kim, Minsu Cho, Wook-Shin Han
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In experiments, our method achieves state-of-the-art results on conditional image generation. We also validate that the Draft-and-Revise decoding can achieve high performance by effectively controlling the quality-diversity trade-off in image generation. |
| Researcher Affiliation | Collaboration | Doyup Lee Kakao Brain doyup.lee@kakaobrain.com Chiheon Kim Kakao Brain chiheon.kim@kakaobrain.com Saehoon Kim Kakao Brain shkim@kakaobrain.com Minsu Cho POSTECH mscho@postech.ac.kr Wook-Shin Han POSTECH wshan@dblab.postech.ac.kr |
| Pseudocode | Yes | Algorithm 1 UPDATE of S Algorithm 2 Draft-and-Revise decoding |
| Open Source Code | No | The paper does not provide an unambiguous statement of code release or a direct link to a source code repository for the methodology described. |
| Open Datasets | Yes | We train Contextual RQ-Transformer with 333M, 821M, and 1.4B parameters on ImageNet [7] for class-conditional image generation. ... We train our model with 333M and 654M parameters on CC-3M [32] for text-to-image (T2I) generation... |
| Dataset Splits | No | While Table 1 includes a row for 'Validation Data', the paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) needed to reproduce the data partitioning for training, validation, and testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers like Python 3.8, CPLEX 12.4) needed to replicate the experiment. |
| Experiment Setup | Yes | For Draft-and-Revise decoding, we use Tdraft = 64, Trevise = 2, and M = 2. We use temperature scaling [19] and classifier-free guidance [20] only in the revise phase, while none of the strategies are applied in the draft phase. |
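The reported setup (Tdraft = 64, Trevise = 2, M = 2, with temperature scaling and classifier-free guidance only in the revise phase) can be illustrated with a minimal sketch of the Draft-and-Revise control flow. This is an assumption-laden illustration, not the authors' implementation: `sample_fn` is a hypothetical stand-in for the Contextual RQ-Transformer's conditional sampler, and the subset-selection heuristics are placeholders.

```python
import random


def draft_and_revise(seq_len, sample_fn, t_draft=64, t_revise=2, m=2,
                     revise_temperature=0.8):
    """Sketch of Draft-and-Revise decoding (hypothetical interface).

    sample_fn(tokens, positions, temperature) -> {position: token} stands in
    for the model's conditional sampler over code positions.
    """
    tokens = [None] * seq_len

    # Draft phase: fill all positions over t_draft steps in random order;
    # per the paper, no temperature scaling or guidance is applied here.
    order = list(range(seq_len))
    random.shuffle(order)
    step = max(1, -(-seq_len // t_draft))  # ceiling division
    for i in range(0, seq_len, step):
        batch = order[i:i + step]
        for pos, tok in sample_fn(tokens, batch, temperature=1.0).items():
            tokens[pos] = tok

    # Revise phase: M rounds of Trevise update sweeps; each sweep resamples
    # a random subset conditioned on the rest of the sequence. Temperature
    # scaling (and, in the real model, classifier-free guidance) is applied
    # only in this phase.
    for _ in range(m):
        for _ in range(t_revise):
            subset = random.sample(range(seq_len),
                                   k=max(1, seq_len // t_revise))
            for pos, tok in sample_fn(tokens, subset,
                                      temperature=revise_temperature).items():
                tokens[pos] = tok
    return tokens
```

The draft phase trades speed for a rough global layout, and the revise sweeps then trade diversity for quality, which is how the decoding scheme controls the quality-diversity trade-off noted in the Research Type row.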