Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Locally Hierarchical Auto-Regressive Modeling for Image Generation
Authors: Tackgeun You, Saehoon Kim, Chiheon Kim, Doyup Lee, Bohyung Han
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our models, referred to as HQ-TVAE hereafter, on class and text-conditional image generation tasks. |
| Researcher Affiliation | Collaboration | Tackgeun You (3,5), Saehoon Kim (4), Chiheon Kim (4), Doyup Lee (4), Bohyung Han (1,2,3). Affiliations: (1) ECE, (2) IPAI, (3) AIIS, Seoul National University, Korea; (4) Kakao Brain, Korea; (5) CSE, POSTECH, Korea. |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] We report in both the paper and the supplementary material. |
| Open Datasets | Yes | We train our models on 1.2M images in the train split of ImageNet [29] for class-conditional image generation. For text-conditional tasks, we employ 15M image-text pairs in Conceptual Caption (CC) [30] and Conceptual-12M [31]. |
| Dataset Splits | Yes | We train our models on 1.2M images in the train split of ImageNet [29] for class-conditional image generation. For text-conditional tasks, we employ 15M image-text pairs in Conceptual Caption (CC) [30] and Conceptual-12M [31]. ... Table 2: Text-conditional image generation performance on the CC3M validation set. ... Table 3: Comparison of image reconstruction quality on the ImageNet validation set. |
| Hardware Specification | Yes | We measure the throughput of sample generation on a single Tesla A100 GPU. |
| Software Dependencies | No | The paper describes network structures and normalization techniques but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The scaling factor in HQ-VAE is set to 2, i.e., r = 2, to produce the top and bottom codes of 8×8 and 16×16 resolution by default, respectively. We test three versions of HQ-Transformer by varying hyperparameters: (a) N_MT = 12 and N_PHT = 4 for the smallest model, (b) N_MT = 24 and N_PHT = 4 for the mid-size model, and (c) N_MT = 42 and N_PHT = 6 for our largest model. In all cases, N_IET = 1 and other parameters related to the network size are fixed. |
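
For readers who want the Experiment Setup row in machine-readable form, the following is a minimal sketch that records the three quoted hyperparameter settings and the default code resolutions. The class name, field names, and helper functions are hypothetical and do not come from the paper or its released code; the relation "bottom resolution = top resolution × r" is an assumption inferred from the quoted 8×8 and 16×16 values with r = 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HQTransformerSetting:
    """Hypothetical record of one quoted setting (names are illustrative)."""
    label: str
    n_mt: int       # first quoted hyperparameter (12, 24, or 42)
    n_pht: int      # second quoted hyperparameter (4 or 6)
    n_iet: int = 1  # quoted as 1 in all cases


# Values quoted in the Experiment Setup row.
SCALING_FACTOR = 2   # r = 2
TOP_RESOLUTION = 8   # top codes are 8x8 by default

SETTINGS = [
    HQTransformerSetting("smallest", n_mt=12, n_pht=4),
    HQTransformerSetting("mid-size", n_mt=24, n_pht=4),
    HQTransformerSetting("largest", n_mt=42, n_pht=6),
]


def bottom_resolution(top_res: int, r: int) -> int:
    """Assumed relation: the bottom grid is r times larger per side."""
    return top_res * r


def codes_per_image(top_res: int, r: int) -> int:
    """Total positions across the top and bottom code grids (derived)."""
    bottom = bottom_resolution(top_res, r)
    return top_res * top_res + bottom * bottom


if __name__ == "__main__":
    side = bottom_resolution(TOP_RESOLUTION, SCALING_FACTOR)
    print(f"bottom codes: {side}x{side}")
    for s in SETTINGS:
        print(s)
```

Keeping the settings in a frozen dataclass makes it harder to alter a reported value by accident when the record is reused in comparison scripts.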