Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching

Authors: Sucheng Ren, Qihang Yu, Ju He, Xiaohui Shen, Alan Yuille, Liang-Chieh Chen

ICML 2025

Reproducibility Variable Result LLM Response
Research Type Experimental We validate the effectiveness of FlowAR on the challenging ImageNet-256 benchmark, demonstrating superior generation performance compared to previous methods. Code is available at https://github.com/OliverRensu/FlowAR. ... We demonstrate FlowAR's effectiveness on the challenging ImageNet benchmark (Deng et al., 2009), where it achieves state-of-the-art results. ... In this section, we present our main results on the challenging ImageNet-256 and ImageNet-512 generation benchmarks (Deng et al., 2009) (Sec. 4.1), followed by ablation studies (Sec. 4.2).
Researcher Affiliation Collaboration Sucheng Ren 1, Qihang Yu 2, Ju He 2, Xiaohui Shen 2, Alan Yuille 1, Liang-Chieh Chen 2 — 1 Johns Hopkins University, 2 ByteDance
Pseudocode No The paper describes the methodology in prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code Yes Code is available at https://github.com/OliverRensu/FlowAR.
Open Datasets Yes We validate the effectiveness of FlowAR on the challenging ImageNet-256 benchmark... We demonstrate FlowAR's effectiveness on the challenging ImageNet benchmark (Deng et al., 2009)
Dataset Splits No The paper uses the ImageNet benchmark for class-conditional image generation and mentions 'Following the settings in VAR (Tian et al., 2024)', but does not explicitly provide details about training, validation, or test dataset splits, nor does it specify using standard ImageNet splits.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments.
Software Dependencies No The paper mentions general software components like 'Transformer architectures (Vaswani, 2017)' and the 'AdamW' optimizer but does not provide specific software dependencies with version numbers (e.g., library names with versions).
Experiment Setup Yes We list the hyper-parameters of our FlowAR in Table 6: optimizer AdamW; warmup epochs 100; total epochs 400; batch size 1024; peak learning rate 2e-4; minimal learning rate 1e-5; learning rate schedule cosine; class label dropout rate 0.1; max gradient norm 1.0. In Table 7, we provide four kinds of different configurations of FlowAR for a fair comparison under similar parameters with VAR (Tian et al., 2024). The proposed FlowAR contains two main modules: Autoregressive Model and Flow Matching Model, both built on top of Transformer architectures (Vaswani, 2017).
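The quoted training schedule (warmup followed by cosine decay between the peak and minimum learning rates) can be sketched as a small helper. This is a hedged illustration of the reported hyper-parameters, not the authors' code; the constant names are assumptions, and the actual FlowAR training loop is not reproduced here.

```python
import math

# Hyper-parameters as quoted from the paper's Table 6 (names are my own).
WARMUP_EPOCHS = 100
TOTAL_EPOCHS = 400
PEAK_LR = 2e-4
MIN_LR = 1e-5

def lr_at(epoch: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward MIN_LR.

    This is one common reading of 'warmup epochs 100, cosine schedule,
    peak 2e-4, minimal 1e-5'; the paper does not spell out the exact formula.
    """
    if epoch < WARMUP_EPOCHS:
        # Linear warmup over the first WARMUP_EPOCHS epochs.
        return PEAK_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay from PEAK_LR at epoch 100 to MIN_LR at epoch 400.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))
```

Per the table, each step would also use batch size 1024, class-label dropout 0.1, and gradient-norm clipping at 1.0.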