Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
U-DiTs: Downsample Tokens in U-Shaped Diffusion Transformers
Authors: Yuchuan Tian, Zhijun Tu, Hanting Chen, Jie Hu, Chao Xu, Yunhe Wang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to demonstrate the extraordinary performance of U-DiT models. |
| Researcher Affiliation | Collaboration | 1 State Key Lab of General AI, School of Intelligence Science and Technology, Peking University. 2 Huawei Noah's Ark Lab. |
| Pseudocode | No | The paper describes methods in prose and includes architectural diagrams (Figure 3) but does not contain a formal pseudocode or algorithm block. |
| Open Source Code | Yes | Codes are available at https://github.com/YuchuanTian/U-DiT. |
| Open Datasets | Yes | The training is conducted with the training set of ImageNet 2012 [12]. |
| Dataset Splits | No | The paper states 'The training is conducted with the training set of ImageNet 2012 [12]' and evaluates on 'ImageNet 256×256' and 'ImageNet 512×512', but does not explicitly provide percentages or sample counts for training, validation, and test splits. |
| Hardware Specification | Yes | We used 8 NVIDIA A100s (80G) to train U-DiT-B and U-DiT-L models. |
| Software Dependencies | No | The paper mentions using 'sd-vae-ft-ema', 'AdamW optimizer', 'MindSpore', 'CANN', and 'Ascend AI Processor', but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | The same VAE (i.e. sd-vae-ft-ema) for latent diffusion models [29] and the AdamW optimizer is adopted. The training hyperparameters are kept unchanged, including global batch size 256, learning rate 1e-4, weight decay 0, and global seed 0. |
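
The hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration object. The following is a minimal sketch, not the authors' code: the class name `TrainConfig`, its field names, and the `per_device_batch_size` helper are hypothetical, and the even split of the global batch across the 8 devices listed under Hardware Specification is an assumption that the table does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the values quoted in the table above."""

    vae_name: str = "sd-vae-ft-ema"  # VAE named in the Experiment Setup row
    optimizer: str = "AdamW"         # optimizer named in the Experiment Setup row
    global_batch_size: int = 256
    learning_rate: float = 1e-4
    weight_decay: float = 0.0
    global_seed: int = 0
    num_devices: int = 8             # from the Hardware Specification row

    def per_device_batch_size(self) -> int:
        # Assumption: the global batch is split evenly across all devices.
        if self.global_batch_size % self.num_devices != 0:
            raise ValueError("global batch size must divide evenly across devices")
        return self.global_batch_size // self.num_devices


config = TrainConfig()
print(config.per_device_batch_size())  # 256 // 8 = 32
```

Under the even-split assumption, each device would process 32 samples per step. The released repository linked in the Open Source Code row remains the authoritative source for the actual training configuration.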