3D generation on ImageNet

Authors: Ivan Skorokhodov, Aliaksandr Siarohin, Yinghao Xu, Jian Ren, Hsin-Ying Lee, Peter Wonka, Sergey Tulyakov

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We explore our model on four datasets: SDIP Dogs 256², SDIP Elephants 256², LSUN Horses 256², and ImageNet 256² and demonstrate that 3DGP outperforms the recent state-of-the-art in terms of both texture and geometry quality. [...] 4 EXPERIMENTAL RESULTS
Researcher Affiliation | Collaboration | Ivan Skorokhodov (KAUST), Aliaksandr Siarohin (Snap Inc.), Yinghao Xu (CUHK), Jian Ren (Snap Inc.), Hsin-Ying Lee (Snap Inc.), Peter Wonka (KAUST), Sergey Tulyakov (Snap Inc.)
Pseudocode | No | The paper describes its methods but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Code and visualizations: https://snap-research.github.io/3dgp. [...] Most importantly, we will release 1) the source code and the checkpoints of our generator as a separate github repo; and 2) fully pre-processed datasets used in our work (with the corresponding extracted depth maps).
Open Datasets | Yes | In our experiments, we use 4 non-aligned datasets: SDIP Dogs (Mokady et al., 2022), SDIP Elephants (Mokady et al., 2022), LSUN Horses (Yu et al., 2015), and ImageNet (Deng et al., 2009).
Dataset Splits | No | The paper discusses filtering datasets but does not explicitly provide the train/validation/test splits, percentages, or per-partition sample counts needed for reproduction.
Hardware Specification | Yes | This project consumed 12 NVidia A100 GPU years in total.
Software Dependencies | No | The paper mentions various software components and models (e.g., "ResNet50", "PyMCubes", the "timm" library) but does not list specific version numbers for key software dependencies such as the programming language or deep-learning framework.
Experiment Setup | Yes | We train all the models with Adam optimizer (Kingma & Ba, 2014) using the learning rate of 2e-3 and β1 = 0.0, β2 = 0.99. Following EpiGRAF (Skorokhodov et al., 2022), our model uses patch-wise training with 64×64-resolution patches and uses their proposed β scale sampling strategy without any modifications. We use the batch size of 64 in all the experiments, since in early experiments we didn't find any improvements from using a larger batch size, neither for our model nor for StyleGAN2, as observed by Brock et al. (2018) and Sauer et al. (2022).
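The quoted experiment setup (Adam with lr 2e-3 and β1 = 0.0, β2 = 0.99; batch size 64; patch-wise training on 64×64 patches) can be sketched as follows. This is a minimal illustration, not the authors' code: the paper does not state its framework, the names `ADAM_CONFIG` and `sample_patch` are hypothetical, and EpiGRAF's β-scale sampling strategy (which also varies patch scale) is omitted; only uniform random 64×64 crops are shown.

```python
import numpy as np

# Hyperparameters quoted from the paper; dict name is illustrative.
ADAM_CONFIG = {"lr": 2e-3, "beta1": 0.0, "beta2": 0.99}
BATCH_SIZE = 64
PATCH_SIZE = 64

def sample_patch(img: np.ndarray, patch_size: int = PATCH_SIZE, rng=None) -> np.ndarray:
    """Crop one random patch_size x patch_size patch from a CHW image.

    Simplified stand-in for patch-wise training: EpiGRAF's beta-scale
    sampling additionally varies the patch scale, which is not modeled here.
    """
    rng = rng or np.random.default_rng()
    _, h, w = img.shape
    top = int(rng.integers(0, h - patch_size + 1))
    left = int(rng.integers(0, w - patch_size + 1))
    return img[:, top:top + patch_size, left:left + patch_size]

# One batch of 64 random 64x64 patches from 256x256 images.
images = np.zeros((BATCH_SIZE, 3, 256, 256), dtype=np.float32)
patches = np.stack([sample_patch(im) for im in images])
assert patches.shape == (BATCH_SIZE, 3, PATCH_SIZE, PATCH_SIZE)
```

In a real training loop these crops would feed both the rendered and real-image branches of the patch discriminator, with the Adam settings above applied to each network.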