Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Content-Style Learning from Unaligned Domains: Identifiability under Unknown Latent Dimensions

Authors: Sagar Shrestha, Xiao Fu

ICLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments corroborate our theoretical claims. For theory validation, we perform experiments over a series of image translation and generation tasks.
Researcher Affiliation | Academia | Sagar Shrestha and Xiao Fu, School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR 97331, USA
Pseudocode | No | The paper describes methods and formulations but does not include a clearly labeled pseudocode or algorithm block.
Open Source Code | No | The paper does not contain an explicit statement by the authors about releasing their own source code, nor does it provide a direct link to a repository for the methodology described in the paper. Footnotes refer to baselines or external tools, not the authors' implementation.
Open Datasets | Yes | We use the AFHQ dataset (Choi et al., 2020) for both multi-domain generation and translation tasks. The CelebA-HQ dataset (Karras et al., 2018) is used for both multi-domain generation and translation tasks. The CelebA dataset (Liu et al., 2015) is used for the multi-domain generation task.
Dataset Splits | Yes | The AFHQ dataset contains images of animal faces in three domains: cat, dog, and wild, with 5066, 4679, and 4594 training images, and 494, 492, and 484 testing images, respectively. We resize all images to 256×256 for training and testing. The CelebA-HQ dataset... We split the dataset into two domains based on gender. The male domain contains 18,875 images, whereas the female domain contains 11,025 images. We hold out 1000 images from each domain for testing and use the rest for training. Similar to AFHQ, we resize all images to 256×256 for training and testing. The CelebA dataset... We split the dataset into 7 domains based on the following attributes: Black hair, Blonde hair, Brown hair, Female, Male, Old, and Young. We resize all images to 64×64 for training and testing.
Hardware Specification | Yes | Finally, the training time (on a single Tesla V100 GPU) of the proposed method is at least 22 and 69 hours shorter than the competitive baselines StarGANv2 and I-GAN (Tr), respectively.
Software Dependencies | No | The paper mentions using Adam and adopting the neural architecture of StyleGAN-ADA, but does not specify version numbers for these or other software libraries such as PyTorch, Python, or CUDA.
Experiment Setup | Yes | The hyperparameters used for GAN training in Problem (6) are similar to those used in StyleGAN-ADA. Mainly, we use Adam (Kingma & Ba, 2015) with an initial learning rate of 0.0025 and a batch size of 16. The hyperparameters in Adam that control the exponential decay rates of the first- and second-order moments are set to β1 = 0 and β2 = 0.99, respectively. For all datasets, we train the networks for 300,000 iterations. The sparsity regularization weight in Problem (6) is set to 0.3 for all experiments. We follow the optimization procedure in Karras et al. (2020) for GAN inversion. To summarize, recall the GAN inversion problem for the multi-domain translation task: (ĉ, ŝ) = arg min_{c,s} div(q(c, s), x^(i)) (31). Eq. (31) is solved by gradient-based optimization of c and s using Adam (Kingma & Ba, 2015) with an initial learning rate of 0.1. The hyperparameters of Adam are set to β1 = 0.9 and β2 = 0.999. The optimization is carried out for 400 steps.
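The GAN-inversion procedure quoted above can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the generator `q`, the squared-error divergence, and the latent dimensions are hypothetical stand-ins (the paper uses a trained StyleGAN-ADA generator and the inversion procedure of Karras et al., 2020); only the optimizer settings — Adam with learning rate 0.1, β1 = 0.9, β2 = 0.999, and 400 steps — come from the quoted setup.

```python
import numpy as np

rng = np.random.default_rng(0)
W_c = rng.standard_normal((8, 4))  # hypothetical "generator" weights
W_s = rng.standard_normal((8, 4))

def q(c, s):
    """Stand-in generator mapping (content, style) latents to an 'image' vector."""
    return np.tanh(W_c @ c + W_s @ s)

def div_and_grads(c, s, x):
    """Squared-error stand-in for div(q(c, s), x) and its gradients w.r.t. c, s."""
    pre = W_c @ c + W_s @ s
    r = np.tanh(pre) - x
    g = r * (1.0 - np.tanh(pre) ** 2)  # chain rule through tanh
    return 0.5 * np.sum(r ** 2), W_c.T @ g, W_s.T @ g

def invert(x, steps=400, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """Solve Eq. (31)-style inversion by Adam over the latents c and s."""
    c, s = np.zeros(4), np.zeros(4)
    m = {"c": 0.0, "s": 0.0}
    v = {"c": 0.0, "s": 0.0}
    for t in range(1, steps + 1):
        _, gc, gs = div_and_grads(c, s, x)
        for name, g in (("c", gc), ("s", gs)):
            m[name] = b1 * m[name] + (1 - b1) * g
            v[name] = b2 * v[name] + (1 - b2) * g ** 2
            mh = m[name] / (1 - b1 ** t)      # bias-corrected first moment
            vh = v[name] / (1 - b2 ** t)      # bias-corrected second moment
            step = lr * mh / (np.sqrt(vh) + eps)
            if name == "c":
                c = c - step
            else:
                s = s - step
    return c, s

# Invert a synthesizable target: recover latents whose image matches x_target.
c_true, s_true = rng.standard_normal(4), rng.standard_normal(4)
x_target = q(c_true, s_true)
c_hat, s_hat = invert(x_target)
residual = 0.5 * np.sum((q(c_hat, s_hat) - x_target) ** 2)
```

In the paper's actual pipeline the gradients would come from automatic differentiation through the trained generator rather than the hand-derived expressions used in this toy.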