You Only Train Once: Loss-Conditional Training of Deep Networks
Authors: Alexey Dosovitskiy, Josip Djolonga
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed method both quantitatively and qualitatively on three problems with multi-term loss functions: β-VAE, learned image compression, and fast style transfer. (Section 4, Experiments) |
| Researcher Affiliation | Industry | Alexey Dosovitskiy & Josip Djolonga Google Research, Brain Team {adosovitskiy, josipd}@google.com |
| Pseudocode | No | The paper describes the method in prose and equations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code will be released at www.github.com/google-research/google-research/yoto. |
| Open Datasets | Yes | We consider two settings: the CIFAR-10 dataset (Krizhevsky, 2009) with Gaussian outputs, and the Shapes3D dataset (Burgess & Kim, 2018) with Bernoulli outputs. We evaluate the compression models on two datasets: Kodak (Kodak, 1993) and Tecnick (Asuni & Giachetti, 2014). We sample the content images from ImageNet (Deng et al., 2009) and use 14 pointillism paintings as the style images. |
| Dataset Splits | Yes | We select the fixed β so that it minimizes the average validation loss over all β values. Figure 7: Qualitative comparison of image stylization models on an image from the validation set of ImageNet. |
| Hardware Specification | No | The paper mentions "on a single CPU core" in the context of timing a specific comparison, but it does not provide specific hardware details (like GPU models or CPU models with speeds) for the main experiments. |
| Software Dependencies | No | The paper mentions non-linearities and optimization techniques but does not provide specific version numbers for software dependencies or libraries used (e.g., PyTorch, TensorFlow, Python versions). |
| Experiment Setup | Yes | On Shapes3D we train all models for a total of 600,000 mini-batch iterations, and we multiply the learning rate by 0.5 after 300,000, 390,000, 480,000, and 570,000 iterations. We tuned the learning rates by sweeping over the values {5·10⁻⁵, 1·10⁻⁴, 2·10⁻⁴, 4·10⁻⁴, 8·10⁻⁴} and ended up using the learning rates 1·10⁻⁴ on CIFAR-10 and 2·10⁻⁴ on Shapes3D. We use mini-batches of 128 samples on CIFAR-10 and 64 samples on Shapes3D. We use weight decay of 10⁻⁵ in all models. (See the schedule sketch after the table.) |
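
For concreteness, the Shapes3D schedule quoted in the Experiment Setup row can be written as a piecewise-constant learning-rate rule. The snippet below is a minimal sketch under stated assumptions, not the authors' released code: the paper does not name a framework or optimizer, and the function name `lr_at_step` and the constant names are hypothetical; only the numeric values are taken from the quoted setup.

```python
# Hypothetical sketch of the quoted Shapes3D training schedule.
# Function and constant names are assumptions; numbers come from the paper's quoted setup.

def lr_at_step(step: int,
               base_lr: float = 2e-4,  # quoted base learning rate for Shapes3D
               milestones=(300_000, 390_000, 480_000, 570_000),
               decay: float = 0.5) -> float:
    """Piecewise-constant schedule: multiply the rate by `decay` at each milestone."""
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= decay
    return lr

# Other quoted hyperparameters, collected for reference.
TOTAL_STEPS_SHAPES3D = 600_000
SHAPES3D_BATCH_SIZE = 64
CIFAR10_BATCH_SIZE = 128
CIFAR10_BASE_LR = 1e-4
WEIGHT_DECAY = 1e-5
LR_SWEEP = (5e-5, 1e-4, 2e-4, 4e-4, 8e-4)

# Example: after 400,000 iterations two decays have applied, so 2e-4 * 0.5 * 0.5 = 5e-5.
assert abs(lr_at_step(400_000) - 5e-5) < 1e-12
```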