Zero-shot Synthesis with Group-Supervised Learning

Authors: Yunhao Ge, Sami Abu-El-Haija, Gan Xin, Laurent Itti

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test our model and learning framework on existing benchmarks, in addition to a new dataset that we open-source. We qualitatively and quantitatively demonstrate that GZS-Net trained with GSL outperforms state-of-the-art methods.
Researcher Affiliation | Academia | Yunhao Ge, Sami Abu-El-Haija, Gan Xin, Laurent Itti; University of Southern California; yunhaoge@usc.edu, sami@haija.org, gxin@usc.edu, itti@usc.edu
Pseudocode | Yes | Algorithm 1 (Training Regime) gives pseudocode for sampling data and calculating the loss terms.
Open Source Code | No | The paper states: 'We provide a new dataset, Fonts, with its generating code.' (in Section 1 and Section 5.1, along with a URL for the dataset). This refers to the code for generating the Fonts dataset, not the source code for the GZS-Net methodology itself. There is no explicit statement or link provided for the GZS-Net implementation code.
Open Datasets | Yes | We test our model and learning framework on existing benchmarks, in addition to a new dataset that we open-source. We provide a new dataset, Fonts, with its generating code. It contains 1.56 million images and their attributes. Its simplicity allows rapid idea prototyping for learning disentangled representations. You can download the dataset and its generating code from http://ilab.usc.edu/datasets/fonts. iLab-20M (Borji et al., 2016) is an attributed dataset... RaFD (Radboud Faces Database; Langner et al., 2010) contains pictures... dSprites (Matthey et al., 2017) is a dataset of 2D shapes...
Dataset Splits | No | The paper specifies train/test splits for the datasets used (a 75:25 train:test split for Fonts and dSprites, and an 80:20 train:test split for RaFD), but it does not explicitly mention or detail a validation split. (A hedged sketch of reproducing such a split appears after this table.)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or specific cloud/cluster configurations used for running the experiments.
Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x) that were used to implement and run the experiments.
Experiment Setup | Yes | For all experiments, the encoder E is composed of two convolutional layers with stride 2, followed by 3 residual blocks, followed by a convolutional layer with stride 2, followed by reshaping the response map to a vector, and finally two fully-connected layers that output a 100-dimensional latent feature vector. The decoder D mirrors the encoder: two fully-connected layers, followed by reshaping into a cuboid, followed by a de-conv layer with stride 2, followed by 3 residual blocks, and finally two de-conv layers with stride 2 that output a synthesized image. The 100-d latent is partitioned equally among the 5 attributes (20 dimensions each).
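
The Experiment Setup row above fixes the layer ordering and the 100-dimensional latent, but leaves channel widths, input resolution, kernel sizes, and the residual-block design unspecified. The following is a minimal PyTorch sketch of that encoder/decoder under assumed values (64 channels, 128x128 RGB inputs, 4x4 stride-2 convolutions and de-convolutions); it is an illustration under those assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Two-conv residual block; the exact block design is an assumption."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        return torch.relu(x + self.body(x))

class Encoder(nn.Module):
    """Two stride-2 convs -> 3 residual blocks -> stride-2 conv -> flatten -> 2 FC layers -> 100-d latent."""
    def __init__(self, ch: int = 64, latent_dim: int = 100, img_size: int = 128):
        super().__init__()
        s = img_size // 8                                   # three stride-2 convs halve the map three times
        self.conv = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            ResBlock(ch), ResBlock(ch), ResBlock(ch),
            nn.Conv2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True))
        self.fc = nn.Sequential(
            nn.Linear(ch * s * s, 512), nn.ReLU(inplace=True),
            nn.Linear(512, latent_dim))

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class Decoder(nn.Module):
    """2 FC layers -> reshape into cuboid -> stride-2 de-conv -> 3 residual blocks -> two stride-2 de-convs -> image."""
    def __init__(self, ch: int = 64, latent_dim: int = 100, img_size: int = 128):
        super().__init__()
        self.ch, self.s = ch, img_size // 8
        self.fc = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(inplace=True),
            nn.Linear(512, ch * self.s * self.s), nn.ReLU(inplace=True))
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            ResBlock(ch), ResBlock(ch), ResBlock(ch),
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, 3, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, z):
        h = self.fc(z).view(-1, self.ch, self.s, self.s)    # reshape vector into a cuboid
        return self.deconv(h)

# The 100-d latent is partitioned equally among the 5 attributes (20 dims each).
z = Encoder()(torch.randn(1, 3, 128, 128))                  # -> (1, 100)
attribute_chunks = torch.chunk(z, 5, dim=1)                  # five (1, 20) attribute partitions
x_hat = Decoder()(z)                                         # -> (1, 3, 128, 128)
```

With the assumed 128x128 inputs, the three stride-2 stages reduce the feature map to 16x16 before flattening; any resolution divisible by 8 works the same way, and the hidden FC width (512) is likewise an assumption.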
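
The Dataset Splits row reports only ratios (75:25 train:test for Fonts and dSprites, 80:20 for RaFD) and no validation set. Below is a minimal sketch of reproducing such a split with PyTorch's random_split; the dataset variable names, the use of purely random splitting, and the seed are assumptions not stated in the paper.

```python
import torch
from torch.utils.data import Dataset, random_split

def train_test_split(dataset: Dataset, train_frac: float, seed: int = 0):
    """Split a dataset by the stated train fraction.

    Only the ratios come from the paper; random (non-stratified) splitting
    and the fixed seed are assumptions made for this sketch.
    """
    n_train = int(len(dataset) * train_frac)
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, len(dataset) - n_train], generator=generator)

# Hypothetical usage with already-constructed Dataset objects:
# fonts_train, fonts_test = train_test_split(fonts_dataset, 0.75)   # Fonts, dSprites: 75:25
# rafd_train,  rafd_test  = train_test_split(rafd_dataset,  0.80)   # RaFD: 80:20
```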