DINO: A Conditional Energy-Based GAN for Domain Translation

Authors: Konstantinos Vougioukas, Stavros Petridis, Maja Pantic

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the DINO framework on image-to-image translation, since this is the most typical application for domain-translation systems. Additionally, we tackle the problem of video-driven speech reconstruction, which involves synthesising intelligible speech from silent video. In all of the experiments, focus is placed not only on evaluating the quality of the generated samples but also on verifying that the semantics are preserved after translation.
Researcher Affiliation | Academia | Konstantinos Vougioukas, Stavros Petridis & Maja Pantic, Department of Computing, Imperial College London, UK
Pseudocode | No | The paper describes the framework and equations, but it does not include a clearly labelled pseudocode block or algorithm listing.
Open Source Code | Yes | Source code: https://github.com/DinoMan/DINO
Open Datasets | Yes | We evaluate the DINO framework on image-to-image translation... on the CelebAMask-HQ (Lee et al., 2020) and the Cityscapes (Cordts et al., 2016) datasets... Experiments are performed on the GRID dataset (Cooke et al., 2006).
Dataset Splits | Yes | We evaluate the DINO framework on image-to-image translation... using their recommended training-test splits. The data is split according to Vougioukas et al. (2019) so that the test set contains unseen speakers and phrases.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications.
Software Dependencies | No | The paper mentions tools and optimizers such as the 'Adam optimizer (Kingma & Ba, 2015)' and 'Weights and Biases', but does not specify version numbers for programming languages, libraries (e.g., PyTorch, TensorFlow), or other key software dependencies required for replication.
Experiment Setup | Yes | The balance parameter γ is set to 0.8 for the image-to-image translation experiments. We train using the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0002 and momentum parameters β1 = 0.5, β2 = 0.999. For video-driven speech reconstruction, an Adam optimiser is used with a learning rate of 0.0001 for the video-to-audio network and a learning rate of 0.001 for the audio-to-video network, and the balancing parameter γ is set to 0.5.
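
A minimal sketch of the optimiser configuration reported in the Experiment Setup row, assuming PyTorch (the paper does not name its deep-learning framework); the network modules below are placeholders rather than the actual DINO architecture, and only the learning rates, momentum terms, and γ values are taken from the paper.

import torch
import torch.nn as nn

# Placeholder modules; the real DINO generator and energy-based discriminator are not reproduced here.
generator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))
discriminator = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1))

# Image-to-image translation settings: Adam with lr = 0.0002, betas = (0.5, 0.999),
# and balance parameter gamma = 0.8.
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
gamma = 0.8

# Video-driven speech reconstruction instead uses lr = 1e-4 for the video-to-audio
# network, lr = 1e-3 for the audio-to-video network, and gamma = 0.5.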