DINO: A Conditional Energy-Based GAN for Domain Translation
Authors: Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the DINO framework on image-to-image translation since this is the most typical application for domain-translation systems. Additionally, we tackle the problem of video-driven speech reconstruction, which involves synthesising intelligible speech from silent video. In all of the experiments, focus is placed not only on evaluating the quality of the generated samples but also on verifying that the semantics are preserved after translation. |
| Researcher Affiliation | Academia | Konstantinos Vougioukas, Stavros Petridis & Maja Pantic Department of Computing, Imperial College London, UK |
| Pseudocode | No | The paper describes the framework and equations, but it does not include a clearly labeled pseudocode block or algorithm steps formatted as such. |
| Open Source Code | Yes | Source code: https://github.com/DinoMan/DINO |
| Open Datasets | Yes | We evaluate the DINO framework on image-to-image translation... on the CelebAMask-HQ (Lee et al., 2020) and the Cityscapes (Cordts et al., 2016) datasets... Experiments are performed on the GRID dataset (Cooke et al., 2006). |
| Dataset Splits | Yes | We evaluate the DINO framework on image-to-image translation... using their recommended training-test splits. The data is split according to Vougioukas et al. (2019) so that the test set contains unseen speakers and phrases. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions tools and optimizers like 'Adam optimizer (Kingma & Ba, 2015)' and 'Weights and Biases' but does not specify version numbers for programming languages, libraries (e.g., PyTorch, TensorFlow), or other key software dependencies required for replication. |
| Experiment Setup | Yes | For image-to-image translation, the balance parameter γ is set to 0.8, and training uses the Adam optimizer (Kingma & Ba, 2015) with a learning rate of 0.0002 and momentum parameters β1 = 0.5, β2 = 0.999. For video-driven speech reconstruction, an Adam optimiser is used with a learning rate of 0.0001 for the video-to-audio network and a learning rate of 0.001 for the audio-to-video network, and the balancing parameter γ is set to 0.5. See the configuration sketch below the table. |
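
The hyperparameters quoted in the Experiment Setup row are concrete enough to express as code. Below is a minimal sketch of the reported optimiser configuration, assuming PyTorch (the paper does not name its framework; see the Software Dependencies row) and using placeholder modules in place of the actual DINO networks:

```python
# Sketch of the reported DINO training configuration.
# Assumptions: PyTorch (framework not stated in the paper); the Linear
# modules below are placeholders standing in for the real networks.
import torch

generator = torch.nn.Linear(128, 128)      # placeholder for the DINO generator
discriminator = torch.nn.Linear(128, 128)  # placeholder for the energy-based discriminator

# Image-to-image translation: lr = 0.0002, betas = (0.5, 0.999), gamma = 0.8.
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
gamma_img = 0.8  # balance parameter for the image-to-image experiments

# Video-driven speech reconstruction: two networks with different learning
# rates and gamma = 0.5. The paper quotes no momentum parameters for these
# networks, so reusing (0.5, 0.999) here is an assumption.
video_to_audio = torch.nn.Linear(128, 128)  # placeholder network
audio_to_video = torch.nn.Linear(128, 128)  # placeholder network
v2a_opt = torch.optim.Adam(video_to_audio.parameters(), lr=1e-4, betas=(0.5, 0.999))
a2v_opt = torch.optim.Adam(audio_to_video.parameters(), lr=1e-3, betas=(0.5, 0.999))
gamma_speech = 0.5  # balance parameter for speech reconstruction
```

Note that this sketch only pins down the optimiser settings the paper reports; the network architectures, loss terms, and the role of γ in the energy-based objective must be taken from the released source code.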