Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
DINO: A Conditional Energy-Based GAN for Domain Translation
Authors: Konstantinos Vougioukas, Stavros Petridis, Maja Pantic
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the DINO framework on image-to-image translation since this is the most typical application for domain-translation systems. Additionally, we tackle the problem of video-driven speech reconstruction, which involves synthesising intelligible speech from silent video. In all of the experiments, focus is placed not only on evaluating the quality of the generated samples but also on verifying that the semantics are preserved after translation. |
| Researcher Affiliation | Academia | Konstantinos Vougioukas, Stavros Petridis & Maja Pantic Department of Computing, Imperial College London, UK |
| Pseudocode | No | The paper describes the framework and equations, but it does not include a clearly labeled pseudocode block or algorithm steps formatted as such. |
| Open Source Code | Yes | Source code: https://github.com/DinoMan/DINO |
| Open Datasets | Yes | We evaluate the DINO framework on image-to-image translation... on the CelebAMask-HQ (Lee et al., 2020) and the Cityscapes (Cordts et al., 2016) datasets... Experiments are performed on the GRID dataset (Cooke et al., 2006). |
| Dataset Splits | Yes | We evaluate the DINO framework on image-to-image translation... using their recommended training-test splits. The data is split according to Vougioukas et al. (2019) so that the test set contains unseen speakers and phrases. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions tools and optimizers like 'Adam optimizer (Kingma & Ba, 2015)' and 'Weights and Biases' but does not specify version numbers for programming languages, libraries (e.g., PyTorch, TensorFlow), or other key software dependencies required for replication. |
| Experiment Setup | Yes | The balance parameter γ is set to 0.8 for image-to-image translation experiments. We train using the Adam optimizer (Kingma & Ba, 2015), with a learning rate of 0.0002, and momentum parameters β1 = 0.5, β2 = 0.999. An Adam optimiser is used with a learning rate of 0.0001 for the video-to-audio network and a learning rate of 0.001 for the audio-to-video network. The balancing parameter γ is set to 0.5. |
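The hyperparameters quoted in the Experiment Setup row can be gathered in one place for reference. The sketch below records them as plain Python dictionaries; the key names and dictionary layout are illustrative choices, not taken from the authors' code, and the paper does not state which framework was used:

```python
# Hyperparameters as reported in the paper. The structure and key names
# here are assumptions for illustration only.

IMAGE_TRANSLATION = {
    "optimizer": "Adam",       # Adam (Kingma & Ba, 2015)
    "lr": 2e-4,                # learning rate 0.0002
    "betas": (0.5, 0.999),     # momentum parameters beta1, beta2
    "gamma": 0.8,              # balance parameter for image-to-image translation
}

SPEECH_RECONSTRUCTION = {
    "optimizer": "Adam",
    "lr_video_to_audio": 1e-4, # learning rate for the video-to-audio network
    "lr_audio_to_video": 1e-3, # learning rate for the audio-to-video network
    "gamma": 0.5,              # balance parameter for the speech task
}
```

Note that the balance parameter γ differs between the two tasks (0.8 for image-to-image translation, 0.5 for speech reconstruction), and the two speech networks use different learning rates.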