Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Risk Bounds for Unsupervised Cross-Domain Mapping with IPMs

Authors: Tomer Galanti, Sagie Benaim, Lior Wolf

JMLR 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental
  "The recent empirical success of unsupervised cross-domain mapping algorithms, in mapping between two domains that share common characteristics, is not well-supported by theoretical justifications. ... In addition to the empirical validation, we present an upper bound on the generalization risk ... The first group of experiments is intended to test the validity of the prediction made in Section 6. The next group of experiments is dedicated to Algorithms 1, 2 and 3."

Researcher Affiliation | Collaboration
  Tomer Galanti ... School of Computer Science, Tel Aviv University, Israel; Sagie Benaim ... School of Computer Science, Tel Aviv University, Israel; Lior Wolf ... Facebook AI Research and School of Computer Science, Tel Aviv University, Israel

Pseudocode | Yes
  Algorithm 1: Early stopping; Algorithm 2: Unsupervised run_then_return_val_loss for Hyperband; Algorithm 3: Early stopping (non-unique case)

Open Source Code | No
  The paper contains no explicit statement that the authors are releasing their code for the described methodology, nor a direct link to a code repository. It mentions using third-party implementations, such as the "DiscoGAN (Kim et al., 2017) official public implementation".

Open Datasets | Yes
  "The experiment was done on the CelebA data set ... (i) aerial photographs to maps, using data scraped from Google Maps (Isola et al., 2017), (ii) the mapping between photographs from the Cityscapes data set and their per-pixel semantic labels (Cordts et al., 2016), (iii) architectural photographs to their labels from the CMP Facades data set (Radim Tyleček, 2013), (iv) handbag images (Zhu et al., 2016) to their binary edge images ... and (v) a similar data set for the shoe images from (Yu and Grauman, 2014)."

Dataset Splits | No
  The paper mentions 'training samples', 'test data' (e.g., "The scores are averaged over 20 trials and the expectations E_{x~D_A} are estimated using the test data."), and 'unmatched' sets, but does not specify explicit percentages or counts for the training, validation, and test splits needed to reproduce the data partitioning.

Hardware Specification | No
  The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory configurations.

Software Dependencies | No
  The paper mentions various GAN architectures and metrics (e.g., CycleGAN, DiscoGAN, DistanceGAN, UNIT, WGAN, LPIPS) and neural networks (VGG), but does not specify version numbers for these components, nor for underlying languages or frameworks such as Python or PyTorch.

Experiment Setup | Yes
  "The published hyperparameters for each data set are used, except when using Hyperband, where we vary the number of layers, the learning rate and the batch size ... learning rate (between 10^-5 to 1), the number of kernels per layer (between 10 and 300), and the weight between circularity losses and the GANs (between 10^-5 and 1). ... Require: SA and SB: training samples; H: a hypothesis class; C: a class of discriminators; c: a tolerance scale; k: maximal depth; λ: a trade-off parameter; T: a fixed number of epochs."
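The hyperparameter ranges quoted from the paper can be sketched as a simple sampling routine for a Hyperband-style search. This is a minimal illustration, not the authors' code; sampling the learning rate and the circularity-loss weight log-uniformly is an assumption here, since the paper reports only the ranges.

```python
import random

def sample_hyperparameters(rng=random):
    """Draw one configuration from the search ranges quoted in the paper.

    Log-uniform sampling for the two continuous ranges is an assumption;
    the paper states only the interval endpoints.
    """
    return {
        # learning rate: between 10^-5 and 1 (log-uniform, assumed)
        "learning_rate": 10 ** rng.uniform(-5, 0),
        # number of kernels per layer: between 10 and 300
        "kernels_per_layer": rng.randint(10, 300),
        # weight between circularity losses and the GAN losses:
        # between 10^-5 and 1 (log-uniform, assumed)
        "circularity_weight": 10 ** rng.uniform(-5, 0),
    }

config = sample_hyperparameters()
```

Each call yields one candidate configuration that a Hyperband-style routine (such as the paper's Algorithm 2) would then train for a budgeted number of epochs and evaluate.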