Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Shared Representations from Unpaired Data
Authors: Amitai Yacobi, Nir Ben-Ari, Ronen Talmon, Uri Shaham
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results in computer vision and natural language processing domains support its potential, revealing the effectiveness of unpaired data in capturing meaningful cross-modal relations, demonstrating high capabilities in retrieval tasks, generation, arithmetics, zero-shot, and cross-domain classification. |
| Researcher Affiliation | Collaboration | Amitai Yacobi, Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel; Nir Ben-Ari, Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel; Ronen Talmon, Electrical and Computer Engineering, Technion, Haifa, Israel; Uri Shaham, Department of Computer Science, Bar-Ilan University, Ramat-Gan, Israel. Author footnotes: Equal contribution, random order. Also at Moodify.ai, Kefar Saba, Israel. |
| Pseudocode | Yes | Algorithm 1: Spectral Universal Embedding (SUE) |
| Open Source Code | Yes | Our code, including SUE implementation and all experiments, is available at https://github.com/shaham-lab/SUE. |
| Open Datasets | Yes | Datasets. To evaluate SUE's performance, we use several paired datasets. To provide informative qualitative results and facilitate an intuitive understanding of the universal embedding concept, we use three vision-language datasets: Flickr30k [59], MSCOCO [50], and Polyvore [31]; as well as a vision-vision dataset: Edges2Shoes [37]; and a tabular-to-tabular dataset: Handwritten [20]. For the generation task, we use another vision-language dataset: caption-FFHQ [40]. |
| Dataset Splits | Yes | Data Split. For each dataset, we excluded 400 paired samples for evaluation, using the remaining samples for training. To train the parametric SE model, the training set was further divided into a 90% training subset and a 10% validation subset. Similarly, during the training of the MMD network, the training set was partitioned into a 90% training subset and a 10% validation subset. |
| Hardware Specification | Yes | E.9 OS and Hardware. The training procedures were executed on Rocky Linux 9.3, utilizing Nvidia GPUs including GeForce GTX 1080 Ti and A100 80GB PCIe. |
| Software Dependencies | No | The paper mentions using "PyTorch" for the MMD loss implementation and "scikit-learn" for CCA, but does not provide specific version numbers for these software components in the text. |
| Experiment Setup | Yes | E.8 Hyper-parameters. Numeric SE. For the numeric SE, we constructed the graph as outlined in Sec. E.5, using k = 100 for each point. The Laplacian matrix used is the random walk Laplacian, and we selected 10 eigenvectors, which correspond to the output dimension of the SE. Parametric SE. Computing the SE using the parametric approach involves training a neural network. The network architecture consists of an MLP with hidden layers of sizes 4096, 4096, and 1024 for both networks (one per modality). We set the batch size to 4096, and the learning rate to 10⁻⁴ with a decay factor of 0.1, and the training was run for 100 epochs. The optimizer used is Adam, and the learning rate is adjusted using the PyTorch ReduceLROnPlateau scheduler with a patience of 10. We used the same graph construction as used for the numeric SE, outlined in Sec. E.5. CCA. As discussed in Sec. 4.2, the CCA projections are calculated using a small subset of paired samples, with the number of pairs fixed at 600 for all datasets. Fig. 5 presents an experiment demonstrating the impact of the amount of paired data on the results. For these projections, we utilized the CCA implementation from scikit-learn, with the number of components set to 8 across all datasets. Residual Network (MMD). To train the residual network, we employed an MLP architecture with hidden layers of size 128, 128, and 128, incorporating a residual connection from the input to the output. The optimizer used is AdamW, with a learning rate set to 10⁻³. The network was trained for 100 epochs. Auto-Encoder. To train the Auto-Encoder network, we employed an MLP encoder architecture identical to those of the parametric SE MLP, with a corresponding decoder architecture. The optimizer used is AdamW, with a learning rate set to 10⁻³. The network was trained for 100 epochs. |
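
The split described in the Dataset Splits row (400 paired samples held out for evaluation, then a 90%/10% train/validation division of the remainder) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `split_indices`, the random shuffling, the seed, and the example dataset size of 10,000 are all assumptions, since the quoted text does not say how the held-out samples were chosen.

```python
import random


def split_indices(n_samples, n_eval=400, val_fraction=0.1, seed=0):
    """Hold out n_eval samples for evaluation, then split the rest 90/10.

    Hypothetical helper: the shuffling and the seed are assumptions, not
    details taken from the paper.
    """
    if n_samples <= n_eval:
        raise ValueError("need more samples than the evaluation hold-out")

    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)

    # First n_eval shuffled indices form the evaluation set.
    eval_idx = indices[:n_eval]
    remaining = indices[n_eval:]

    # 10% of the remaining samples go to validation, the rest to training.
    n_val = int(round(val_fraction * len(remaining)))
    val_idx = remaining[:n_val]
    train_idx = remaining[n_val:]
    return train_idx, val_idx, eval_idx


# Example with an assumed dataset size of 10,000 paired samples.
train_idx, val_idx, eval_idx = split_indices(10_000)
print(len(train_idx), len(val_idx), len(eval_idx))
```

According to the quoted text, the 90%/10% division is applied when training the parametric SE model and again when training the MMD network; the sketch performs it once, and whether the two stages share the same partition is not stated.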