Relative representations enable zero-shot latent space communication

Authors: Luca Moschella, Valentino Maiorca, Marco Fumero, Antonio Norelli, Francesco Locatello, Emanuele Rodolà

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We extensively validate the generalization capability of our approach on different datasets, spanning various modalities (images, text, graphs), tasks (e.g., classification, reconstruction) and architectures (e.g., CNNs, GCNs, transformers).
Researcher Affiliation | Collaboration | Luca Moschella (1), Valentino Maiorca (1), Marco Fumero (1), Antonio Norelli (1), Francesco Locatello (2), Emanuele Rodolà (1); (1) Sapienza University of Rome, (2) Amazon Web Services
Pseudocode | No | The paper describes procedures in prose but does not contain explicitly labeled "Pseudocode" or "Algorithm" blocks.
Open Source Code | Yes | Moreover, we release a well-documented and modular codebase, with the relative representation layer being implemented as a stand-alone PyTorch module. All the checkpoints used in the experiments are versioned with DVC (Kuprieiev et al., 2023) to easily reproduce all the figures and tables. (A hedged sketch of such a layer is given after the table.)
Open Datasets | Yes | AE trained on the MNIST dataset several times from scratch. (Figure 1) ... We consider a node classification task on the Cora graph dataset (Sen et al., 2008). ... In this section we consider classification tasks on several datasets, spanning the image domain (Lecun et al., 1998; Xiao et al., 2017; Krizhevsky, 2009) and the graph domain (Yang et al., 2016).
Dataset Splits | Yes | We first train a reference model that achieves good accuracy on a validation set. (Section 4.2) ... For the computer vision counterpart (Figure 11 and Table 8), the procedure is similar but with the following differences: i) the number of anchors is set to 500 to balance between the different encoding dimensions of the two transformers (384 for ViT-small and 768 for ViT-base); ii) the subsampling for visualization purposes is done by selecting 4 classes and randomly picking 200 samples for each of them. Evaluation metrics: consider the set of 20k samples S (words for the NLP test, images for the CV one), the source space X and target space Y, and any sample s ∈ S; we compute its representation in X and Y through the functions f_X : S → X and f_Y : S → Y and define the metrics as follows: (Appendix A.5.1) (A toy version of this cross-space comparison is sketched after the table.)
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU or CPU models, memory specifications, or cloud instance types used for running experiments.
Software Dependencies | No | The paper lists several software tools and libraries (e.g., PyTorch Lightning, Hugging Face Transformers, DVC, PyTorch Geometric) along with their publication or project-initiation years, but it does not provide explicit version numbers (e.g., 'PyTorch 1.9', 'Python 3.8') for these dependencies.
Experiment Setup | Yes | The hyperparameters used in Section 4.2 are summarized in Table 10. ... Table 10: The reference model and exhaustive hyperparameter combinations pertaining to Section 4.2. Hyperparameters: Seed, Epochs, Number of layers, Dropout Probability, Hidden Activations, Convolution Activation, Optimizer, Learning Rate, Graph Embedder. (An illustrative grid-enumeration sketch follows the table.)
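
The Open Source Code row notes that the relative representation layer ships as a stand-alone PyTorch module. As context for readers, below is a minimal sketch of such a layer following the paper's definition, in which each output coordinate is the cosine similarity between a sample embedding and one anchor embedding. The class name and constructor signature are assumptions for illustration, not the released API.

```python
import torch
import torch.nn.functional as F


class RelativeRepresentation(torch.nn.Module):
    """Minimal sketch of a relative-representation layer (hypothetical API).

    Each output coordinate is the cosine similarity between a sample's
    absolute embedding and the embedding of one anchor sample.
    """

    def __init__(self, anchor_embeddings: torch.Tensor):
        super().__init__()
        # anchor_embeddings: (num_anchors, dim), produced by the same encoder
        # that will feed forward(); stored L2-normalized as a buffer.
        self.register_buffer("anchors", F.normalize(anchor_embeddings, dim=-1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim) absolute embeddings -> (batch, num_anchors) relative ones.
        return F.normalize(x, dim=-1) @ self.anchors.T
```

Because every output coordinate is indexed by an anchor rather than tied to a particular basis of the latent space, two encoders that embed the same anchor set produce directly comparable outputs.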
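The evaluation-metrics quote in the Dataset Splits row compares representations of the same samples under two encoders f_X and f_Y. The following runnable toy sketch illustrates that comparison; the untrained linear maps are stand-ins for the two encoders (real agreement requires properly trained models), and the 500 anchors and the 384/768 widths follow the quoted setup.

```python
import torch
import torch.nn.functional as F


def relative(z: torch.Tensor, anchors: torch.Tensor) -> torch.Tensor:
    # Cosine similarity of each embedding to each anchor embedding.
    return F.normalize(z, dim=-1) @ F.normalize(anchors, dim=-1).T


torch.manual_seed(0)
f_X = torch.nn.Linear(32, 384)  # stand-in encoder (e.g., ViT-small width)
f_Y = torch.nn.Linear(32, 768)  # stand-in encoder (e.g., ViT-base width)

samples = torch.randn(20, 32)          # the shared evaluation samples
anchor_samples = torch.randn(500, 32)  # the shared anchor set (500, as quoted)

# Absolute embeddings live in incompatible spaces (384-d vs. 768-d)...
zx, zy = f_X(samples), f_Y(samples)

# ...but relative representations share the anchor-indexed basis (500-d each),
# so a direct per-sample comparison between the two spaces is well defined.
rx = relative(zx, f_X(anchor_samples))
ry = relative(zy, f_Y(anchor_samples))
agreement = F.cosine_similarity(rx, ry, dim=-1).mean()
```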
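Finally, the Experiment Setup row quotes Table 10, which names the swept fields but, in the excerpt above, not their values. As a purely illustrative sketch of how such "exhaustive hyperparameter combinations" are enumerated, with placeholder values that are not the paper's:

```python
from itertools import product

# Field names follow Table 10; every value below is a placeholder, not the paper's.
grid = {
    "seed": [0, 1, 2],
    "epochs": [100],
    "num_layers": [1, 2, 3],
    "dropout_probability": [0.0, 0.5],
    "hidden_activation": ["relu", "tanh"],
    "convolution_activation": ["relu", "tanh"],
    "optimizer": ["adam"],
    "learning_rate": [1e-2, 1e-3],
    "graph_embedder": ["gcn"],
}

# Exhaustive cartesian product over all fields.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs), "configurations")
```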