Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Connecting Neural Models Latent Geometries with Relative Geodesic Representations

Authors: Hanlin Yu, Berfin Inal, Georgios Arvanitidis, Søren Hauberg, Francesco Locatello, Marco Fumero

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate experimentally our method on model stitching and retrieval tasks, covering autoencoders and vision foundation discriminative models, across diverse architectures, datasets, pretraining schemes and modalities.
Researcher Affiliation | Academia | Hanlin Yu (University of Helsinki), Berfin Inal (University of Amsterdam), Georgios Arvanitidis (DTU), Søren Hauberg (DTU), Francesco Locatello (IST Austria), Marco Fumero (IST Austria)
Pseudocode | Yes | Algorithm 1: Relative Geodesic Representations
Open Source Code | Yes | Code is available at https://github.com/marc0git/Relative Geodesics.
Open Datasets | Yes | For the following experiment, we trained pairs of convolutional autoencoders (F1, F2) with different initializations on MNIST [Deng, 2012], Fashion MNIST [Xiao et al., 2017], CIFAR10 [Krizhevsky, 2009] datasets. [...] We perform experiments on retrieval tasks on pretrained vision foundation models, investigating how well we can match representations together with different backbones subject to the decoding tasks, on 5 datasets, varying in complexity and size: CIFAR10, CIFAR100 [Krizhevsky, 2009], SVHN [Yuval Netzer et al., 2011], CUB [Wah et al., 2023], and ImageNet-1k [Russakovsky et al., 2015].
Dataset Splits | Yes | Unless otherwise stated, we directly use the original test set of the dataset as the test set, while using 0.9 of the original train set as the train set and the remaining as the validation set. Both the anchors and the Diet data points are selected from the validation set.
Hardware Specification | Yes | Experiments regarding the geodesic approximation are conducted using NVIDIA A100 GPU and 12 CPU cores. [...] The autoencoder stitching and retrieval experiments were conducted on a single NVIDIA RTX 3080TI GPU. Experiments involving vision foundation models were run on a compute cluster, each job using a single NVIDIA A100 GPU and 10 CPU cores, with runtimes of several hours.
Software Dependencies | No | True geodesics are computed using Stochman library [Detlefsen et al., 2021], which has Apache-2.0 license, which wraps the decoder into a pullback manifold, initializes a parameterized spline path between codes, and then optimizes its parameters to minimize the Riemannian energy.
Experiment Setup | Yes | Each model was trained for 50 epochs, reaching convergence, using the Adam optimizer [Kingma and Ba, 2017] with a batch size of 64. We set the learning rate to 0.001, and fixed a random seed of 42 to ensure reproducibility.
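
The split and training settings quoted in the Dataset Splits and Experiment Setup entries can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names, the choice of a shuffled (rather than contiguous) split, the reuse of seed 42 for the split, and the anchor count of 300 are all assumptions made for the example. The only values taken from the entries above are the 0.9 train fraction, the selection of anchors from the validation set, and the hyperparameters in TRAIN_CONFIG.

```python
import random

# Hyperparameters as quoted in the Experiment Setup entry.
TRAIN_CONFIG = {
    "epochs": 50,
    "optimizer": "Adam",
    "batch_size": 64,
    "learning_rate": 1e-3,
    "seed": 42,
}


def split_train_val(num_examples, train_fraction=0.9, seed=42):
    """Return (train_indices, val_indices) over range(num_examples).

    Assumption: the indices are shuffled with a fixed seed before the
    0.9 / 0.1 cut; the source does not say how the split is drawn.
    """
    rng = random.Random(seed)
    indices = list(range(num_examples))
    rng.shuffle(indices)
    cut = int(round(train_fraction * num_examples))
    return indices[:cut], indices[cut:]


def select_anchors(val_indices, num_anchors, seed=42):
    """Pick anchors from the validation set without replacement.

    Hypothetical helper: the number of anchors is not given in the source.
    """
    rng = random.Random(seed)
    return rng.sample(val_indices, num_anchors)


# The original MNIST training set has 60,000 images.
train_idx, val_idx = split_train_val(60_000)
anchor_idx = select_anchors(val_idx, num_anchors=300)  # 300 is illustrative
```

Because both helpers build their own seeded generator, calling them again with the same arguments returns the same indices, which is the property a fixed seed is meant to provide.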