Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Identity-Preserving Transformations on Data Manifolds
Authors: Marissa Catherine Connor, Kion Fallah, Christopher John Rozell
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on MNIST and Fashion MNIST highlight our model's ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model's ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner. |
| Researcher Affiliation | Academia | Marissa Connor, Kion Fallah, Christopher J. Rozell, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, (marissa.connor, kion, crozell)@gatech.edu |
| Pseudocode | No | The paper describes methods and algorithms in prose and equations (e.g., Section 3.2, 3.3, and Appendix B), but does not contain any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | Code available at: https://github.com/Sensory-Information-Processing-Lab/manifold-autoencoder-extended. |
| Open Datasets | Yes | We work with three datasets: MNIST (LeCun et al., 1998), Fashion MNIST (Xiao et al., 2017), and CelebA (Liu et al., 2015). |
| Dataset Splits | Yes | We split the MNIST dataset into training, validation, and testing sets. The training set contains 50,000 images from the traditional MNIST training set. The validation set is made up of the remaining 10,000 images. The traditional MNIST testing set is used for our testing set. |
| Hardware Specification | Yes | Hyper-parameter tuning for all experiments was performed on the Georgia Tech Partnership for Advanced Computing Environment (PACE) clusters (PACE, 2017). Experiments were performed using a Nvidia Quadro RTX 6000. Runs training the CAE and β-VAE on CelebA were all run on a separate machine with a Nvidia TITAN RTX. Experiments were run on a machine with an Intel i7-6700 CPU with 4.00 GHz and a Nvidia TITAN RTX. |
| Software Dependencies | No | The paper mentions using "PyTorch implementation of the matrix exponential" and "automatic differentiation" but does not specify version numbers for PyTorch or any other software libraries used. |
| Experiment Setup | Yes | In all experiments, we followed the general training procedure put forth previously (Connor & Rozell, 2020) by separating the network training into three phases: the autoencoder training phase, the transport operator training phase, and the fine-tuning phase. We select training point pairs that are nearest neighbors in the feature space of the final, pre-logit layer of a ResNet-18 (He et al., 2015) classifier pretrained on ImageNet (Russakovsky et al., 2015). After completely training the MAE, we fix the autoencoder network weights and transport operator weights and train the coefficient encoder network with the objective derived in Section 3.2. Additional details on the datasets, network architectures, and training procedure are available in the Appendix. Table 3: Training parameters for the transport operator training phase of the MNIST experiment [includes] batch size: 250, autoencoder training epochs: 300, latent space dimension (z_dim): 10, M: 16, lr_net: 10^-4, ζ: 0.1, γ: 2×10^-6, initialization variance for Ψ: 0.05, number of restarts for coefficient inference: 1, nearest neighbor count: 5, latent scale: 30. |
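For readers checking reproducibility, the Table 3 hyperparameters and the stated 50,000/10,000 MNIST split can be collected into a small script. This is a minimal sketch, not code from the authors' repository: the names `MNIST_TRANSPORT_OP_CONFIG` and `split_mnist_indices` are illustrative, and the comments simply echo the symbols quoted from Table 3.

```python
# Hedged sketch: Table 3 training parameters for the transport operator
# phase of the MNIST experiment, plus the reported train/validation split.
# All values are quoted from the paper; the dict keys are our own naming.

MNIST_TRANSPORT_OP_CONFIG = {
    "batch_size": 250,          # batch size
    "autoencoder_epochs": 300,  # autoencoder training epochs
    "z_dim": 10,                # latent space dimension
    "M": 16,                    # M (number of transport operators)
    "lr_net": 1e-4,             # lr_net
    "zeta": 0.1,                # ζ
    "gamma": 2e-6,              # γ
    "psi_init_var": 0.05,       # initialization variance for Ψ
    "coef_inference_restarts": 1,  # restarts for coefficient inference
    "num_neighbors": 5,         # nearest neighbor count
    "latent_scale": 30,         # latent scale
}


def split_mnist_indices(n_total: int = 60000, n_train: int = 50000):
    """Index-based 50k/10k split of the standard MNIST training set,
    as described in the paper; the standard 10k test set is untouched."""
    train_idx = list(range(n_train))
    val_idx = list(range(n_train, n_total))
    return train_idx, val_idx


train_idx, val_idx = split_mnist_indices()
print(len(train_idx), len(val_idx))  # 50000 10000
```

Note that the paper does not specify whether the 50k/10k split is contiguous or shuffled; the contiguous-index version above is an assumption for illustration.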