Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Learning Identity-Preserving Transformations on Data Manifolds
Authors: Marissa Catherine Connor, Kion Fallah, Christopher John Rozell
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on MNIST and Fashion MNIST highlight our model's ability to learn identity-preserving transformations on multi-class datasets. Additionally, we train on CelebA to showcase our model's ability to learn semantically meaningful transformations on complex datasets in an unsupervised manner. |
| Researcher Affiliation | Academia | Marissa Connor, Kion Fallah, Christopher J. Rozell, School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332, (marissa.connor, kion, crozell)@gatech.edu |
| Pseudocode | No | The paper describes methods and algorithms in prose and equations (e.g., Section 3.2, 3.3, and Appendix B), but does not contain any clearly labeled pseudocode or algorithm blocks with structured steps. |
| Open Source Code | Yes | Code available at: https://github.com/Sensory-Information-Processing-Lab/manifold-autoencoder-extended. |
| Open Datasets | Yes | We work with three datasets: MNIST (LeCun et al., 1998), Fashion MNIST (Xiao et al., 2017), and CelebA (Liu et al., 2015). |
| Dataset Splits | Yes | We split the MNIST dataset into training, validation, and testing sets. The training set contains 50,000 images from the traditional MNIST training set. The validation set is made up of the remaining 10,000 images. The traditional MNIST testing set is used for our testing set. |
| Hardware Specification | Yes | Hyper-parameter tuning for all experiments was performed on the Georgia Tech Partnership for Advanced Computing Environment (PACE) clusters (PACE, 2017). Experiments were performed using a Nvidia Quadro RTX 6000. Runs training the CAE and β-VAE on CelebA were all run on a separate machine with a Nvidia TITAN RTX. Experiments were run on a machine with an Intel i7-6700 CPU with 4.00 GHz and a Nvidia TITAN RTX. |
| Software Dependencies | No | The paper mentions using "PyTorch implementation of the matrix exponential" and "automatic differentiation" but does not specify version numbers for PyTorch or any other software libraries used. |
| Experiment Setup | Yes | In all experiments, we followed the general training procedure put forth previously (Connor & Rozell, 2020) by separating the network training into three phases: the autoencoder training phase, the transport operator training phase, and the fine-tuning phase. We select training point pairs that are nearest neighbors in the feature space of the final, pre-logit layer of a ResNet-18 (He et al., 2015) classifier pretrained on ImageNet (Russakovsky et al., 2015). After completely training the MAE, we fix the autoencoder network weights and transport operator weights and train the coefficient encoder network with the objective derived in Section 3.2. Additional details on the datasets, network architectures, and training procedure are available in the Appendix. Table 3: Training parameters for the transport operator training phase of the MNIST experiment [includes] batch size: 250, autoencoder training epochs: 300, latent space dimension (z_dim): 10, M: 16, lr_net: 10^-4, ζ: 0.1, γ: 2×10^-6, initialization variance for Ψ: 0.05, number of restarts for coefficient inference: 1, nearest neighbor count: 5, latent scale: 30. |
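For readers checking reproducibility, the Table 3 hyperparameters and the stated 50,000/10,000 MNIST split can be collected into a small script. This is a minimal sketch, not code from the authors' repository: the names `MNIST_TRANSPORT_OP_CONFIG` and `split_mnist_indices` are illustrative, and the comments simply echo the symbols quoted from Table 3.

```python
# Hedged sketch: Table 3 training parameters for the transport operator
# phase of the MNIST experiment, plus the reported train/validation split.
# All values are quoted from the paper; the dict keys are our own naming.

MNIST_TRANSPORT_OP_CONFIG = {
    "batch_size": 250,          # batch size
    "autoencoder_epochs": 300,  # autoencoder training epochs
    "z_dim": 10,                # latent space dimension
    "M": 16,                    # M (number of transport operators)
    "lr_net": 1e-4,             # lr_net
    "zeta": 0.1,                # ζ
    "gamma": 2e-6,              # γ
    "psi_init_var": 0.05,       # initialization variance for Ψ
    "coef_inference_restarts": 1,  # restarts for coefficient inference
    "num_neighbors": 5,         # nearest neighbor count
    "latent_scale": 30,         # latent scale
}


def split_mnist_indices(n_total: int = 60000, n_train: int = 50000):
    """Index-based 50k/10k split of the standard MNIST training set,
    as described in the paper; the standard 10k test set is untouched."""
    train_idx = list(range(n_train))
    val_idx = list(range(n_train, n_total))
    return train_idx, val_idx


train_idx, val_idx = split_mnist_indices()
print(len(train_idx), len(val_idx))  # 50000 10000
```

Note that the paper does not specify whether the 50k/10k split is contiguous or shuffled; the contiguous-index version above is an assumption for illustration.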