Neural Isometries: Taming Transformations for Equivariant ML
Authors: Thomas Mitchel, Michael J. Taylor, Vincent Sitzmann
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally validate two principal claims regarding the efficacy and applicability of our approach: Neural Isometries recover a general-purpose latent space in which challenging symmetries in the observation space can be reduced to compact, tractable maps in the latent space. We show that this can be exploited by simple isometry-equivariant networks to achieve results on par with leading hand-crafted equivariant networks in tasks with complex non-linear symmetries. In this section, we provide empirical evidence through experiments that NIso 1) recovers a general-purpose latent space that can be exploited by isometry-equivariant networks to handle challenging symmetries (5.1, 5.2); and 2) encodes information about transformations in world space through the construction of isometric maps in the latent space, from which geometric quantities such as camera poses can be directly regressed (5.3). A minimal sketch of this latent-isometry idea follows the table. |
| Researcher Affiliation | Collaboration | Thomas W. Mitchel (PlayStation, tommy.mitchel@sony.com); Michael Taylor (PlayStation, mike.taylor@sony.com); Vincent Sitzmann (MIT, sitzmann@mit.edu) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. Figure 2 provides a high-level overview diagram, not an algorithm. |
| Open Source Code | Yes | Code and experiments are available at https://github.com/vsitzmann/neural-isometries. |
| Open Datasets | Yes | The paper uses well-known public datasets such as "ImageNet [43]", the "homNIST dataset [6]", the "augmented SHREC '11 dataset [7, 46]", and the "CO3Dv2 dataset [49]", all of which are cited. |
| Dataset Splits | Yes | For Homography-Perturbed MNIST: "pre-training is performed by randomly sampling homographies... applied to the elements of the standard MNIST training set... trained only on the original (unperturbed) MNIST training set and evaluated on the perturbed test set." For Conformal Shape Classification: "train and evaluation splits are randomly generated by selecting respectively 10 and 4 of the 20 sets of conformally augmented shapes per class." For CO3Dv2: "An evaluation set is formed by withholding 10% of said trajectories, with the rest used for training." A minimal trajectory-split sketch follows the table. |
| Hardware Specification | Yes | All experiments were performed on a single NVIDIA A6000 GPU with 48 GB of memory. |
| Software Dependencies | No | The paper mentions the "AdamW optimizer [53]" but does not specify version numbers for any programming languages (e.g., Python), deep learning frameworks (e.g., PyTorch, TensorFlow), or other significant libraries used for the implementation. |
| Experiment Setup | Yes | The paper provides specific hyperparameters and training settings for each experiment. For Homography-Perturbed MNIST: "trained with respect to the composite loss with α = 0.5 and β = 0.1. The autoencoders are trained without spectral dropout for 50,000 steps with a batch size of 16 using the AdamW optimizer with a weight decay of 10⁻⁴. The learning rate follows a schedule consisting of a 2,000-step warm-up from 0.0 to 5 × 10⁻⁴ and afterwards decays to 5 × 10⁻⁵ via cosine annealing." This schedule is sketched in code after the table. |
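For intuition on the first claim (symmetries in observation space reduce to isometric maps in latent space), the sketch below shows the standard closed-form way to recover the best-fitting orthogonal map between two latent feature matrices (orthogonal Procrustes via the SVD). This is an illustrative reconstruction of the general idea, not the authors' implementation; the latent shapes and the synthetic "encoded" features are hypothetical placeholders.

```python
import numpy as np

def best_fit_isometry(Z_a: np.ndarray, Z_b: np.ndarray) -> np.ndarray:
    """Orthogonal Procrustes: the isometry Omega minimizing
    ||Omega @ Z_a - Z_b||_F over orthogonal matrices, in closed form."""
    U, _, Vt = np.linalg.svd(Z_b @ Z_a.T)
    return U @ Vt  # Omega is orthogonal: Omega @ Omega.T = I

# Hypothetical latent features for two related observations:
# rows index latent channels, columns index basis coefficients.
rng = np.random.default_rng(0)
Z_a = rng.normal(size=(64, 128))               # stand-in for encode(x)
R = np.linalg.qr(rng.normal(size=(64, 64)))[0]  # a random orthogonal map
Z_b = R @ Z_a                                   # stand-in for encode(g . x)

Omega = best_fit_isometry(Z_a, Z_b)
print(np.allclose(Omega @ Z_a, Z_b))            # True: the map is recovered
```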
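As a concrete reading of the CO3Dv2 split description ("withholding 10% of said trajectories"), here is a minimal split sketch. Splitting at the trajectory level keeps all frames of a sequence on the same side of the split; the ID format, seed, and sampling procedure are assumptions, since the paper does not specify them.

```python
import random

def split_trajectories(trajectory_ids, eval_frac=0.10, seed=0):
    """Withhold eval_frac of trajectories for evaluation; train on the rest."""
    ids = sorted(trajectory_ids)
    random.Random(seed).shuffle(ids)
    n_eval = max(1, round(eval_frac * len(ids)))
    return ids[n_eval:], ids[:n_eval]  # (train_ids, eval_ids)

train_ids, eval_ids = split_trajectories([f"seq_{i:04d}" for i in range(500)])
print(len(train_ids), len(eval_ids))  # 450 50
```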
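The quoted training configuration maps directly onto a standard warm-up-plus-cosine-decay schedule. Below is a minimal sketch using Optax; note this assumes a JAX/Optax setup, which the paper does not confirm (the Software Dependencies row above notes that no framework is named).

```python
import optax  # assumption: a JAX/Optax training setup

# Quoted settings: 2,000-step warm-up from 0.0 to 5e-4, then cosine
# annealing down to 5e-5 over 50,000 total steps; AdamW, weight decay 1e-4.
schedule = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=5e-4,
    warmup_steps=2_000,
    decay_steps=50_000,  # total step count, including the warm-up
    end_value=5e-5,
)
optimizer = optax.adamw(learning_rate=schedule, weight_decay=1e-4)

# The schedule is a callable from step index to learning rate:
for step in (0, 2_000, 50_000):
    print(step, float(schedule(step)))
```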