Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Equivariance by Contrast: Identifiable Equivariant Embeddings from Unlabeled Finite Group Actions

Authors: Tobias Schmidt, Steffen Schneider, Matthias Bethge

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In 4–6, we evaluate EbC on synthetic and structured vision datasets, including finite product groups G := (Rm ⋊ Zn ⋊ Zn) and non-abelian groups such as O(n) and GL(n). EbC achieves high-fidelity equivariant embeddings across diverse settings. We apply EbC on a variety of group learning settings, summarized in Table 1.
Researcher Affiliation | Academia | Tobias Schmidt (1,2), Steffen Schneider (1,2,3) and Matthias Bethge (3). (1) Institute of Computational Biology, Helmholtz Munich; (2) Munich Center for Machine Learning (MCML); (3) Tübingen AI Center. Co-corresponding authors: EMAIL, EMAIL.
Pseudocode | No | The paper describes the algorithm and objective function in Section 2, accompanied by Figure 2 and Figure 3, which illustrate the approach and training objective. However, there is no explicitly labeled 'Pseudocode' or 'Algorithm' block or figure presenting structured steps in a code-like format.
Open Source Code | Yes | The code for our paper is available at https://github.com/dynamical-inference/ebc. During the phase of double-blind review, we provided an anonymized version of the codebase. For additional pointers, see Appendix B.
Open Datasets | Yes | We leverage infinite dSprites [8, idSprites], an extension of dSprites [33] for validation on more diverse visual data. We used the infinite dSprites (idSprites; 8) dataset available at https://github.com/sbdzdz/idsprites in the pip-installable version v1.0.1 (MIT License).
Dataset Splits | Yes | We perform 80/10/10 train/valid/test splits. Before training on the synthetic group dataset, we perform a hold-out split in terms of the group actions gi into a training, validation, and test dataset such that we get a 80/10/10 split of the group actions. For idSprites, we randomly sample approx. 20k group actions during training and sample another approx. 90k group actions for validation and test respectively.
Hardware Specification | Yes | Experiments were carried out on a compute cluster with A100 cards with 40Gb VRAM.
Software Dependencies | No | We fit the implicit representation R̂(Y, Y) using the gels least squares solver in PyTorch. To sample from the O(n) and SO(n) group, we sample from the Haar distribution [34], making use of the SciPy [46] package. We train the model for 20k with Adam [25] with learning rate 10⁻³.
Experiment Setup | Yes | We use a three layer MLP with 512 hidden units for ϕ. We fit the implicit representation R̂(Y, Y) using the gels least squares solver in PyTorch. By default, we use 84 sample pairs for idSprites and 12 for the synthetic data in Y. Each batch has 1024 positive and 16k negative samples. We train the model for 20k with Adam [25] with learning rate 10⁻³.
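
The hold-out split over group actions quoted under Dataset Splits can be illustrated with a short sketch. This is not the authors' code: the function name `split_group_actions`, the fixed seed, and the integer ids standing in for the group actions g_i are assumptions made only for illustration. It shows one way to obtain an 80/10/10 partition in which no group action appears in more than one subset.

```python
import random


def split_group_actions(actions, fractions=(0.8, 0.1, 0.1), seed=0):
    """Shuffle group actions and split them into train/valid/test subsets.

    Hypothetical helper: the split is over the actions themselves, so each
    action lands in exactly one subset.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")

    shuffled = list(actions)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability

    n = len(shuffled)
    n_train = int(round(fractions[0] * n))
    n_valid = int(round(fractions[1] * n))

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # remainder goes to the test set
    return train, valid, test


# Assumption: integer ids stand in for 1000 sampled group actions g_i.
action_ids = list(range(1000))
train, valid, test = split_group_actions(action_ids)
print(len(train), len(valid), len(test))
```

Splitting at the level of group actions, rather than individual samples, means that validation and test measure behavior on actions not seen during training, which matches the hold-out description quoted above.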