Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models

Authors: Hafez Ghaemi, Eilif B. Muller, Shahab Bakhtiari

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, seq-JEPA demonstrates strong performance on both equivariance- and invariance-demanding downstream tasks without sacrificing one for the other. Furthermore, it excels at tasks that inherently require aggregating a sequence of observations, such as path integration across actions and predictive learning across eye movements.
Researcher Affiliation	Academia	Hafez Ghaemi (1,2,3), Eilif B. Muller (1,2,3), Shahab Bakhtiari (1,2). Affiliations: (1) Université de Montréal, (2) Mila Quebec AI Institute, (3) CHU Sainte-Justine. Correspondence to EMAIL. Equal Contribution.
Pseudocode	No	The paper describes the architecture and training procedure in text and mathematical equations, but does not include any specific section or figure labeled 'Pseudocode' or 'Algorithm'.
Open Source Code	Yes	Project Page Code
Open Datasets	Yes	STL10 Saliency, ImageNet-1k Saliency, 3DIEBench-OOD. We use CIFAR100 and Tiny ImageNet for experiments in this setup, and follow EquiMod's augmentation protocol [Devillers and Lefort, 2022]. The 3DIEBench dataset [Garrido et al., 2023] is designed to evaluate representational invariance and equivariance.
Dataset Splits	Yes	The training set consists of 100000 ImageNet-1k images from 200 classes (500 for each class) downsized to 64 × 64. The validation set has 50 images per class. For linear probing, we follow a common SSL protocol and train a linear classifier on top of frozen representations with a batch size of 256 for 300 epochs.
Hardware Specification	Yes	Each experiment was run on a single NVIDIA A100 GPU with 40GB of accelerator RAM.
Software Dependencies	No	We used the PyTorch framework for training all models. (No specific version numbers are provided for PyTorch or other software dependencies.)
Experiment Setup	Yes	All models are trained from scratch with a batch size of 512. We use 1000 epochs for 3DIEBench and 2000 epochs for other datasets to obtain asymptotic performance. We use AdamW for models with transformer projectors (including seq-JEPA)... with default β1 and β2, a weight decay of 0.001, and a learning rate of 4 × 10^-4 with a linear warmup for 20 epochs starting from 10^-5 followed by a cosine decay back to 10^-5.
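
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with a short sketch. This is not the authors' code. The function name `lr_at_epoch`, its parameter names, the per-epoch (rather than per-step) granularity, and the exact half-cosine form are all assumptions made here for illustration. Only the numeric values (peak 4 × 10^-4, floor 10^-5, 20 warmup epochs, 2000 total epochs) come from the quoted text.

```python
import math


def lr_at_epoch(epoch, total_epochs=2000, warmup_epochs=20,
                peak_lr=4e-4, floor_lr=1e-5):
    """Hypothetical helper: learning rate at the start of a given epoch.

    Assumes per-epoch updates and a half-cosine decay; the quoted text
    does not specify either detail.
    """
    if epoch < warmup_epochs:
        # Linear warmup from floor_lr up to peak_lr.
        frac = epoch / warmup_epochs
        return floor_lr + frac * (peak_lr - floor_lr)
    # Cosine decay from peak_lr back down to floor_lr.
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    progress = min(progress, 1.0)
    return floor_lr + 0.5 * (peak_lr - floor_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the rate at a few points of a 2000-epoch run.
    for e in (0, 10, 20, 1010, 2000):
        print(e, lr_at_epoch(e))
```

For the 3DIEBench runs, which the quoted text says use 1000 epochs, the same sketch would be called with `total_epochs=1000`. In PyTorch, a comparable schedule could be assembled from the built-in schedulers, but the page does not state which mechanism was actually used.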