Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Quantum Doubly Stochastic Transformers

Authors: Jannis Born, Filip Skogh, Kahn Rhrissorrakrai, Filippo Utro, Nico Wagner, Aleksandros Sobczyk

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We then train various flavors of doubly stochastic Transformers (see Figure 1) on more than ten object recognition datasets. In comparison to the ViT [3] and Sinkformer [31], the QDSFormer shows competitive performance, consistently surpassing both.
Researcher Affiliation	Industry	Jannis Born Filip Skogh Kahn Rhrissorrakrai Filippo Utro Nico Wagner Aleksandros Sobczyk IBM Research Correspondence to: EMAIL
Pseudocode	No	The paper describes methods and operators (Sinkhorn's algorithm, Projection on the Birkhoff polytope, QontOT, QR Decomposition) in prose, but does not present them in structured pseudocode or algorithm blocks.
Open Source Code	No	While the entire development codebase for this project unfortunately cannot be made public at this point, specific parts of the code are available upon justified request.
Open Datasets	Yes	We evaluate all ViTs on MNIST [59], Fashion MNIST [60], seven datasets from the MedMNIST benchmark [61] and a compositional task requiring multistep reasoning [25]. ... On the infrared spectral data of molecules from Alberts et al. [63] ... Table A4: Summary of datasets used, with references, licenses, and sizes.
Dataset Splits	Yes	The dataset is split into 60K (10K) training (validation) examples. ... All studied imaging datasets (MNIST, Fashion MNIST and seven types of MedMNIST datasets) come with predefined train/validation/test splits. ... On the infrared spectral data of molecules from Alberts et al. [63] we performed a 5-fold cross validation with 80%/20% train/test split.
Hardware Specification	Yes	Experiments were conducted on POWER8 infrastructure in Python 3.9 with PyTorch [69] 1.13.1 on machines with 16 cores of 32GiB RDIMM DDR4 2.7 GHz. ... We used the three machines Torino (Heron R1, 133 qubits, error per layered gate: 1.3%), Brisbane (Eagle R3, 127 qubits, EPLG: 2.2%) and Cusco (Eagle R3, 127 qubits, EPLG: 6.8%)
Software Dependencies	Yes	Experiments were conducted on POWER8 infrastructure in Python 3.9 with PyTorch [69] 1.13.1
Experiment Setup	Yes	For experiments on MNIST, Fashion MNIST and the seven MedMNIST datasets, the ViT was configured with a hidden dimension of 128 and an MLP dimension expansion factor of 1. The model was tested with 1 to 4 Transformer layers, each containing a single attention head. No dropout was applied, and the batch size was set to 100. For the optimizer, Adam [66] was used, and the learning rate schedule followed the setup in Sander et al. [31], with an initial learning rate of 5e-4, decreasing by a factor of 10 at epochs 31 and 45. For the more complex Eureka dataset, comprised of 56x56 RGB images, the hidden dimension was increased to 256, and a larger batch size of 512 was used. The MLP expansion factor was also doubled to 2. A cosine learning rate schedule was used with the optimizer AdamW [67]. The scheduler uses 5 warmup epochs with a warm-up learning rate of 1e-6, the decay rate is set to 0.1 and the minimum learning rate is 1e-5; the other parameters follow the default TIMM settings [68]. For the optimizer the weight decay is 0.05 and betas (0.9, 0.999).
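
The step learning rate schedule quoted in the Experiment Setup row (initial rate 5e-4, reduced by a factor of 10 at epochs 31 and 45) can be sketched in plain Python as below. This is an illustrative sketch only, not code from the paper: the function name `step_lr` and its parameters are hypothetical, and it assumes that each reduction takes effect at the start of the named epoch, which the quoted text does not specify.

```python
def step_lr(epoch, base_lr=5e-4, milestones=(31, 45), gamma=0.1):
    """Return the learning rate for `epoch` under a step-decay schedule.

    Assumption (not stated in the source): the rate drops at the start
    of each milestone epoch, so epoch 31 already uses the reduced rate.
    """
    # Count how many milestone epochs have been reached so far.
    n_drops = sum(1 for m in milestones if epoch >= m)
    # Apply one factor-of-`gamma` reduction per milestone reached.
    return base_lr * (gamma ** n_drops)


if __name__ == "__main__":
    # Print the rate just before and at each milestone.
    for epoch in (0, 30, 31, 44, 45):
        print(epoch, step_lr(epoch))
```

If PyTorch is used, as the Software Dependencies row indicates, `torch.optim.lr_scheduler.MultiStepLR` with `milestones=[31, 45]` and `gamma=0.1` expresses the same kind of schedule; whether the authors used that class is not stated.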