Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Measure-Theoretic Anti-Causal Representation Learning

Authors: Arman Behnam, Binghui Wang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on synthetic and real-world medical datasets demonstrate that ACIA consistently outperforms state-of-the-art methods in both accuracy and invariance metrics. Furthermore, our theoretical results establish tight bounds on performance gaps between training and unseen environments, confirming the efficacy of our approach for robust anti-causal learning.
Researcher Affiliation	Academia	Arman Behnam, Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois, USA, EMAIL; Binghui Wang, Department of Computer Science, Illinois Institute of Technology, Chicago, Illinois, USA, EMAIL
Pseudocode	Yes	The complete ACIA algorithm includes three components (see details in Alg. 1-Alg. 3). i) Alg. 1 constructs the low-level representation ϕL by building causal spaces for each environment ei ∈ E and their product spaces, computing causal kernels KS, and ultimately outputting the low-level causal dynamics ZL = X, Q, KL as established in Thm. 3. ii) Alg. 2 takes the set of low-level representations ϕL = {ZLk} (k = 1, ..., K) output by Alg. 1 and constructs the high-level abstraction ZH = VH, KH by integrating kernels across the low-level representation domain DZL as derived in Thm. 4. iii) Alg. 3 integrates both algorithms by taking the outputs ϕL and ϕH as inputs, implementing the core optimization procedure defined in Eqn. 5.
Open Source Code	Yes	Code is available at https://github.com/ArmanBehnam/ACIA.
Open Datasets	Yes	We test four datasets in anti-causal settings: Colored MNIST (CMNIST), Rotated MNIST (RMNIST), Ball Agent [9], and Camelyon17 [6].
Dataset Splits	Yes	Table 9: Dataset configuration details and their properties. Columns: Dataset | Type | Environments | Dimensionality | Label Space | Spurious Corr. | Train Size | Test Size. Rows: CMNIST | Synthetic | 2 (e1, e2) | 28x28x3 | {0,...,9} | Color-digit | 60,000 | 10,000; RMNIST | Synthetic | 2 train (15°, 75°), 3 test (30°, 45°, 60°) | 28x28x3 | {0,...,9} | Rotation-digit | 60,000 | 10,000; Ball Agent | Synthetic | 4 balls with interventions | 64x64x3 | [0, 1]^{2n} | Coord-coupling | 15,000 | 5,000; Camelyon17 | Real | 3 train (hospitals 0-2), 2 test (hospitals 3-4) | 96x96x3 | {0,1} | Hospital-stain | 50,916 | 33,944.
Hardware Specification	Yes	Table 10 reports the runtime on a laptop with a single GPU (3.3 GHz, 32 GB RAM), using precision ϵ = 0.01 and failure probability δ = 0.05.
Software Dependencies	No	For all datasets, the batch size is 32, the optimizer is Adam, the learning rate is 1e-4, and we use early stopping for Camelyon17 to avoid overfitting in the results.
Experiment Setup	Yes	Hyperparameter setting: For all datasets, the batch size is 32, the optimizer is Adam, the learning rate is 1e-4, and we use early stopping for Camelyon17 to avoid overfitting in the results. In our regularizer, we chose λ1 as 0.1/√batch_size (≈ 0.0177) for CMNIST, RMNIST, and Ball Agent, and chose 0.5 for Camelyon17. In addition, we chose λ2 as 0.5/√batch_size (≈ 0.0884) for CMNIST, RMNIST, and Ball Agent, and chose 0.1 for Camelyon17.
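
The regularizer weights quoted in the Experiment Setup row can be checked with a few lines of arithmetic. The sketch below is not taken from the authors' code: the function name `regularizer_weights` and the constant names are hypothetical, and it assumes the divisor is the square root of the batch size, since 0.1/√32 ≈ 0.0177 and 0.5/√32 ≈ 0.0884 match the quoted values.

```python
import math
from typing import Tuple

# Values quoted in the Experiment Setup row.
BATCH_SIZE = 32
LEARNING_RATE = 1e-4

# Datasets whose weights are scaled by the batch size (per the quoted text).
SCALED_DATASETS = {"CMNIST", "RMNIST", "Ball Agent"}


def regularizer_weights(dataset: str, batch_size: int = BATCH_SIZE) -> Tuple[float, float]:
    """Return (lambda1, lambda2) for a dataset.

    Hypothetical helper: assumes the scaling is 1/sqrt(batch_size),
    which reproduces the quoted 0.0177 and 0.0884 for batch size 32.
    """
    if dataset == "Camelyon17":
        # Fixed values quoted for Camelyon17.
        return 0.5, 0.1
    if dataset in SCALED_DATASETS:
        root = math.sqrt(batch_size)
        return 0.1 / root, 0.5 / root
    raise ValueError(f"No quoted setting for dataset: {dataset!r}")


if __name__ == "__main__":
    for name in sorted(SCALED_DATASETS | {"Camelyon17"}):
        lam1, lam2 = regularizer_weights(name)
        print(f"{name}: lambda1={lam1:.4f}, lambda2={lam2:.4f}")
```

Note that the relative size of the two weights flips between settings: λ1 is smaller than λ2 for the three synthetic datasets but larger for Camelyon17.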