Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
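The validation step mentioned above (comparing LLM-assigned labels to a manually labeled dataset) can be sketched as a simple per-variable agreement computation. This is a hypothetical illustration, not the pipeline from [1]; all names, labels, and data below are invented for the example.

```python
# Hedged sketch (not the actual pipeline from [1]): computing per-variable
# agreement between LLM-assigned reproducibility labels and manual labels.
# Keys are hypothetical (paper_id, variable) pairs; values are labels.
from collections import defaultdict

def per_variable_accuracy(llm_labels, manual_labels):
    """Fraction of papers where the LLM label matches the manual label,
    computed separately for each reproducibility variable."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for (paper, variable), llm_value in llm_labels.items():
        manual_value = manual_labels.get((paper, variable))
        if manual_value is None:
            continue  # this pair was not manually annotated
        totals[variable] += 1
        if llm_value == manual_value:
            hits[variable] += 1
    return {v: hits[v] / totals[v] for v in totals}

# Toy data: the LLM disagrees with the manual label on one of three pairs.
llm = {("paper1", "Open Source Code"): "No",
       ("paper2", "Open Source Code"): "Yes",
       ("paper1", "Pseudocode"): "No"}
manual = {("paper1", "Open Source Code"): "No",
          ("paper2", "Open Source Code"): "No",
          ("paper1", "Pseudocode"): "No"}

print(per_variable_accuracy(llm, manual))
# {'Open Source Code': 0.5, 'Pseudocode': 1.0}
```

A real validation would report such agreement rates (and likely further metrics) per variable over the full manually labeled set, which is what the accuracy figures in [1] summarize.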

Invariance & Causal Representation Learning: Prospects and Limitations

Authors: Simon Bing, Tom Hochsprung, Jonas Wahl, Urmi Ninad, Jakob Runge

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable: Research Type
Result: Theoretical
LLM Response: We establish impossibility results which show that invariance alone is insufficient to identify latent causal variables. Together with practical considerations, we use our results to reflect generally on the commonly used notion of identifiability in causal representation learning and potential adaptations of this goal moving forward. Given the theoretical impossibility results outlined in Sections 4.2 and 4.3, along with the practical limitations of learning nonlinear functions that generalize outside of the training data detailed in Section 4.4, the framework of utilizing invariance as a learning signal for causal representations displays an apparent mismatch between assumptions and the defined goal of identifiability.

Reproducibility Variable: Researcher Affiliation
Result: Collaboration
LLM Response:
- Simon Bing (EMAIL): Technische Universität Berlin
- Tom Hochsprung: German Aerospace Center (DLR), Institute of Data Science; Technische Universität Berlin
- Jonas Wahl: Technische Universität Berlin; German Aerospace Center (DLR), Institute of Data Science
- Urmi Ninad: Technische Universität Berlin; German Aerospace Center (DLR), Institute of Data Science
- Jakob Runge: ScaDS.AI Dresden/Leipzig, TU Dresden; German Aerospace Center (DLR), Institute of Data Science; Technische Universität Berlin

Reproducibility Variable: Pseudocode
Result: No
LLM Response: The paper primarily presents mathematical proofs and theoretical analyses (e.g., Lemma 1, Lemma 2, Theorem 1, Theorem 2) and does not include any clearly labeled pseudocode or algorithm blocks. The proofs are presented in a structured mathematical format, not as programming pseudocode.

Reproducibility Variable: Open Source Code
Result: No
LLM Response: The paper does not contain any explicit statements about releasing code, nor does it provide a link to a code repository for the methodology described.

Reproducibility Variable: Open Datasets
Result: No
LLM Response: The paper is theoretical in nature and does not describe or use any specific datasets for empirical evaluation, nor does it provide access information for any open datasets. The text refers to a 'data generating process' for theoretical modeling, but not actual experimental data.

Reproducibility Variable: Dataset Splits
Result: No
LLM Response: The paper is a theoretical work that does not involve empirical experiments with datasets; therefore, no dataset splits are mentioned.

Reproducibility Variable: Hardware Specification
Result: No
LLM Response: As a theoretical paper, there are no experimental results or computational benchmarks that would require the specification of hardware used.

Reproducibility Variable: Software Dependencies
Result: No
LLM Response: The paper does not detail any experimental implementation or computational work; thus, no software dependencies with version numbers are provided.

Reproducibility Variable: Experiment Setup
Result: No
LLM Response: Being a theoretical study, the paper does not describe any practical experiments or computational models that would require detailing an experimental setup or hyperparameters.