Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Characterizing the Trade-off in Invariant Representation Learning

Authors: Bashir Sadeghi, Sepehr Dehdashtian, Vishnu Boddeti

TMLR 2022 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We also numerically quantify the trade-off on representative problems and compare them to those achieved by baseline IRepL algorithms. Code is available at https://github.com/human-analysis/tradeoff-invariant-representation-learning. [...] In this section, we numerically quantify our K-TOpt through the closed-form solution for the encoder obtained in Section 5 on an illustrative toy example and two real-world datasets, Folktables and CelebA.
Researcher Affiliation Academia Bashir Sadeghi, Sepehr Dehdashtian, Vishnu Naresh Boddeti, Department of Computer Science and Engineering, Michigan State University
Pseudocode No The paper describes mathematical derivations and theoretical framework for the optimization problem and its solution, but does not present any structured pseudocode or algorithm blocks.
Open Source Code Yes Code is available at https://github.com/human-analysis/tradeoff-invariant-representation-learning.
Open Datasets Yes We numerically quantify our K-TOpt through the closed-form solution for the encoder obtained in Section 5 on an illustrative toy example and two real-world datasets, Folktables (Ding et al., 2021) and CelebA (Liu et al., 2015).
Dataset Splits Yes We sample 18,000 instances from p_{X,Y,S} independently and split these samples equally into training, validation, and testing partitions. [...] We randomly split the data into training (70%), validation (15%), and testing (15%) partitions. [...] CelebA dataset (Liu et al., 2015) contains 202,599 face images of 10,177 different celebrities with standard training, validation, and testing splits.
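The splitting procedures quoted above (equal thirds for the Gaussian toy example, 70/15/15 for Folktables) can be sketched generically. This is our illustrative stand-in, not the paper's released code; the function name and seed handling are assumptions.

```python
import random

def train_val_test_split(items, fractions=(0.70, 0.15, 0.15), seed=0):
    """Randomly partition items into train/val/test by the given fractions.

    A generic sketch of the random splits described above; the paper's
    repository may implement this differently.
    """
    assert abs(sum(fractions) - 1.0) < 1e-9, "fractions must sum to 1"
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for a reproducible split
    n = len(items)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

# Equal thirds, as for the 18,000 Gaussian toy-example samples:
train, val, test = train_val_test_split(range(18_000), fractions=(1/3, 1/3, 1/3))
```

With the default `fractions=(0.70, 0.15, 0.15)` the same helper reproduces the Folktables-style split.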
Hardware Specification No The paper does not provide specific details about the hardware (GPU, CPU, memory, etc.) used for running the experiments.
Software Dependencies No The paper mentions using MLPs, AdamW optimizer, and stochastic gradient descent, but it does not specify any software versions (e.g., Python, PyTorch, TensorFlow, or specific library versions) for these components.
Experiment Setup Yes For all methods, we pick different values of λ (100 λs for the Gaussian toy example and 70 λs for Folktables and CelebA datasets) between zero and one for obtaining the utility-invariance trade-off. [...] We optimize the regularization parameter γ in the disentanglement set (10) by minimizing the corresponding target losses over γs in {10^-6, 10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 1} on validation sets. [...] The final RFF dimensionality is 100 for the Gaussian dataset, 5000 for the Folktables dataset, and 1000 for the CelebA dataset. [...] We use a batch size of 500 for Gaussian data; and 128 for Folktables and CelebA. Then, the corresponding learning rates are optimized over {10^-2, 10^-3, 5×10^-4, 10^-4, 10^-5} by minimizing the target loss on the corresponding validation sets.
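The grid searches quoted above (γ over seven values, learning rate over five) amount to picking the grid point with the lowest validation loss. A minimal sketch, assuming a `val_loss` callable that stands in for training and evaluating one model; that callable and the function name are ours, not the paper's.

```python
# Grids taken from the quoted setup; 5e-4 is the "5×10^-4" entry.
GAMMA_GRID = [1e-6, 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
LR_GRID = [1e-2, 1e-3, 5e-4, 1e-4, 1e-5]

def select_by_validation(grid, val_loss):
    """Return the grid value whose validation loss is smallest.

    `val_loss` maps one hyperparameter value to a scalar loss; in the
    paper's setting it would train a model and evaluate it on the
    validation partition.
    """
    return min(grid, key=val_loss)

# Toy stand-in loss whose minimum happens to sit at gamma = 1e-3:
best_gamma = select_by_validation(GAMMA_GRID, lambda g: abs(g - 1e-3))
```

The same helper applies unchanged to `LR_GRID` with a loss evaluated at each learning rate.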