Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression
Authors: Megh Shukla, Aziz Shameem, Mathieu Salzmann, Alexandre Alahi
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments over a wide range of synthetic and real datasets demonstrate that the proposed 2-Wasserstein bound coupled with pseudo-label annotations results in computationally cheaper yet accurate deep heteroscedastic regression. Our results show that the proposed self-supervised framework is (1) computationally cheaper and (2) retains accuracy with respect to the state-of-the-art. Finally, our experiments on human pose also highlight the possibility of using self-supervised and unsupervised objectives to further improve optimization. |
| Researcher Affiliation | Academia | Megh Shukla (EPFL); Aziz Shameem (IIT Bombay); Mathieu Salzmann (SDSC and EPFL); Alexandre Alahi (EPFL) |
| Pseudocode | Yes | Algorithm 1: Covariance Pseudo-Label |
| Open Source Code | Yes | Our code is available at our project page, https://deep-regression.github.io. The code comes complete with a docker image and documentation for reproducibility. |
| Open Datasets | Yes | We use the same setup as Shukla et al. (2024) which experiments on univariate sinusoidals, synthetic multivariate data, UCI Machine Learning repository (Markelle Kelly) and 2D human pose estimation (Andriluka et al., 2014; Johnson & Everingham, 2010; 2011). We perform our experiments on the MPII (Andriluka et al., 2014) and LSP/LSPET (Johnson & Everingham, 2010; 2011) datasets, with the latter focusing on poses related to sports. |
| Dataset Splits | No | For each of the twelve datasets, 25% of the features are randomly selected as inputs, with the remaining 75% used as multivariate targets at run-time. We conduct five trials and report the mean. We merge the MPII and LSP-LSPET datasets to increase the sample size. The paper describes an input/target feature split for the UCI datasets and mentions merging the human pose datasets, but it does not provide explicit percentages, counts, or citations for the train/validation/test splits needed to reproduce the data partitioning for evaluation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running the experiments are mentioned in the paper. The acknowledgement section mentions 'RCP, EPFL for their support in compute' but does not specify the hardware. |
| Software Dependencies | No | The paper states that the code comes with a 'docker image and documentation for reproducibility' but does not explicitly list specific software dependencies with their version numbers (e.g., Python, PyTorch, CUDA versions) within the main text or appendices. |
| Experiment Setup | Yes | For all our experiments, we set the nearest-neighbors hyperparameter in the pseudo-label algorithm to ten times the dimensionality of the target. All methods are trained using the AdamW optimizer, which implicitly imposes a weight decay of 0.01 on the parameters. For synthetic univariate data, we use four hidden layers with a latent dimension of fifty. For multivariate synthetic data, we use ten hidden layers with a latent dimension equal to the square of the input dimensionality. The pose estimator is trained using the Adam optimizer with a ReduceLROnPlateau learning rate scheduler for 100 epochs, with the learning rate set to 1e-3. Two augmentations, Shift+Scale+Rotate and horizontal flip, are applied. |
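The pseudo-label step referenced above (Algorithm 1, with the neighbor count set to ten times the target dimensionality) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of Euclidean distances in input space, and the plain sample-covariance estimator are all assumptions.

```python
# Hedged sketch: covariance pseudo-labels from nearest neighbors.
# Assumptions (not confirmed by the paper's code): neighbors are found by
# Euclidean distance in input space, and the pseudo-label is the unbiased
# sample covariance of the neighbors' targets.
import numpy as np

def covariance_pseudo_labels(x, y, k=None):
    """For each sample, estimate a target covariance from the targets of
    its k nearest neighbors in input space.

    x : (n, p) array of inputs
    y : (n, d) array of targets
    k : neighbor count; defaults to 10 * d per the reported setup
    """
    n, d = y.shape
    if k is None:
        k = 10 * d  # heuristic from the experiment setup
    # pairwise squared Euclidean distances between inputs
    dists = ((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    labels = np.empty((n, d, d))
    for i in range(n):
        idx = np.argsort(dists[i])[:k]         # k nearest neighbors (incl. self)
        labels[i] = np.cov(y[idx], rowvar=False)  # d x d sample covariance
    return labels
```

Each pseudo-label is symmetric positive semi-definite by construction, so it can supervise a predicted covariance directly (e.g., through the 2-Wasserstein bound discussed in the paper) without requiring gradients through a matrix inverse.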