Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
A Unifying Perspective on Multi-Calibration: Game Dynamics for Multi-Objective Learning
Authors: Nika Haghtalab, Michael Jordan, Eric Zhao
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we study the empirical performance of multicalibration algorithms on the UCI Adult Income dataset [26], a real-world dataset for predicting individuals' incomes based on the US Census. |
| Researcher Affiliation | Academia | Nika Haghtalab, Michael I. Jordan, and Eric Zhao, University of California, Berkeley |
| Pseudocode | Yes | Algorithm 1 Non-Deterministic Multicalibration Algorithm (Theorem 4.1) |
| Open Source Code | Yes | The source code for these experiments is included in the repository https://github.com/ericzhao28/multicalibration. |
| Open Datasets | Yes | We conduct three sets of experiments to evaluate different batch multicalibration algorithms. The three sets of experiments we conduct correspond to three datasets: the UCI Adult Income dataset [26], a real-world dataset for predicting individuals' incomes based on the US Census, the UCI Bank Marketing dataset [32], a dataset for predicting whether an individual will subscribe to a bank's term deposit, and the Dry Bean Dataset [27], a dataset for predicting a dry bean's variety. |
| Dataset Splits | No | The paper mentions 'random 80-20 train/test splits' but does not explicitly state a separate validation split or its size/percentage. |
| Hardware Specification | Yes | All experiments were performed on a 2021 MacBook Pro, with a M1 Pro chip. |
| Software Dependencies | No | The paper does not explicitly provide specific software dependencies with version numbers (e.g., Python, PyTorch, or other libraries). |
| Experiment Setup | Yes | Learning rate decay is tuned on the training set by sweeping over [0.8, 0.85, 0.9, 0.95] for the learner and [0.9, 0.95, 0.98, 0.99] for the adversary. |
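
The split and sweep mentioned in the Dataset Splits and Experiment Setup rows can be sketched as follows. This is a minimal illustration and not the authors' code: only the two decay grids and the 80-20 ratio come from the table. The function names, the seed, the `train_score` callback, and the choice to keep the pair with the lowest training score are all assumptions made for the example.

```python
import itertools
import random

# Decay grids quoted in the Experiment Setup row.
LEARNER_DECAYS = [0.8, 0.85, 0.9, 0.95]
ADVERSARY_DECAYS = [0.9, 0.95, 0.98, 0.99]


def train_test_split_80_20(n_rows, seed=0):
    """Return (train_idx, test_idx) for a random 80-20 split of range(n_rows).

    The seed is a hypothetical parameter; the table does not report one.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    cut = n_rows * 4 // 5  # integer arithmetic avoids float rounding
    return indices[:cut], indices[cut:]


def sweep_decays(train_score):
    """Try every (learner_decay, adversary_decay) pair on the grid.

    `train_score` is a hypothetical callback that returns a training-set
    score where lower is better. Selecting by lowest score is an
    assumption, since the table does not state the selection criterion.
    """
    grid = list(itertools.product(LEARNER_DECAYS, ADVERSARY_DECAYS))
    return min(grid, key=lambda pair: train_score(*pair))
```

With four values per grid the sweep covers 16 configurations per dataset, which is small enough to run exhaustively on a laptop.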