Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Calibrated Data-Dependent Constraints with Exact Satisfaction Guarantees
Authors: Songkai Xue, Yuekai Sun, Mikhail Yurochkin
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In Section 4, we validate our theory by simulating a resource-constrained newsvendor problem. In Section 5, we demonstrate the efficacy of our method by using it to train an algorithmically fair income classifier. In Figure 1, we plot frequencies of constraint satisfaction for each constraint and both constraints, all of which are averaged over 1000 replicates. In Figure 2, we have line plots for frequency of constraint satisfaction and box plots for classification error rate, all of which are summarized over 100 replicates. |
| Researcher Affiliation | Collaboration | Songkai Xue (Department of Statistics, University of Michigan); Yuekai Sun (Department of Statistics, University of Michigan); Mikhail Yurochkin (IBM Research; MIT-IBM Watson AI Lab) |
| Pseudocode | Yes | Algorithm 1 Dual ascent algorithm for (1.3) |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] |
| Open Datasets | Yes | using the Adult dataset from UCI [13]. [13] Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml. |
| Dataset Splits | No | The paper mentions 'training data' in theoretical discussions and states using the Adult dataset for experiments, but it does not specify explicit training/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) for its own experiments in the main text. |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] |
| Software Dependencies | No | The paper mentions using 'standard stochastic optimization algorithms' and 'logistic regression model for classification', but it does not specify any software names with version numbers (e.g., Python, PyTorch, TensorFlow versions, or specific solver versions) that would be needed to reproduce the experiment. |
| Experiment Setup | No | The paper states using a 'logistic regression model for classification' and notes the 'nominal probability' values tested. However, it does not explicitly provide concrete hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations for the experiments in the main text. |
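The Pseudocode row above refers to the paper's "Algorithm 1: Dual ascent algorithm for (1.3)". The paper's exact problem (1.3) is not reproduced in this report, but the general dual ascent pattern it names can be sketched on a toy constrained problem. Everything below (the objective, the single linear constraint, the step size, and the function name `dual_ascent`) is an illustrative assumption, not the paper's formulation:

```python
import numpy as np

def dual_ascent(c, step=0.1, iters=500):
    """Dual ascent sketch for: minimize ||x - c||^2  subject to  sum(x) <= 1.

    The Lagrangian L(x, lam) = ||x - c||^2 + lam * (sum(x) - 1) has a
    closed-form primal minimizer x = c - lam / 2 (per coordinate); the dual
    variable lam is then updated by projected gradient ascent on the dual,
    with the projection enforcing lam >= 0.
    """
    lam = 0.0
    for _ in range(iters):
        x = c - lam / 2.0                             # primal minimization (closed form)
        lam = max(0.0, lam + step * (x.sum() - 1.0))  # dual ascent step, kept nonnegative
    return x, lam

x, lam = dual_ascent(np.array([1.0, 1.0]))
```

For this toy instance the constraint is active at the optimum, so the iterates converge to `x = [0.5, 0.5]` with `lam = 1`. Real uses of the pattern (including the paper's) replace the closed-form primal step with a stochastic optimization subroutine.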