Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Class-Weighted Classification: Trade-offs and Robust Approaches
Authors: Ziyu Xu, Chen Dan, Justin Khim, Pradeep Ravikumar
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically demonstrate the efficacy of LCVaR and LHCVaR on improving class conditional risks. |
| Researcher Affiliation | Academia | (1) Machine Learning Department, Carnegie Mellon University, Pennsylvania, United States; (2) Computer Science Department, Carnegie Mellon University, Pennsylvania, United States. |
| Pseudocode | No | No explicitly labeled pseudocode or algorithm block is present in the main text provided. The paper mentions pseudocode might be in the appendix, which is not available. |
| Open Source Code | Yes | Code for reproducing the results in this section can be found at https://www.github.com/neilzxu/robust_weighted_classification. |
| Open Datasets | Yes | Real World Datasets: We also experiment on the Covertype dataset taken from the UCI dataset repository (Dua & Graff, 2017). |
| Dataset Splits | No | The paper mentions '10,000 data points for both train and test sets' for synthetic data and a 'train-test split' for Covertype, but does not specify a validation set or explicit train/validation/test split percentages. |
| Hardware Specification | No | No specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed computer specifications) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions training a logistic regression model with gradient descent on a cross entropy loss, but does not provide specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9'). |
| Experiment Setup | Yes | LCVaR: The empirical formulation optimizes the dual formulation, in which α is a hyperparameter: $\widehat{\mathrm{LCVaR}}_\alpha(f) = \min_{\lambda \in \mathbb{R}} \frac{1}{\alpha} \sum_{i=1}^{k} \hat{p}_i (\hat{R}_i(f) - \lambda)_+ + \lambda$. To reduce the number of hyperparameters to only $c \in (0, 1]$ and $\kappa \in (0, \infty)$, we calculate $\alpha_i$ as follows: $\alpha_i^{(\kappa, c)} = c \, \hat{p}_i^{1/\kappa} / \sum_{j=1}^{k} \hat{p}_j^{1/\kappa}$. We train a logistic regression model with gradient descent on a cross entropy loss, which acts as a convex surrogate loss for zero-one risk. |
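
The quantities quoted in the Experiment Setup row can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see the linked repository for that). It assumes the standard CVaR dual form, min over λ of (1/α) Σ p̂_i (R̂_i(f) − λ)_+ + λ, and reads the α_i rule as c · p̂_i^(1/κ) / Σ_j p̂_j^(1/κ). The function names `lcvar` and `hetero_alphas` are hypothetical.

```python
import numpy as np


def lcvar(class_risks, class_probs, alpha):
    """Empirical LCVaR via the (assumed) standard CVaR dual form.

    Objective: (1/alpha) * sum_i p_i * max(R_i - lam, 0) + lam, minimized over lam.
    """
    risks = np.asarray(class_risks, dtype=float)
    probs = np.asarray(class_probs, dtype=float)
    # The objective is convex and piecewise linear in lam, with kinks at the
    # class risks, so for alpha in (0, 1] a minimizer lies at one of them.
    values = [
        np.sum(probs * np.maximum(risks - lam, 0.0)) / alpha + lam
        for lam in risks
    ]
    return float(min(values))


def hetero_alphas(class_probs, kappa, c):
    """Per-class alpha_i = c * p_i^(1/kappa) / sum_j p_j^(1/kappa) (assumed reading)."""
    probs = np.asarray(class_probs, dtype=float)
    powered = probs ** (1.0 / kappa)
    return c * powered / powered.sum()


# Hypothetical two-class example: a frequent low-risk class and a rare high-risk class.
risks = [0.1, 0.5]
probs = [0.9, 0.1]
mean_risk = lcvar(risks, probs, alpha=1.0)   # alpha = 1 recovers the average risk
tail_risk = lcvar(risks, probs, alpha=0.1)   # small alpha focuses on the worst class
alphas = hetero_alphas(probs, kappa=1.0, c=1.0)
```

With α = 1 the value reduces to the ordinary class-weighted average risk, while shrinking α moves it toward the worst class-conditional risk, which is the trade-off the two hyperparameters c and κ are meant to control.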