Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Double-Weighting for Covariate Shift Adaptation
Authors: José I. Segovia-Martín, Santiago Mazuelas, Anqi Liu
ICML 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section shows experimental results for the proposed approach in comparison with existing methods on synthetic and real datasets. |
| Researcher Affiliation | Academia | (1) Basque Center for Applied Mathematics (BCAM), Bilbao, Spain; (2) IKERBASQUE-Basque Foundation for Science; (3) CS department, Whiting School of Engineering, Johns Hopkins University, Baltimore, Maryland, USA. |
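| Pseudocode | Yes | Algorithm 1 The proposed algorithm: DW-GCS. Input: training samples $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$; testing instances $x_{n+1}, x_{n+2}, \ldots, x_{n+t}$; $D$. Output: weights $\hat{\beta}$ and $\hat{\alpha}$; classifier parameters $\mu^{*}$; minimax risk $R(U)$. 1: $\hat{\beta}, \hat{\alpha} \leftarrow$ solution of (25). 2: $\tau \leftarrow \frac{1}{n} \sum_{i=1}^{n} \hat{\beta}^{(i)} \Phi(x_i, y_i)$. 3: $\lambda \leftarrow$ solution of (31). 4: $\mu^{*} \leftarrow$ solution of (30) using (12) for 0-1-loss, and (13) for log-loss. 5: $R(U) \leftarrow \tau^{T}\mu^{*} + \frac{1}{t} \sum_{i=1}^{t} \varphi_{\ell}(\mu^{*}, x_{n+i}, \hat{\alpha}^{(i)}) + \lambda^{T}\lvert\mu^{*}\rvert$ |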
| Open Source Code | Yes | The source code for the methods presented is publicly available in the library MRCpy (Bondugula et al., 2023) and the experimental setup in https://github.com/MachineLearningBCAM/MRCs-for-Covariate-Shift-Adaptation. |
| Open Datasets | Yes | For the experiments in Section 6, we have considered four binary classification datasets, available in the UCI repository (Dua & Graff, 2017)... In addition, we use the dataset News20groups that is intrinsically affected by covariate shift (Zhang et al., 2013). |
| Dataset Splits | No | The paper mentions "training and testing samples" and "100 random partitions" but does not explicitly specify a distinct validation set split (e.g., percentages, counts, or k-fold cross-validation setup) for model training or hyperparameter tuning in a reproducible manner. It states "standard cross-validation is not valid under covariate shift" for its approach. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions the library "MRCpy" but does not provide specific version numbers for any software dependencies, including MRCpy itself or other programming languages/libraries used. |
| Experiment Setup | Yes | For the results obtained using the flattening method in (Shimodaira, 2000) and the RuLSIF method in (Yamada et al., 2011) we considered the hyperparameter γ = 0.5, which is the default value used in those papers. The table also shows the parameter σ used in the computation of the kernel matrix K for the RuLSIF, KMM and DW-KMM methods, which is determined using the common heuristic based on nearest neighbors with K = 50, as is done in (Wen et al., 2014). ... Specifically, we select the value of D to achieve the lowest minimax risk over a certain range D ≥ 1. ... The second hyperparameter λ is determined solving min p,λ 1Tλ s.t. P y∈Y p(y\|xn+i)Φα(xn+i, y) τ + λ and P y∈Y p(y\|xn+i) = 1/t for i = 1, . . . , t (31) |
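
Step 2 of Algorithm 1 in the Pseudocode row computes τ as the average of the training feature vectors Φ(x_i, y_i), each scaled by its weight β̂. The following is a minimal sketch of that single step, not the authors' implementation and not the MRCpy API. The one-hot block feature mapping, the function names `one_hot_feature_map` and `weighted_mean_vector`, and the toy data are all assumptions made for illustration; the weights would normally come from solving (25), which is not attempted here.

```python
import numpy as np


def one_hot_feature_map(x, y, n_classes):
    """Hypothetical feature mapping Phi(x, y): copy x into the block of class y."""
    x = np.asarray(x, dtype=float)
    phi = np.zeros(n_classes * x.size)
    phi[y * x.size:(y + 1) * x.size] = x
    return phi


def weighted_mean_vector(X, y, beta, n_classes):
    """Compute tau = (1/n) * sum_i beta[i] * Phi(x_i, y_i)."""
    X = np.asarray(X, dtype=float)
    beta = np.asarray(beta, dtype=float)
    n = X.shape[0]
    # One row per training sample, each row holding Phi(x_i, y_i).
    phis = np.stack(
        [one_hot_feature_map(X[i], y[i], n_classes) for i in range(n)]
    )
    return beta @ phis / n


# Toy data (assumed): three training samples, two features, two classes.
X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = np.array([0, 1, 0])
beta = np.array([1.0, 0.5, 2.0])  # placeholder weights, not a solution of (25)

tau = weighted_mean_vector(X, y, beta, n_classes=2)
print(tau)
```

With all weights equal to 1, the same function returns the unweighted sample mean of the feature vectors, so the training weights only rescale each sample's contribution to τ.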