ReTaSA: A Nonparametric Functional Estimation Approach for Addressing Continuous Target Shift

Authors: Hwanwoo Kim, Xin Zhang, Jiwei Zhao, Qinglong Tian

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of the proposed method has been demonstrated with extensive numerical studies on synthetic and real-world datasets. (Section 5, Experiments) In this section, we present the results of numerical experiments that illustrate the performance of our proposed method in addressing the continuous target shift problem. We first study synthetic data and then apply the methods to two real-world regression problems.
Researcher Affiliation | Collaboration | Hwanwoo Kim (University of Chicago), Xin Zhang (Meta, Inc.), Jiwei Zhao (University of Wisconsin-Madison), Qinglong Tian (University of Waterloo)
Pseudocode | No | The paper describes the estimation process mathematically, with equations (e.g., Equation 7), but does not provide a formal pseudocode block or algorithm listing.
Open Source Code | No | The paper does not provide an explicit statement about open-source code availability for the methodology described, nor does it provide a direct link to a code repository.
Open Datasets | Yes | The SOCR dataset contains 1035 records of heights, weights, and position information for some current and recent Major League Baseball (MLB) players (http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data_MLB_HeightsWeights). In our study, we conducted 20 random trials. In each trial, we treat all outfielders as the testing data and randomly select 80% of the players at other positions as the training source data. The second real-world task under investigation is the Szeged temperature prediction problem. This dataset comprises weather-related data recorded in Szeged, Hungary, from 2006 to 2016 (https://www.kaggle.com/datasets/budincsevity/szeged-weather).
Dataset Splits | Yes | We set the size of the target testing data to 0.8 of this study's training source data size, i.e., m = 0.8n, unless otherwise specified. In each trial, we treat all outfielders as the testing data and randomly select 80% of the players at other positions as the training source data. In each trial, we utilize data from January to October as the training source dataset, while data from November and December constitute the testing dataset.
Hardware Specification | Yes | Our experiments are conducted on a MacBook Pro equipped with a 2.9 GHz dual-core Intel Core i5 processor and 8 GB of memory.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks) used in the experiments.
Experiment Setup | Yes | We relegated the experimental details (e.g., hyperparameter tuning) and extensive experiments with high-dimensional datasets to Section C in the supplementary material. As for the target prediction, we adopt a polynomial regression model of degree 5. We conducted all experiments with 50 replications.
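
The split protocols and the degree-5 polynomial predictor quoted in the table can be sketched as follows. This is a minimal illustration on toy arrays, not the authors' code: the function names (`position_split`, `month_split`, `fit_poly5`), the position labels, and the use of NumPy's `polyfit` are assumptions made here. The fit is a plain unweighted least-squares polynomial, so the paper's target-shift correction is not implemented.

```python
import numpy as np


def position_split(positions, rng, train_frac=0.8, test_position="Outfielder"):
    """Return (train_idx, test_idx).

    Every row whose position equals `test_position` becomes test data; a
    random `train_frac` share of the remaining rows becomes training data.
    """
    positions = np.asarray(positions)
    test_idx = np.flatnonzero(positions == test_position)
    other_idx = np.flatnonzero(positions != test_position)
    n_train = int(round(train_frac * other_idx.size))
    train_idx = rng.choice(other_idx, size=n_train, replace=False)
    return np.sort(train_idx), test_idx


def month_split(months):
    """Return (train_idx, test_idx): months 1-10 train, months 11-12 test."""
    months = np.asarray(months)
    return np.flatnonzero(months <= 10), np.flatnonzero(months >= 11)


def fit_poly5(x, y):
    """Fit an unweighted degree-5 polynomial to (x, y) by least squares."""
    return np.poly1d(np.polyfit(x, y, deg=5))


rng = np.random.default_rng(0)

# Toy stand-in for a player table (position labels are hypothetical).
positions = np.array(["Outfielder"] * 20 + ["Catcher"] * 30 + ["Shortstop"] * 50)
train_idx, test_idx = position_split(positions, rng)

# Toy stand-in for a month column: five records per calendar month.
months = np.repeat(np.arange(1, 13), 5)
m_train, m_test = month_split(months)

# Degree-5 fit on noiseless toy data generated from a known quintic.
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 2.0 * x - x**3 + 0.5 * x**5
model = fit_poly5(x, y)
```

Repeating `position_split` with a fresh draw from `rng` gives one random trial per call, which is one way to mirror the repeated-trial protocol described in the table.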