reproducibilityindex.ai

Mutual Transfer Learning for Massive Data

Authors: Ching-Wei Cheng, Xingye Qiao, Guang Cheng

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Simulated and real examples are analyzed to illustrate the usefulness of the proposed method. The empirical performance of the proposed approach is examined through simulation studies. The finite-sample properties of the proposed approach are evaluated in Section 5 via simulation experiments. Section 6 investigates the nClimDiv database to illustrate the practical usefulness of the proposed method.
Researcher Affiliation	Academia	(1) Department of Statistics, Purdue University; (2) Department of Mathematical Sciences, Binghamton University.
Pseudocode	No	The paper describes the ADMM algorithm and its derivation in Section 3.1, but it does not provide a structured pseudocode block or algorithm box.
Open Source Code	No	The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	NOAA's nClimDiv database was analyzed to demonstrate the MTL method's practical usefulness. The monthly average temperature is the response of interest. Available at ftp://ftp.ncdc.noaa.gov/pub/data/cirs/climdiv/.
Dataset Splits	Yes	data in 1895-2000 was used for training and 2001-2016 for testing. Recall that we did not choose λ to minimize the cross-validation prediction error, but used BIC in order to get a parsimonious model.
Hardware Specification	No	The paper does not specify any hardware details such as GPU/CPU models, memory, or specific computing environments used for running the experiments.
Software Dependencies	No	The paper mentions statistical software and methods (e.g., ADMM, MCP, SCAD, TLP) and general computational concepts, but it does not provide specific version numbers for any programming languages, libraries, or software packages used for implementation.
Experiment Setup	Yes	Table 1 summarizes nine simulation settings (with p = 5 global features and q = 3 heterogeneous features), and each has 100 replications, where the signal-to-noise ratio (SNR) is defined in Section S.10. The largest total sample size is 307,200. For simplicity, we consider equal unit sizes n_i ≡ n for i = 1, ..., M. We let the number of units in each subgroup to be (M_1, ..., M_S) = 1_S + Multinomial(M − S, 1_S/S). The coordinates of β_0 were generated from Uniform(−2, 2) independently. To mimic the different coefficient values for the heterogeneous features between subgroups, we generated α_0 = (α_{1,0}^T, ..., α_{S,0}^T)^T, where α_{s,0} = (α_{s,0,1}, α_{s,0,2}, α_{s,0,3})^T, in a way to guarantee the minimal signal condition... Moreover, u_i and ε_i follows N(0, 0.3I) and N(0, I), respectively. Finally, Y was generated from the oracle model.
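
The simulation design quoted in the Experiment Setup row can be sketched roughly as follows. This is a minimal illustration, not the authors' code (none is released). Only the subgroup-size draw, the Uniform(−2, 2) draw for β_0, and the N(0, 0.3I) and N(0, I) draws come from the quoted text. The function names, the default sizes, the standard-normal features, the placeholder values for α_0 (the paper's construction is elided in the quote), and the assumed form of the oracle model, in which u_i perturbs the subgroup coefficients, are all hypothetical.

```python
import numpy as np


def subgroup_sizes(M, S, rng):
    # (M_1, ..., M_S) = 1_S + Multinomial(M - S, 1_S / S):
    # each subgroup gets at least one unit and the sizes sum to M.
    return 1 + rng.multinomial(M - S, np.full(S, 1.0 / S))


def simulate(M=12, S=3, n=40, p=5, q=3, seed=0):
    rng = np.random.default_rng(seed)
    sizes = subgroup_sizes(M, S, rng)
    labels = np.repeat(np.arange(S), sizes)  # subgroup index of each unit

    # Global coefficients: coordinates drawn independently from Uniform(-2, 2).
    beta0 = rng.uniform(-2.0, 2.0, size=p)

    # ASSUMPTION: placeholder, well-separated subgroup coefficients.
    # The source elides how alpha_0 is built to meet the minimal signal condition.
    alpha0 = np.arange(1, S + 1)[:, None] * np.ones(q)

    # ASSUMPTION: standard-normal features; the quoted text does not specify them.
    X = rng.standard_normal((M, n, p))
    Z = rng.standard_normal((M, n, q))

    u = rng.normal(0.0, np.sqrt(0.3), size=(M, q))  # u_i ~ N(0, 0.3 I)
    eps = rng.standard_normal((M, n))               # eps_i ~ N(0, I)

    # ASSUMPTION: oracle model taken to be y = X beta_0 + Z (alpha_s + u_i) + eps.
    theta = alpha0[labels] + u
    Y = X @ beta0 + np.einsum("inq,iq->in", Z, theta) + eps
    return sizes, labels, beta0, alpha0, X, Z, Y
```

Calling simulate() returns one replication with equal unit sizes n_i = n; looping over seeds would give the repeated replications described in the row above.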