A Discriminative Technique for Multiple-Source Adaptation

Authors: Corinna Cortes, Mehryar Mohri, Ananda Theertha Suresh, Ningshan Zhang

ICML 2021

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Our experiments with real-world applications further demonstrate that our new discriminative MSA algorithm outperforms the previous generative solution as well as other domain adaptation baselines." |
| Researcher Affiliation | Collaboration | 1Google Research, New York, NY; 2Courant Institute of Mathematical Sciences, New York, NY; 3Hudson River Trading, New York, NY. |
| Pseudocode | No | The paper describes algorithms and optimization problems in text but does not include formal pseudocode blocks or algorithm listings. |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code. |
| Open Datasets | Yes | "We experimented with our DMSA technique on the same datasets as those used in (Hoffman et al., 2018), as well as with the UCI adult dataset, and compared its performance with several baselines, including GMSA." Sentiment analysis: "... sentiment analysis dataset (Blitzer et al., 2007), which consists of product review text and rating labels taken from four domains: books (B), dvd (D), electronics (E), and kitchen (K), with 2,000 samples for each domain." Digit dataset: "To evaluate the DMSA solution under the probability model, we considered a digit recognition task consisting of three datasets: Google Street View House Numbers (SVHN), MNIST, and USPS." Adult dataset: "We also experimented with the UCI adult dataset (Blake, 1998)." Office dataset: "We also carried out experiments on the visual adaptation office dataset (Saenko et al., 2010)." |
| Dataset Splits | Yes | "We randomly split the 2,000 samples per domain into 1,600 train and 400 test samples for each domain, and learn the base predictors, domain classifier, density estimations, and parameter z for both MSA solutions on all available training samples." |
| Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models or types of machines used for the experiments. |
| Software Dependencies | No | The paper mentions certain techniques and toolkits (e.g., "logistic regression", "support vector regression", "sklearn toolkit") but does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | "We used the full training sets per domain to train the source model, and used 6,000 samples per domain to learn the domain classifier. Finally, for our DC-programming algorithm, we used 1,000 image-label pairs from each domain, thus a total of 3,000 labeled pairs to learn the parameter z." |
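The per-domain 1,600/400 split described in the Dataset Splits row can be sketched as follows. This is a minimal sketch, not the authors' code: the feature matrices and binary labels below are random placeholders standing in for the review text and rating labels of the Blitzer et al. (2007) sentiment dataset, and the shuffle-based split is one common way to realize a random 1,600/400 partition.

```python
import numpy as np

# Four sentiment domains, 2,000 samples each, as stated in the paper.
domains = ["books", "dvd", "electronics", "kitchen"]
rng = np.random.default_rng(0)

splits = {}
for d in domains:
    # Placeholder features/labels; the real data are review texts and ratings.
    X = rng.normal(size=(2000, 10))
    y = rng.integers(0, 2, size=2000)

    # Random 1,600 train / 400 test partition per domain.
    idx = rng.permutation(2000)
    train_idx, test_idx = idx[:1600], idx[1600:]
    splits[d] = {
        "X_train": X[train_idx], "y_train": y[train_idx],
        "X_test": X[test_idx], "y_test": y[test_idx],
    }
```

Base predictors, the domain classifier, density estimates, and the parameter z would then all be learned on the pooled `X_train` portions, matching the quoted protocol.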