Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Boosting with Multiple Sources

Authors: Corinna Cortes, Mehryar Mohri, Dmitry Storcheus, Ananda Theertha Suresh

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We also report the results of several experiments with our algorithm demonstrating that it outperforms natural baselines on multi-source text-based, image-based and tabular data. We further present an extension of our algorithm to the federated learning scenario and report favorable experimental results for that setting as well."
Researcher Affiliation | Collaboration | Corinna Cortes (Google Research, New York, NY 10011); Mehryar Mohri (Google & Courant Institute, New York, NY 10012); Dmitry Storcheus (Courant Institute & Google, New York, NY 10012); Ananda Theertha Suresh (Google Research, New York, NY 10011)
Pseudocode | Yes | "The pseudocode of our algorithm, MULTIBOOST, is provided in Figure 1."
Open Source Code | No | The paper does not provide any explicit statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | "Datasets and preprocessing steps used are described below with additional dataset details provided in Appendix H. Note that all datasets are public and do not contain any personal identifiers or offensive information."
Dataset Splits | Yes | "The errors and their standard deviations are reported based on 10-fold cross-validation. Each source S_k is independently split into 10 folds S_k^1, ..., S_k^10. For the i-th cross-validation step, the test set is {S_1^i, ..., S_p^i}, while the rest is used for training."
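The per-source cross-validation split quoted above can be sketched as follows. This is a minimal illustration, not code from the paper; the helper names `per_source_folds` and `cv_split` are hypothetical.

```python
import random

def per_source_folds(sources, n_folds=10, seed=0):
    """Independently split each source S_k into n_folds folds S_k^1, ..., S_k^n."""
    rng = random.Random(seed)
    folds = []
    for S_k in sources:
        idx = list(range(len(S_k)))
        rng.shuffle(idx)
        # fold j of source k takes every n_folds-th shuffled index
        folds.append([[S_k[i] for i in idx[j::n_folds]] for j in range(n_folds)])
    return folds

def cv_split(folds, i):
    """For cross-validation step i, the test set is {S_1^i, ..., S_p^i};
    the remaining folds of every source form the training set."""
    test = [x for per_source in folds for x in per_source[i]]
    train = [x for per_source in folds
             for j, fold in enumerate(per_source) if j != i
             for x in fold]
    return train, test
```

Splitting each source independently keeps every fold's test portion representative of all p sources, which matches the quoted description.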
Hardware Specification | Yes | "The experiments were performed on Linux and Mac workstations with Quad-Core Intel Core i7 2.9 GHz and Intel Xeon 2.20 GHz respectively."
Software Dependencies | No | The paper does not provide specific version numbers for ancillary software or libraries used in the experiments.
Experiment Setup | Yes | "Our study is restricted to learning an ensemble of decision stumps H_stumps using the exponential surrogate loss Φ(u) = e^u. ... We used T = 100 boosting steps for all benchmarks. ... To estimate the probabilities Q(k|·) for k ∈ [p], we assigned the label k to each sample from domain k and used multinomial logistic regression. ... Alternatively, for some experiments we used line search with 100 steps."
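The domain-probability estimation quoted above (label each sample with its source index, then fit a multinomial logistic regression) can be sketched from scratch with NumPy. This is an illustrative sketch under stated assumptions, not the paper's implementation; `fit_domain_classifier` and `domain_probs` are hypothetical names, and plain gradient descent stands in for whatever solver the authors used.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_domain_classifier(X, domains, p, lr=0.1, steps=500):
    """Fit a multinomial logistic regression predicting the source
    domain k of each sample, yielding estimates Q(k|x).
    X: (n, d) features; domains: (n,) labels in {0, ..., p-1}."""
    n, d = X.shape
    W = np.zeros((d, p))
    b = np.zeros(p)
    Y = np.eye(p)[domains]            # one-hot domain labels
    for _ in range(steps):
        P = softmax(X @ W + b)        # current Q(k|x) estimates
        G = (P - Y) / n               # gradient of mean cross-entropy
        W -= lr * (X.T @ G)
        b -= lr * G.sum(axis=0)
    return W, b

def domain_probs(X, W, b):
    """Q(k|x) for each sample; each row sums to 1 over the p domains."""
    return softmax(X @ W + b)
```

The fitted `domain_probs` output plays the role of Q(k|·): a soft assignment of each point to the p sources.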